**Fundamental Theories of Physics 188**

## Klaas Landsman

# Foundations of Quantum Theory

From Classical Concepts to Operator Algebras

## Fundamental Theories of Physics

## Volume 188

#### Series editors

Henk van Beijeren, Utrecht, The Netherlands; Philippe Blanchard, Bielefeld, Germany; Paul Busch, York, UK; Bob Coecke, Oxford, UK; Dennis Dieks, Utrecht, The Netherlands; Bianca Dittrich, Waterloo, Canada; Detlef Dürr, München, Germany; Ruth Durrer, Genève, Switzerland; Roman Frigg, London, UK; Christopher Fuchs, Boston, USA; Giancarlo Ghirardi, Trieste, Italy; Domenico J.W. Giulini, Bremen, Germany; Gregg Jaeger, Boston, USA; Claus Kiefer, Köln, Germany; Nicolaas P. Landsman, Nijmegen, The Netherlands; Christian Maes, Leuven, Belgium; Mio Murao, Bunkyo-ku, Japan; Hermann Nicolai, Potsdam, Germany; Vesselin Petkov, Montreal, Canada; Laura Ruetsche, Ann Arbor, USA; Mairi Sakellariadou, London, UK; Alwyn van der Merwe, Denver, USA; Rainer Verch, Leipzig, Germany; Reinhard Werner, Hannover, Germany; Christian Wüthrich, Geneva, Switzerland; Lai-Sang Young, New York City, USA

The international monograph series "Fundamental Theories of Physics" aims to stretch the boundaries of mainstream physics by clarifying and developing the theoretical and conceptual framework of physics and by applying it to a wide range of interdisciplinary scientific fields. Original contributions in well-established fields such as Quantum Physics, Relativity Theory, Cosmology, Quantum Field Theory, Statistical Mechanics and Nonlinear Dynamics are welcome. The series also provides a forum for non-conventional approaches to these fields. Publications should present new and promising ideas, with prospects for their further development, and carefully show how they connect to conventional views of the topic. Although the aim of this series is to go beyond established mainstream physics, a high profile and open-minded Editorial Board will evaluate all contributions carefully to ensure a high scientific standard.

More information about this series at http://www.springer.com/series/6001

Klaas Landsman, IMAPP, Radboud University, Nijmegen, The Netherlands

ISSN 0168-1222 ISSN 2365-6425 (electronic) Fundamental Theories of Physics ISBN 978-3-319-51776-6 ISBN 978-3-319-51777-3 (eBook) DOI 10.1007/978-3-319-51777-3

Library of Congress Control Number: 2017933673

© The Author(s) 2017. This book is an open access publication.

Open Access This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Printed on acid-free paper

This Springer imprint is published by Springer Nature. The registered company is Springer International Publishing AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

*To Jeremy Butterfield*

## Preface

'Der Kopf, *so* gesehen, hat mit dem Kopf, *so* gesehen, auch nicht die leiseste Ähnlichkeit (...) Der Aspektwechsel. "Du würdest doch sagen, dass sich das Bild jetzt gänzlich geändert hat!" Aber was ist anders: mein Eindruck? meine Stellungnahme? (...) Ich *beschreibe* die Änderung wie eine Wahrnehmung, ganz, als hätte sich der Gegenstand vor meinen Augen geändert.' (Wittgenstein, *Philosophische Untersuchungen* II, §§127, 129).<sup>1</sup>

As the well-known picture above is meant to allegorize, some physical systems admit a dual description in either classical or quantum-mechanical terms. According to Bohr's "doctrine of classical concepts", measurement apparatuses are examples of such systems. More generally, as hammered home by decoherence theorists, the classical world around us is a case in point. As will be argued in this book, the measurement problem of quantum mechanics (highlighted by Schrödinger's Cat) is *caused* by this duality (rather than *resolved* by it, as Bohr is said to have thought).

<sup>1</sup> 'The head seen in *this* way hasn't even the slightest similarity to the head seen in *that* way (...) The change of aspect. "But surely you'd say that the picture has changed altogether now!" But what is different: my impression? my attitude? (...) I *describe* the change like a perception; just as if the object has changed before my eyes.' Translation: G.E.M. Anscombe, P.M.S. Hacker, & J. Schulte (Wittgenstein, 2009/1953, pp. 205–206).

The aim of this book is to analyze the foundations of quantum theory from the point of view of classical-quantum duality, using the mathematical formalism of operator algebras on Hilbert space (and, more generally, C\*-algebras) that was originally created by von Neumann (followed by Gelfand and Naimark). In support of this analysis, but also as a matter of independent interest, the book covers many of the traditional topics one might expect to find in a treatise on the foundations of quantum mechanics, like pure and mixed states, observables, the Born rule and its relation to both single-case probabilities and long-run frequencies, Gleason's Theorem, the theory of symmetry (including Wigner's Theorem and its relatives, culminating in a recent theorem of Hamhalter's), Bell's Theorem(s) and the like, quantization theory, indistinguishable particles, large systems, spontaneous symmetry breaking, the measurement problem, and (intuitionistic) quantum logic. One also finds a few idiosyncratic themes, such as the Kadison–Singer Conjecture, topos theory (which naturally injects intuitionism into quantum logic), and an unusual emphasis on both conceptual and mathematical aspects of limits in physical theories.

All of this is held together by what we call *Bohrification*, i.e., the mathematical interpretation of Bohr's classical concepts by *commutative* C\*-algebras, which in turn are studied in their quantum habitat of *noncommutative* C\*-algebras.

Thus the book is mostly written in mathematical physics style, but its real subject is *natural philosophy*. Hence its intended readership consists not only of mathematical physicists, but also of philosophers of physics, as well as of theoretical physicists who wish to do more than 'shut up and calculate', and finally of mathematicians who are interested in the mathematical and conceptual structure of quantum theory.

To serve all these groups, the native mathematical language (i.e. of C\*-algebras) is introduced slowly, starting with finite sets (as classical phase spaces) and finite-dimensional Hilbert spaces. In addition, all advanced mathematical background that is necessary but may distract from the main development is laid out in extensive appendices on Hilbert spaces, functional analysis, operator algebras, lattices and logic, and category theory and topos theory, so that the prerequisites for this book are limited to basic analysis and linear algebra (as well as some physics). These appendices not only provide a direct route to material that otherwise most readers would have needed to extract from thousands of pages of diverse textbooks, but they also contain some original material, and may be of interest even to mathematicians.

In summary, the aims of this book are similar to those of its peerless paradigm:

'Der Gegenstand dieses Buches ist die einheitliche, und, soweit als möglich und angebracht, mathematisch einwandfreie Darstellung der neuen Quantenmechanik (...). Dabei soll das Hauptgewicht auf die allgemeinen und prinzipiellen Fragen, die im Zusammenhange mit dieser Theorie entstanden sind, gelegt werden. Insbesondere sollen die schwierigen und vielfach noch immer nicht restlos geklärten Interpretationsfragen näher untersucht werden.' (von Neumann, *Mathematische Grundlagen der Quantenmechanik*, 1932, p. 1).<sup>2</sup>

<sup>2</sup> 'The object of this book is to present the new quantum mechanics in a unified presentation which, so far as it is possible and useful, is mathematically rigorous. (...) Therefore the principal emphasis shall be placed on the general and fundamental questions which have arisen in connection with this theory. In particular, the difficult problems with interpretation, many of which are even now not fully resolved, will be investigated in detail.' Translation: R.T. Beyer (von Neumann, 1955, p. vii).

Two other quotations the author often had in mind while writing this book are:

'And although the whole of philosophy is not immediately evident, still it is better to add something to our knowledge day by day than to fill up men's minds in advance with the preconceptions of hypotheses.' (Newton, draft preface to *Principia*, 1686).<sup>3</sup>

'Juist het feit dat een genie als DESCARTES volkomen naast de lijn van ontwikkeling is blijven staan, die van GALILEI naar NEWTON voert (...) [is] een phase van den in de historie zoo vaak herhaalden strijd tusschen de bescheidenheid der mathematisch-physische methode, die na nauwkeurig onderzoek de verschijnselen der natuur in steeds meer omvattende schemata met behulp van de exacte taal der mathesis wil beschrijven en den hoogmoed van het philosophische denken, dat in één genialen greep de heele wereld wil omvatten (...).' (Dijksterhuis, *Val en Worp*, 1924, p. 343).<sup>4</sup>

#### Acknowledgements

	- Radboud University Nijmegen, partly through a sabbatical in 2014.
	- The Netherlands Organization for Scientific Research (NWO), initially by funding various projects eventually contributing to this book, and most recently by paying the Open Access fee, making the book widely available.
	- The Templeton World Charity Foundation (TWCF), by funding the Oxford–Princeton–Nijmegen collaboration *Experimental Tests of Quantum Reality*.
	- Trinity College (Cambridge), by appointing the author as a *Visiting Fellow Commoner* during the Easter Term 2016, when the book was largely finished.

Finally, it is a pleasure to dedicate this book to Jeremy Butterfield, in recognition of his ideas, as well as of his unrelenting support and friendship over the last 25 years.

<sup>3</sup> Newton (1999), p. 61.

<sup>4</sup> 'The very fact that a genius like Descartes was completely sidelined in the development leading from Galilei to Newton (...) represents a phase in the struggle—that has so often been repeated throughout history—between the modesty of the approach of mathematical physics, which after precise investigations attempts to describe natural phenomena in increasingly comprehensive schemes using the exact language of mathematics, and the haughtiness of philosophical thought, which wants to comprehend the entire world in one dazzling grasp.' Translation by the author.

## Contents


#### Part I $C_0(X)$ and $B(H)$

#### Part II Between $C_0(X)$ and $B(H)$





## Introduction

After 25 years of confusion and even occasional despair, in March 1926 physicists suddenly had *two* theories of the microscopic world (Heisenberg, 1925; Schrödinger, 1926ab), which hardly could have looked more different. Heisenberg's *matrix mechanics* (as it came to be called a bit later) described experimentally measurable quantities (i.e., "observables") in terms of discrete quantum numbers, and apparently lacked a state concept. Schrödinger's *wave mechanics* focused on unobservable continuous matter waves apparently playing the role of quantum states; at the time the only observable within reach of his theory was the energy. Einstein is even reported to have remarked in public that the two theories excluded each other.

Nonetheless, Pauli (in a letter to Jordan dated 12 April 1926), Schrödinger (1926c) himself, Eckart (1926), and Dirac (1927) argued—it is hard to speak of a complete argument even at a heuristic level, let alone of a mathematical *proof* (Muller, 1997ab)—that in fact the two theories were equivalent! A rigorous equivalence proof was given by von Neumann (1927ab), who (at the age of 23) was the first to unearth the mathematical structure of quantum mechanics as we still understand it today. His effort, culminating in his monograph *Mathematische Grundlagen der Quantenmechanik* (von Neumann, 1932), was based on the abstract concept of a *Hilbert space*, which previously had only appeared in examples (i.e. specific realizations) going back to the work of Hilbert and his school on integral equations.

The novelty of von Neumann's abstract approach may be illustrated by the advice Hilbert's former student Schmidt gave to von Neumann even at the end of the 1920s:

'Nein! Nein! Sagen Sie nicht Operator, sagen Sie Matrix!' (Bernkopf, 1967, p. 346).<sup>5</sup>

Von Neumann proposed that observable quantities be interpreted as (possibly unbounded) self-adjoint operators on some Hilbert space, whilst pure states are realized as rays (i.e. unit vectors up to a phase) in the same space; finally, the inner product provides the probabilities introduced by Born (1926ab). In particular, Heisenberg's observables were operators on $\ell^2(\mathbb{N})$, whereas Schrödinger's wave-functions were unit vectors in $L^2(\mathbb{R}^3)$. A unitary transformation between these Hilbert spaces then provided the mathematical equivalence between their competing theories.
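The content of this equivalence can be made concrete in a finite-dimensional toy model. The Python sketch below is an illustration under assumptions introduced here, not von Neumann's construction: the two-dimensional Hilbert space, the observable `A`, the state `psi`, and the unitary `U` (a Hadamard matrix) are hypothetical choices. It checks that Born probabilities and expectation values coincide in two unitarily related realizations of the same abstract system.

```python
import cmath
import math

def inner(u, v):
    """Inner product <u, v>, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def matvec(m, v):
    """Apply the matrix m to the vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def dagger(m):
    """Conjugate transpose (adjoint) of a matrix."""
    return [[m[j][i].conjugate() for j in range(len(m))] for i in range(len(m[0]))]

def born_probabilities(eigenvectors, psi):
    """Born rule: probability |<e, psi>|^2 for each unit eigenvector e."""
    return [abs(inner(e, psi)) ** 2 for e in eigenvectors]

def expectation(a, psi):
    """Expectation value <psi, a psi> of a self-adjoint matrix a."""
    return inner(psi, matvec(a, psi)).real

# Realization 1 (assumed): the observable is diagonal in the standard basis.
A = [[1 + 0j, 0j], [0j, -1 + 0j]]
basis = [[1 + 0j, 0j], [0j, 1 + 0j]]
psi = [complex(math.cos(0.3)), cmath.exp(0.7j) * math.sin(0.3)]

# A unitary map to realization 2 (assumed: the Hadamard matrix).
s = 1 / math.sqrt(2)
U = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]

A2 = matmul(matmul(U, A), dagger(U))        # transformed observable U A U*
basis2 = [matvec(U, e) for e in basis]      # transformed eigenvectors
psi2 = matvec(U, psi)                       # transformed state

p1 = born_probabilities(basis, psi)
p2 = born_probabilities(basis2, psi2)
```

In the second realization the observable is no longer diagonal, yet `p1` and `p2` agree up to rounding, as do the two expectation values; this is the finite-dimensional shadow of the statement that the two theories make the same empirical predictions.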

<sup>5</sup> 'No! No! You shouldn't say operator, you should say matrix!'

This story is well known, but it is worth emphasizing (cf. Zalamea, 2016, §I.1) that the most significant difference between von Neumann's mathematical axiomatization of quantum mechanics and Dirac's heuristic but beautiful and systematic treatment of the same theory (Dirac, 1930) was not so much the lack of mathematical rigour in the latter—although this point was stressed by von Neumann (1932, p. 2) himself, who was particularly annoyed with Dirac's δ-function and his closely related assumption that every self-adjoint operator can be diagonalized in the naive way of having a basis of eigenvectors—but the fact that Dirac's approach was *relative* to the choice of a (generalized) basis of a Hilbert space, whereas von Neumann's was *absolute*. In this sense, as a special case of his (and Jordan's) general transformation theory, Dirac showed that Heisenberg's matrix mechanics and Schrödinger's wave mechanics were related by a (unitary) transformation, whereas for von Neumann they were two different realizations of his abstract (separable) Hilbert space. In particular, von Neumann's approach *a priori* dispenses with a basis choice altogether; this is precisely the difference between an *operator* and a *matrix* Schmidt alluded to in the above quotation. Indeed, von Neumann's abstract approach (which as a co-founder of functional analysis he shared with Banach, but not with his mentor Hilbert) was remarkable even in mathematics; in physics it must have been dazzling.

It is instructive to compare this situation with special relativity, where, so to speak, Dirac would write down the theory in terms of inertial frames of reference, so as to subsequently argue that due to Poincaré-invariance the physical content of the theory does not depend on such a choice. Von Neumann, on the other hand (had he ever written a treatise on relativity), would immediately present Minkowski's space-time picture of the theory and develop it in a coordinate-free fashion.

However, this analogy is also misleading. In special relativity, all choices of inertial frames are genuinely equivalent, but in quantum mechanics one often does have preferred observables: as Bohr would argue from his Como Lecture in 1927 onwards (Bohr, 1928), these observables are singled out by the choice of some experimental context, and they are jointly measurable iff they commute (see also below). Though not necessarily developed with Bohr's doctrine in mind, Dirac's approach seems tailor-made for this situation, since his basis choice is equivalent to a choice of "preferred" physical observables, namely those that are diagonal in the given basis (for Heisenberg this was energy, while for Schrödinger it was position).

Von Neumann's abstract approach can deal with preferred observables and experimental contexts, too, though the formalism for doing so is more demanding. Namely, for reasons ranging from quantum theory to ergodic theory via unitary group representations on Hilbert space, from 1930 onwards von Neumann developed his theory of "rings of operators" (nowadays called *von Neumann algebras*), partly in collaboration with his assistant Murray (von Neumann, 1930, 1931, 1938, 1940, 1949; Murray & von Neumann, 1936, 1937, 1943). For us, at least at the moment, the point is that Dirac's diagonal observables are formalized by *maximal commutative von Neumann algebras* $A$ on some Hilbert space. These often come naturally with some specific realization of a Hilbert space; for example, on Heisenberg's Hilbert space $\ell^2(\mathbb{N})$ one has $A_d = \ell^\infty(\mathbb{N})$, while Schrödinger's $L^2(\mathbb{R}^3)$ is host to $A_c = L^\infty(\mathbb{R}^3)$, both realized as multiplication operators (cf. Proposition B.73).
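A finite-dimensional analogue may clarify what "maximal commutative" means here. On an n-dimensional Hilbert space the diagonal matrices form a commutative algebra (the counterpart of bounded sequences acting by multiplication), and any matrix that commutes with all of them is itself diagonal, so the algebra cannot be enlarged without losing commutativity. The Python sketch below checks this for n = 3; the helper names and the two test matrices are hypothetical choices made purely for illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commute(a, b, tol=1e-12):
    """True if ab = ba up to the tolerance tol."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return all(abs(ab[i][j] - ba[i][j]) <= tol
               for i in range(n) for j in range(n))

def diagonal_projection(n, i):
    """The diagonal matrix with a single 1 in position (i, i)."""
    return [[1.0 if r == c == i else 0.0 for c in range(n)] for r in range(n)]

def in_commutant_of_diagonals(m):
    """True if m commutes with every diagonal matrix.

    The projections diagonal_projection(n, i) span the diagonal matrices,
    so it suffices to test commutation with each of them.
    """
    n = len(m)
    return all(commute(m, diagonal_projection(n, i)) for i in range(n))

def is_diagonal(m, tol=1e-12):
    """True if all off-diagonal entries of m vanish up to tol."""
    n = len(m)
    return all(abs(m[i][j]) <= tol
               for i in range(n) for j in range(n) if i != j)

diag = [[2.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 5.0]]
off = [[2.0, 1.0, 0.0], [1.0, -1.0, 0.0], [0.0, 0.0, 5.0]]
```

Here `in_commutant_of_diagonals(diag)` holds while `in_commutant_of_diagonals(off)` fails, in line with the fact that the commutant of the diagonal algebra is the diagonal algebra itself.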

Although the second (1931) paper in the above list shows that von Neumann was well aware of the importance of the commutative case of his theory of operator algebras, he—perhaps deliberately—missed the link with Bohr's ideas. As explained in the remainder of this Introduction, providing this link is one of the main themes of this book, but we will do so using the more powerful formalism of *C\*-algebras*. Introduced by Gelfand & Naimark (1943), these are abstractions and generalizations of von Neumann algebras, so abstract indeed that Hilbert spaces are not even mentioned in their definition. Nonetheless, C\*-algebras remain very closely tied to Hilbert spaces through the GNS-construction originating with Gelfand & Naimark (1943) and Segal (1947b), which implies that any C\*-algebra is isomorphic to a well-behaved algebra of bounded operators on some Hilbert space (see §C.12).

Starting with Segal (1947a), C\*-algebras have become an important tool in mathematical physics, where traditionally most applications have been to quantum systems with infinitely many degrees of freedom, such as quantum statistical mechanics in infinite volume (Ruelle, 1969; Israel, 1979; Bratteli & Robinson, 1981; Haag, 1992; Simon, 1993) and quantum field theory (Haag, 1992; Araki, 1999).

Although we draw on the first body of literature, and were at least influenced by the second, the present book employs C\*-algebras in a rather different fashion, in that we exploit the unification they provide of the commutative and the noncommutative "worlds" into a single mathematical framework (where one should note that as far as physics is concerned, the commutative or classical case is not purely C\*-algebraic in character, because one also needs a Poisson structure, see Chapter 3). This unified language (supplemented by some category theory, group(oid) theory, and differential geometry) gives a mathematical handle on Wittgenstein's *Aspektwechsel* between classical and quantum-mechanical modes of description (see Preface), which in our view lies at the heart of the foundations of quantum physics. This "change of perspective", which roughly speaking amounts to switching (and interpolating) between commutative and noncommutative C\*-algebras, is *added* to Dirac's transformation theory (which comes down to switching between generalized bases, or, equivalently, between maximal commutative von Neumann algebras).

The central conceptual importance of the *Aspektwechsel* for this book in turn derives from our adherence to Bohr's *doctrine of classical concepts*, which forms part of the *Copenhagen Interpretation* of quantum mechanics (here defined strictly as a body of ideas shared by Bohr and Heisenberg). We let the originators speak:

'It is decisive to recognize that, however far the phenomena transcend the scope of classical physical explanation, the account of all evidence must be expressed in classical terms. The argument is simply that by the word *experiment* we refer to a situation where we can tell others what we have done and what we have learned and that, therefore, the account of the experimental arrangements and of the results of the observations must be expressed in unambiguous language with suitable application of the terminology of classical physics.' (Bohr, 1949, p. 209)

'The Copenhagen interpretation of quantum theory starts from a paradox. Any experiment in physics, whether it refers to the phenomena of daily life or to atomic events, is to be described in the terms of classical physics. The concepts of classical physics form the language by which we describe the arrangement of our experiments and state the results. We cannot and should not replace these concepts by any others.' (Heisenberg 1958, p. 44)

The last quotation even opens Heisenberg's only systematic presentation of the Copenhagen Interpretation, which forms Chapter III of his Gifford Lectures from 1955; apparently this was the first occasion where the name "Copenhagen Interpretation" was used (Howard, 2004). In our view, several other defining claims of the Copenhagen Interpretation appear to be less well founded, if not unwarranted, although they may have been understandable in the historical context where they were first proposed (in which the new theory of quantum mechanics needed to get going even in the face of the foundational problems that all of the originators—including Bohr and Heisenberg—were keenly aware of). These spurious claims include:

• The emphatic rejection of the possibility to analyze what is going on during measurements, as expressed in typical Bohr parlance by claims like:

'According to the quantum theory, just the impossibility of neglecting the interaction with the agency of measurement means that every observation introduces a new uncontrollable element.' (Bohr, 1928, p. 584),

or, with similar (but somehow less off-putting) dogmatism by Heisenberg:

'So we cannot completely objectify the result of an observation' (1958, p. 50).

• The closely related interpretation of quantum-mechanical states (which Heisenberg indeed referred to as "probability functions") as mere catalogues of the probabilities attached to possible outcomes of experiments, as in:

'what one deduces from observation is a probability function, a mathematical expression that combines statements about possibilities or tendencies with statements about our knowledge of facts' (Heisenberg 1958, p. 50),

In addition, there are two ingredients of the avowed Copenhagen Interpretation Bohr and Heisenberg actually seem to have disagreed about. These include:


Let us now review the philosophical motivation Bohr and Heisenberg gave for their mutual doctrine of classical concepts. First, Bohr (in his typical convoluted prose):

'The elucidation of the paradoxes of atomic physics has disclosed the fact that the unavoidable interaction between the objects and the measuring instruments sets an absolute limit to the possibility of speaking of a behavior of atomic objects which is independent of the means of observation. We are here faced with an epistemological problem quite new in natural philosophy, where all description of experience has so far been based on the assumption, already inherent in ordinary conventions of language, that it is possible to distinguish sharply between the behavior of objects and the means of observation. This assumption is not only fully justified by all everyday experience but even constitutes the whole basis of classical physics. (...) As soon as we are dealing, however, with phenomena like individual atomic processes which, due to their very nature, are essentially determined by the interaction between the objects in question and the measuring instruments necessary for the definition of the experimental arrangement, we are, therefore, forced to examine more closely the question of what kind of knowledge can be obtained concerning the objects. In this respect, we must, on the one hand, realize that the aim of every physical experiment—to gain knowledge under reproducible and communicable conditions—leaves us no choice but to use everyday concepts, perhaps refined by the terminology of classical physics, not only in all accounts of the construction and manipulation of the measuring instruments but also in the description of the actual experimental results. On the other hand, it is equally important to understand that just this circumstance implies that no result of an experiment concerning a phenomenon which, in principle, lies outside the range of classical physics can be interpreted as giving information about independent properties of the objects.'

This text has been taken from Bohr (1958, p. 25), but very similar passages appear in many of Bohr's writings from his famous Como Lecture (Bohr, 1928) onwards. In other words, the (supposedly) unavoidable interaction between the objects and the measuring instruments, which for Bohr represents *the* characteristic feature of quantum mechanics (and which we would now express in terms of entanglement, of which concept Bohr evidently had an intuitive grasp), threatens the objectivity of the description that is characteristic of (if not the defining property of) classical physics. However, this threat can be countered by describing quantum mechanics through classical physics, which (or so the argument goes) restores objectivity. Elsewhere, we see Bohr also insisting on the need for classical concepts in *defining* any meaningful theory whatsoever, as these are the only concepts we really understand (though, as he always insists, classical concepts are at the same time challenged by quantum theory, as a consequence of which their use is necessarily limited).

Although Heisenberg's arguments for the necessity of classical concepts start similarly, they eventually take a conspicuously different direction from Bohr's:

'To what extent, then, have we finally come to an objective description of the world, especially of the atomic world? In classical physics science started from the belief—or should one say from the illusion?—that we could describe the world or at least parts of the world without any reference to ourselves. This is actually possible to a large extent. We know that the city of London exists whether we see it or not. It may be said that classical physics is just that idealization in which we can speak about parts of the world without any reference to ourselves. Its success has led to the general ideal of an objective description of the world. Objectivity has become the first criterion for the value of any scientific result. Does the Copenhagen interpretation of quantum theory still comply with this ideal? One may perhaps say that quantum theory corresponds to this ideal as far as possible. Certainly quantum theory does not contain genuine subjective features, it does not introduce the mind of the physicist as a part of the atomic event. But it starts from the division of the world into the object and the rest of the world, and from the fact that at least for the rest of the world we use the classical concepts in our description. This division is arbitrary and historically a direct consequence of our scientific method; the use of the classical concepts is finally a consequence of the general human way of thinking. But this is already a reference to ourselves and in so far our description is not completely objective. (...)

The concepts of classical physics are just a refinement of the concepts of daily life and are an essential part of the language which forms the basis of all natural science. Our actual situation in science is such that we do use the classical concepts for the description of the experiments, and it was the problem of quantum theory to find theoretical interpretation of the experiments on this basis. There is no use in discussing what could be done if we were other beings than we are. (...)

Natural science does not simply describe and explain nature; it is a part of the interplay between nature and ourselves; it describes nature as exposed to our method of questioning.' (Heisenberg, 1958, pp. 55–56, 56, 81)

The well-known last part may indeed have been the source of the crucial 'I'm the one who knocks' episode in the superb TV series *Breaking Bad* (whose criminal main character operates under the cover name of "Heisenberg"). This is worth mentioning here, because Heisenberg (and to a lesser extent also Bohr) displays a puzzling mixture of the hubris of claiming that quantum mechanics has restored Man's position at the center of the universe and the modesty of recognizing that nonetheless Man has to know his limitations (in necessarily relying on the classical concepts he happens to be familiar with at the current state of evolution and science).

Our own reasons for favoring the doctrine of classical concepts are threefold. The first is closely related to Heisenberg's and may be expressed even better by the following passage from a book by the renowned Dutch primatologist Frans de Waal:

'*Die Verwandlung* [i.e., *The Metamorphosis* by Franz Kafka, in which Gregor Samsa famously wakes up to find himself transformed into an insect], published in 1915, was an unusual take-off for a century in which anthropocentrism declined. For metaphorical reasons, the author had picked a repulsive creature, forcing us from the first page onwards to feel what it would be like to be an insect. Around the same time, the German biologist Jakob von Uexküll drew attention to the fact that each particular species has its own perspective, which he called its *Umwelt*. To illustrate this new idea, Uexküll took his readers on a tour through the worlds of various creatures. Each organism observes its environment in its own peculiar way, he argued. A tick, which has no eyes, climbs onto a grass blade, where it awaits the scent of butyric acid off the skin of mammals that pass by. Experiments have demonstrated that ticks may survive without food for as long as 18 years, so that a tick has ample time to wait for her prey, jump on it, and suck its warm blood, after which she is ready to lay her eggs and die. Are we in a position to understand the *Umwelt* of a tick? It seems unbelievably poor compared to ours, but Uexküll regarded its simplicity rather as a strength: ticks have set themselves a narrow goal and hence cannot easily be distracted. Uexküll analysed many other examples, and showed how a single environment offers hundreds of different realities, each of which is unique for some given species. (...) Some animals merely register ultraviolet light, others live in a world of odors, or of touch, like a star nose mole. Some animals sit on a branch of an oak, others live underneath the bark of the same oak, whilst a fox family digs a hole underneath its roots. Each animal observes the tree differently.' (De Waal, 2016, pp. 15–16. Translation by the author).

Indeed, it is hardly an accident that De Waal preceded this passage by a quotation from Heisenberg almost identical to the last one above.

A second argument in favour of the doctrine lies in the possibility of a peaceful outcome of the Bohr–Einstein debate, or at least of an important part of it; cf. Landsman (2006a), which was inspired by earlier work of Raggio (1981, 1988) and Bacciagaluppi (1993). This debate initially centered on Einstein's attempts to debunk the Heisenberg uncertainty relations, and subsequently, following Einstein's grudging acceptance of their validity, entered its most famous and influential phase, in which Einstein tried to prove that quantum mechanics, although admittedly *correct*, was *incomplete*. One could argue that both antagonists eventually lost this part of the debate, since Einstein's goal of a local realistic (quantum) physics was quashed by the famous work of Bell (1964), whereas against Bohr's views, deterministic versions of quantum mechanics such as Bohmian mechanics and the Everett (i.e. Many Worlds) Interpretation turned out to be at least logical possibilities.

However incompatible the views of Einstein and Bohr on physics and its goals may have been, unknown to them a common battleground did in fact exist and could even have led to a reconciliation of at least the epistemological views of the great adversaries. The common ground referred to concerns the problem of *objectification*, which at first sight Bohr and Einstein approached in completely different ways:


'The belief in an external world independent of the perceiving subject is the basis of all natural science.' (Einstein, 1954, p. 266).

On a suitable mathematical interpretation, these conditions for the objectification of the system turn out to be equivalent! Namely, identifying Bohr's apparatus with Einstein's perceiving subject, calling its algebra of observables *A*, and denoting the algebra of observables of the quantum system to be objectified by *B*, our reading of the doctrine of classical concepts (to be explained in more detail below) is simply that *A* be commutative. Einstein, on the other hand, insists that the system under observation has its own state, so that there must be no entangled states on the tensor product *A*⊗*B* that describes the composite system. Equivalently, every pure state on *A*⊗*B* must be a product state, so that both *A* and *B* have states that together determine the joint state of *A*⊗*B*. This is the case if and only if *A* or *B* is commutative, and since *B* is taken to be a quantum system, it must be *A* (see the notes to §6.5 for details). Thus Bohr's objectification criterion turns out to coincide with Einstein's!
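The dichotomy can be made concrete in the smallest possible case. The following is only an illustrative sketch: the choices $A=\mathbb{C}^2$ (a classical two-state pointer) and $B=M_2(\mathbb{C})$ (a single qubit) are our own assumptions, picked for simplicity. With *A* commutative one has

$$\mathbb{C}^2 \otimes M_2(\mathbb{C}) \cong M_2(\mathbb{C}) \oplus M_2(\mathbb{C}), \qquad \omega(a \otimes b) = a(i)\, \langle \psi, b\psi \rangle \quad (i \in \{1,2\},\ \psi \in \mathbb{C}^2,\ \|\psi\| = 1),$$

since a pure state on a direct sum is supported on a single summand; hence every pure state is a product of the point evaluation at *i* and the vector state of ψ. If instead both factors are taken to be $M_2(\mathbb{C})$, the unit vector $\Phi = (e_1 \otimes e_1 + e_2 \otimes e_2)/\sqrt{2}$ defines a pure state ω on $M_2(\mathbb{C}) \otimes M_2(\mathbb{C})$ that fails to factorize: for the projection $p = \mathrm{diag}(1,0)$,

$$\omega(p \otimes p) = \tfrac{1}{2} \neq \tfrac{1}{4} = \omega(p \otimes 1)\, \omega(1 \otimes p).$$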

Thirdly, the doctrine of classical concepts describes all known applications to date of quantum theory to experimental physics; and therefore we simply have to use it if we are interested in understanding these applications. This is true for the entire range of empirically accessible energy and length scales, from molecular and condensed matter physics (including quantum computation) to high-energy physics (in colliders as well as in the context of astro-particle physics). So if people working in a field like quantum cosmology complain about the Copenhagen Interpretation then perhaps they should ask themselves if their field is more than a chimera.

Given its clear empirical relevance, it is a moot point whether the doctrine of classical concepts is as necessary as Bohr and Heisenberg claimed it was:

'In their attempts to formulate the general content of quantum mechanics, the representatives of the Copenhagen School often used formulations with which they do not merely say how things *are* in their opinion, but beyond that, they say that things *must* be thus and so (. . . ) They chose formulations for the mere communication of an item in which at the same time the inevitability of what is communicated is asserted. (. . . ) The assertion of the necessity of a proposition adds *nothing* to its content.' (Scheibe, 2001, pp. 402–403)

The doctrine of classical concepts implies in particular that the measuring apparatus is to be described classically; indeed, along with its coupling to the system undergoing measurement, it is its classical description which turns some device (which *a priori* is a quantum system like anything else) into a measuring apparatus. This point was repeated over and over by Bohr and Heisenberg, but in our view the clearest explanation of this crucial point has been given by Scheibe:

'It is necessary to avoid any misunderstanding of the buffer postulate [i.e., the doctrine of classical concepts], and in particular to emphasize that the requirement of a classical description of the apparatus is not designed to set up a special class of objects differing fundamentally from those which occur in a quantum phenomenon as the things examined rather than measuring apparatus. This requirement is essentially epistemological, and affects this object only *in its role as apparatus*. A physical object which may act as apparatus may in principle also be the thing examined. (. . . ) The apparatus is governed by classical physics, the object by the quantum-mechanical formalism.' (Scheibe, 1973, p. 24–25)

Thus it is essential to the Copenhagen Interpretation that one can describe at least some quantum-mechanical devices classically: those for which this is possible include the candidate-apparatuses (i.e. measuring devices). In view of its importance for their interpretation of quantum mechanics, it is remarkable how little Bohr, Heisenberg, and their followers did to seriously address this problem of a dual description of at least part of the world, although they were clearly aware of this need:

'In the system to which the quantum mechanical formalism is to be applied, it is of course possible to include any intermediate auxiliary agency employed in the measuring process. Since, however, all those properties of such agencies which, according to the aim of measurements have to be compared with the corresponding properties of the object, must be described on classical lines, their quantum mechanical treatment will for this purpose be essentially equivalent with a classical description.' (Bohr, 1939, pp. 23–24; quotation taken from Camilleri & Schlosshauer, 2015, p. 79)

In defense of this alleged equivalence, we read almost circular explanations like:

'the necessity of basing the description of the properties and manipulation of the measuring instruments on purely classical ideas implies the neglect of all quantum effects in that description.' (Bohr, 1939, p. 19)

Since it delineates an appropriate regime, the following is slightly more informative:

'Incidentally, it may be remarked that the construction and the functioning of all apparatus like diaphragms and shutters, serving to define geometry and timing of the experimental arrangements, or photographic plates used for recording the localization of atomic objects, will depend on properties of materials which are themselves essentially determined by the quantum of action. Still, this circumstance is irrelevant for the study of simple atomic phenomena where, in the specification of the experimental conditions, we may to a very high degree of approximation disregard the molecular constitution of the measuring instruments. If only the instruments are sufficiently heavy compared with the atomic objects under investigation, we can in particular neglect the requirement of the [uncertainty] relation as regards the control of the localization in space and time of the single pieces of the apparatus relative to each other.' (Bohr, 1948, pp. 315–316).

Even Heisenberg restricted himself to very general comments like:

'This follows mathematically from the fact that the laws of quantum theory are for the phenomena in which Planck's constant can be considered as a very small quantity, approximately identical with the classical laws.' (Heisenberg, 1958, p. 57).

Notwithstanding these vague or even circular explanations, the connection between classical and quantum mechanics was at the forefront of research in the early days of quantum theory, and even predated quantum mechanics. For example, Jammer (1966, p. 109) notes that already in 1906 Planck suggested that

'the classical theory can simply be characterized by the fact that the quantum of action becomes infinitesimally small.'

In fact, in the same context as Planck, namely his radiation formula, Einstein made a similar point already in 1905. Subsequently, Bohr's *Correspondence Principle*, which originated in the context of atomic radiation, suggested an asymptotic relationship between quantum mechanics and classical electrodynamics. As such, it played a major role in the creation of quantum mechanics (Bohr, 1976; Jammer, 1966; Mehra & Rechenberg, 1982; Hendry, 1984; Darrigol, 1992), but the contemporary (and historically inaccurate) interpretation of the Correspondence Principle as the idea that all of classical physics should be a certain limiting case of quantum physics seems of much later date (cf. Landsman, 2007a; Bokulich, 2008).

Ironically, the possibility of giving a dual classical–quantum description of measurement apparatuses, though obviously crucial for the consistency of the Copenhagen Interpretation, simply seems to have been taken for granted, whereas also the more ambitious problem of explaining at least the appearance of the classical world (i.e. beyond measurement devices) from quantum theory—which is central to current research in the foundations of quantum mechanics—is not to be found in the writings of Bohr (who, after all, saw the explanation of experiments as his job).

Perhaps Heisenberg could have used the excuse that he regarded the problem as solved by his 1927 paper on the uncertainty relations; but on both technical and conceptual grounds it would have been a feeble excuse. One of the few expressions of at least some dissatisfaction with the situation from within the Copenhagen school—if phrased ever so mildly—came from Bohr's former research associate Landau:

'Thus quantum mechanics occupies a very unusual place among physical theories: it contains classical mechanics as a limiting case, yet at the same time it requires this limiting case for its own formulation.' (Landau & Lifshitz, 1977, p. 3)

In other words, the relationship between the (generalized) Correspondence Principle and the doctrine of classical concepts needs to be clarified, and such a clarification should hopefully also provide the key for the solution of the grander problem of deriving the classical world from quantum theory under appropriate conditions.

As a first step to this end, Bohr's conceptual ideas should be interpreted within the formalism of quantum mechanics before they can be applied to the physical world, an intermediate step Bohr himself seems to have considered superfluous:

'I noticed that mathematical clarity had in itself no virtue for Bohr. He feared that the formal mathematical structure would obscure the physical core of the problem, and in any case, he was convinced that a complete physical explanation should absolutely precede the mathematical formulation.' (Heisenberg, 1967, p. 98)

Fortunately, von Neumann did not return the compliment, since beyond its brilliant mathematical content, his *Mathematische Grundlagen der Quantenmechanik* from 1932 devoted considerable attention to conceptual issues. For example, he gave the most general form of the Born rule (which is the central link between experimental physics and the Hilbert space formalism), he introduced density operators for quantum statistical mechanics (which are still in use), he conceptualized projection operators as yes-no questions (paving the way for his later development of quantum logic with Birkhoff, as well as for Gleason's Theorem and the like), in his analysis of hidden variables he introduced the mathematical concept of a state that became pivotal in operator algebras (including the algebraic approach to quantum mechanics), *en passant* also preparing the ground for the theorems of Bell and Kochen & Specker (which exclude hidden variables under physically more relevant assumptions than von Neumann's), and, last but not least, his final chapter on the measurement problem formed the basis for all serious subsequent literature on this topic.

Nonetheless, much as Bohr's philosophy of quantum mechanics would benefit from a precise mathematical interpretation, von Neumann's mathematics would be more effective in physics if it were supplemented by sound conceptual moves (beyond the ones he provided himself). Killing two birds with one stone, we implement the doctrine of classical concepts in the language of operator algebras, as follows:

#### *The physically relevant aspects of the noncommutative operator algebras of quantum-mechanical observables are only accessible through commutative algebras.*

Our *Bohrification program*, then, splits into two parts, which are distinguished by the precise relationship between a given noncommutative operator algebra *A* (representing the observables of some quantum system, as detailed below) and the commutative operator algebras (i.e. classical contexts) that give physical access to *A*.

While delineated mathematically, these two branches also reflect an unresolved conceptual disagreement between Bohr and Heisenberg about the status of classical concepts (Camilleri, 2009b). According to Bohr—haunted by his idea of Complementarity—only one classical concept (or one coherent family of classical concepts) applies to the experimental study of some quantum object at a time. If it applies, it does so exactly, and has the same meaning as in classical physics; in Bohr's view, any other meaning would be undefined. In a different experimental setup, some other classical concept may apply. Examples of such "complementary" pairs are particle versus wave (an example Bohr stopped using after a while), spacetime description versus "causal description" (by which Bohr means conservation laws), and, in his later years, one "phenomenon" (i.e., an indivisible unit of a quantum object plus an experimental arrangement) against another. For example:

'My main purpose (...) is to emphasize that in the phenomena concerned we are (...) dealing with a rational discrimination between essentially different experimental arrangements and procedures which are suited either for an unambiguous use of the idea of space location, or for a legitimate application of the conservation theorem of momentum (...) which therefore in this sense may be considered as *complementary* to each other (...) Indeed we have in each experimental arrangement suited for the study of proper quantum phenomena not merely to do with an ignorance of the value of certain physical quantities, but with the impossibility of defining these quantities in an unambiguous way.' (Bohr, 1935, p. 699).

Heisenberg, on the other hand, seems to have held a more relaxed attitude towards classical concepts, perhaps inspired by his famous 1925 paper on the quantum-mechanical reinterpretation (*Umdeutung*) of mechanical and kinematical relations, followed by his equally great paper from 1927 already mentioned. In the former, he introduced what we now call *quantization*, in putting the observables of classical physics (i.e. functions on phase space) on a new mathematical footing by turning them into what we now call operators (initially in the form of infinite matrices), where they also have new properties. In the latter, Heisenberg tried to find some operational meaning of these operators through measurement procedures. Since quantization applies to all classical observables at once, all classical concepts apply simultaneously, but approximately (ironically, like most research on quantum theory at the time, the 1925 paper was inspired by Bohr's Correspondence Principle).

To some extent, then, Bohr's view on classical concepts comes back mathematically in *exact Bohrification*, which studies (unital) commutative C\*-subalgebras *C* of a given (unital) noncommutative C\*-algebra *A*, whereas Heisenberg's interpretation of the doctrine resurfaces in *asymptotic Bohrification*, which involves asymptotic inclusions (more specifically, deformations) of commutative C\*-algebras into noncommutative ones. So the latter might have been called *Heisenbergification* instead, but in view of both the ugliness of this word and the historical role played by Bohr's Correspondence Principle just alluded to, the given name has stuck.

The precise relationship between Bohr's and Heisenberg's views, and hence also between exact and asymptotic Bohrification, remains to be clarified; their joint existence is unproblematic, however, since the two programs complement each other.

	- *The Born rule* (for single case probabilities).
	- *Gleason's Theorem* (which justifies von Neumann's notion of a state as a positive linear expectation value, assuming the operator part of quantum theory).
	- *The Kochen–Specker Theorem* (excluding non-contextual hidden variables).
	- *The Kadison–Singer Conjecture* (concerning uniqueness of extensions of pure states from maximal commutative C\*-subalgebras of the algebra *B*(*H*) of all bounded operators on a separable Hilbert space *H* to *B*(*H*)).
	- *Wigner's Theorem* (on unitary implementation of symmetries of pure states with transition probabilities, and its analogues for other quantum structures).
	- *Quantum logic* (which, if one adheres to the doctrine of classical concepts, turns out to be intuitionistic and hence distributive, rather than orthomodular).
	- *The topos-theoretic approach* to quantum mechanics (which from our point of view encompasses quantum logic and implies the preceding claim).
	- *The classical limit of quantum mechanics*.
	- *The Born rule* (for probabilities measured as long-run frequencies).
	- *The infinite-volume limit of quantum statistical mechanics*.
	- *Spontaneous symmetry breaking* (SSB).
	- *The Measurement Problem* (highlighted by Schrödinger's Cat).

On the philosophical side, the limiting procedures inherent in asymptotic Bohrification may be seen in the light of the (alleged) phenomenon of *emergence*. From the philosophical literature, we have distilled two guiding thoughts which, in our opinion, should control the use of limits, idealizations, and emergence in physics and hence play a paramount role in this book. The first is *Earman's Principle*:

'While idealizations are useful and, perhaps, even essential to progress in physics, a sound principle of interpretation would seem to be that no effect can be counted as a genuine physical effect if it disappears when the idealizations are removed.' (Earman, 2004, p. 191)

The second is *Butterfield's Principle*, which in a sense is a corollary to Earman's Principle, and should be read in the light of Butterfield's own definition of emergence as 'behaviour that is novel and robust relative to some comparison class', which among other virtues removes the reduction-emergence opposition:

"there is a weaker, yet still vivid, novel and robust behaviour that occurs before we get to the limit, i.e. for finite *N*. And it is this weaker behaviour which is physically real." (Butterfield, 2011, p. 1065)

Indeed, the link between theory and reality stands or falls with an adherence to these principles, for real materials (like a ferromagnet or a cat) are described by the *quantum* theory of *finite* systems (i.e., ħ > 0 or *N* < ∞, as opposed to their idealized limiting cases ħ = 0 or *N* = ∞), and yet they do display the remarkable phenomena that strictly speaking are only possible in the corresponding limit theories, like symmetry breaking, or the fact that cats are either dead or alive, as a metaphor for the fact that measurements have outcomes. This simple observation shows that any physically relevant conclusion drawn from some idealization must be foreshadowed in the underlying theory already for positive values of ħ or finite values of *N*.

Despite their obvious validity, it is remarkable how often idealizations violate these principles. For example, all rigorous theories of spontaneous symmetry breaking in quantum statistical mechanics (Bratteli & Robinson, 1981) and in quantum field theory (Haag, 1992) strictly apply to infinite systems only, since ground states of finite quantum systems are typically unique (and hence symmetric), whilst thermal equilibrium states of such systems are even always unique (see also Chapter 10). As explained in Chapter 11, the "Swiss" approach to the measurement problem based on superselection rules faces a similar problem, and must be discarded for that reason. Bohr's doctrine of classical concepts is particularly vulnerable to Earman's Principle, since classical physics (in whose language we are supposed to express the account of all evidence) is not realized in nature but only in the human mind, so to speak. This necessitates great care in implementing this doctrine.

Interestingly, in his famous lecture "Über das Unendliche", in which he expounded his finitary program intended to save mathematics against the devilish intuitionist challenge of L.E.J. Brouwer, Hilbert (1925) expressed similar principles controlling the use of infinite idealizations in mathematics:

"Und so wie bei den Grenzprozessen der Infinitesimalrechnung das Unendliche im Sinne des Unendlichkleinen und des Unendlichgroßen sich als eine bloße Redensart erweisen ließ, so müssen wir auch das Unendliche im Sinne der Unendlichen Gesamtheit, wo wir es jetzt noch in den Schlußweisen vorfinden, als etwas bloß Scheinbares erkennen. Und so wie das Operieren mit dem Unendlichkleinen durch Prozesse im Endlichen ersetzt wurde, welche ganz dasselbe leisten und zu ganz denselben eleganten formalen Beziehungen führen, so müssen überhaupt die Schlußweisen mit dem Unendlichen durch endliche Prozesse ersetzt werden, die gerade dasselbe leisten, d.h. dieselben Beweisgänge und dieselben Methoden der Gewinnung von Formeln und Sätzen ermöglichen." (Hilbert, 1925, p. 162). <sup>6</sup>

In addition, asymptotic Bohrification has three rather more technical roots:


<sup>6</sup> 'Just as in the limit processes of the infinitesimal calculus, the infinite in the sense of the infinitely large and the infinitely small proved to be merely a figure of speech, so too we must realize that the infinite in the sense of an infinite totality, where we still find it in deductive methods, is an illusion. Just as operations with the infinitely small were replaced by operations with the finite which yielded exactly the same results and led to exactly the same elegant formal relationships, so in general must deductive methods based on the infinite be replaced by finite procedures which yield exactly the same results, i.e., which make possible the same chains of proofs and the same methods of getting formulas and theorems.' (Benacerraf & Putnam, 1983, p. 184).

This book is organized into two parts. Rather than following the partition of our approach into exact and asymptotic Bohrification, these parts reflect the (mathematical) sophistication of the material, starting with finite sets, and ending with a combination of C\*-algebras and topos theory. Part I, called *C*<sub>0</sub>(*X*) *and B*(*H*), gives a mathematical introduction to both classical and quantum mechanics from an operator-algebraic point of view, in which these theories are kept separate, whilst mathematical analogies are stressed whenever possible. This part emphasizes the notion of symmetry, and includes some of the main abstract mathematical results about quantum mechanics (i.e., those not involving the study of Schrödinger operators and concrete models), such as the Born rule, the theorems of Gleason and Kochen & Specker already mentioned, the one of Wigner (on symmetries) and its numerous derivatives, including a new one on unitary implementability of symmetries of the poset 𝒞(*B*(*H*)) of unital commutative C\*-subalgebras of *B*(*H*), and Stone's Theorem on unitary implementability of time evolution in quantum mechanics. This part may also serve as a reference for such fundamental theorems about quantum mechanics. An unusual ingredient of this part is our discussion of the Kadison–Singer Conjecture, included because of its fit into (exact) Bohrification. Also elsewhere, results are (re)phrased in a language appropriate to this ideology.

Experts in the C\*-algebraic approach to quantum mechanics will be able to read the second part independently of the first (which they might therefore skip if they find it to be too elementary), but the spirit of Bohrification will only be instilled in the reader if (s)he reads the entire book; indeed, it is this very spirit that keeps the two parts together and turns the book into a whole. Part II, entitled *Between C*<sub>0</sub>(*X*) *and B*(*H*), starts with a survey of some known results on the grey area between classical and quantum, such as Bell's Theorem(s) and the so-called Free Will Theorem. It then embarks on the asymptotic Bohrification program, including (deformation) quantization and the classical limit (including a small excursion into indistinguishable particles), large systems and their (thermodynamic) limit, and the Born rule (revisited). This part centers on a somewhat idiosyncratic treatment of spontaneous symmetry breaking (SSB) and the closely related measurement problem of quantum mechanics, which is given an unusual but technically precise formulation in the spirit of the Copenhagen Interpretation, and hence is meant to be relevant to actual experimental physics (which is what the Copenhagen Interpretation covers).

Our treatment of both quantization and SSB relies mathematically on continuous bundles of C\*-algebras, while the principles of Earman and Butterfield provide philosophical guidance. This is also true for our approach to the measurement problem, which combines elements of quantization and SSB. Although experiments and detailed theoretical models are lacking so far, this powerful combination of mathematical and philosophical tools leads to a compelling scenario for solving the measurement problem, harboring the hope of finally laying this problem to rest. Like dynamical collapse models that require modifications of quantum mechanics, our scenario looks at the wave-function realistically, and hence describes measurement as a physical process, including the collapse that settles the outcome (as opposed to reinterpretations of the uncollapsed state, as in modal or Everettian interpretations). However, in our approach collapse takes place within unitary quantum theory.

Insolubility theorems for the measurement problem are circumvented, because these rely on the counterfactual that *if* ψ<sub>*n*</sub> *were* the initial state, then *for each n* it *would* evolve (linearly) according to the Schrödinger equation with *given* Hamiltonian *h*, whereas *if* the initial state *were* ∑<sub>*n*</sub> *c*<sub>*n*</sub>ψ<sub>*n*</sub>, also then it *would* evolve according to the *same* Hamiltonian *h*. However, Butterfield's Principle implies that this counterfactual is inapplicable precisely in the measurement situations it is meant for, because the dual description of the apparatus as both classical and quantum-mechanical causes extreme sensitivity of the wave-function to even the tiniest perturbations of the Hamiltonian. Indeed, such perturbations dynamically enforce some particular outcome of the measurement. Our scenario also rejects the typical way of looking at measurement as a two-step process (going back to von Neumann himself and widely adopted in the literature ever since), i.e., of firstly a transition of a pure state to a mixed one (this is his ill-fated "process 1"), followed by the registration of a single outcome. In real measurements (like elsewhere), pure states remain pure! If our scenario is correct, the mistaken impression that quantum theory implies the irreducible randomness of nature then arises because measurement outcomes are merely unpredictable "for all practical purposes"; indeed, they are unpredictable in a way that dwarfs even the apparent randomness of classical chaotic systems.

The final chapter on topos theory and quantum logic elaborates on ideas originating with Isham and Butterfield. It centers on the poset 𝒞(*A*) of all unital commutative C\*-subalgebras of a unital C\*-algebra *A*, ordered by inclusion; with some goodwill, one might call 𝒞(*A*) the mathematical home of Complementarity (although the construction applies even when *A* itself is commutative). The power of this poset is already clear in Part I, where the special case *A* = *B*(*H*) leads to a new version of Wigner's Theorem on unitary implementability of symmetries. Hamhalter's Theorem, which is a far-reaching generalization of this version, then shows that 𝒞(*A*) carries at least as much information about *A* as the pure state space. Furthermore, 𝒞(*A*) enforces a (new) notion of quantum logic that turns out to be *intuitionistic* in being distributive but denying the law of the excluded middle (on which both classical logic and the non-distributive quantum logic of Birkhoff–von Neumann are based). Finally, 𝒞(*A*) gives rise to a quantum phase space (which is lacking in the usual formalism), on which observables are functions and states are probability measures, just like in classical physics (but now "internal" to a particular topos, i.e., a mathematical universe alternative to set theory, in which logic is typically intuitionistic).

About a third of the book is devoted to mathematical appendices. Those on functional analysis and operator algebras give thorough introductions to these subjects, sparing the reader the effort to study books like Bratteli & Robinson (1981), Conway (2007), Dudley (1989), Kadison & Ringrose (1983, 1986), Lance (1995), Pedersen (1989), Reed & Simon (1972), Schmüdgen (2012), and Takesaki (2002, 2003). The appendices on logic, category theory, and topos theory, on the other hand, are far from exhaustive (though self-contained): they provide a shortcut to the necessary parts of e.g. Johnstone (1987), Mac Lane (1998), and Mac Lane & Moerdijk (1992), or, alternatively, of Bell & Machover (1977) and Bell (1988). Though primarily meant to support the main body of the book, these appendices may also be of some interest by themselves, especially to philosophers, but even to mathematicians.

As a "Quick Start Guide" for readers in a hurry, we now summarize the main definitions in the theory of operator algebras. A *C\*-algebra* is an associative algebra *A* (over ℂ) equipped with an involution (i.e., a real-linear map *a* ↦ *a*<sup>∗</sup> such that

$$a^{**} = a, \quad (ab)^{*} = b^{*}a^{*}, \quad (\lambda a)^{*} = \overline{\lambda}a^{*},$$

for all *a*, *b* ∈ *A* and λ ∈ ℂ), as well as a norm in which *A* is complete (i.e., a Banach space), such that algebra, involution, and norm are related by the axioms

$$\begin{aligned} \|ab\| &\le \|a\| \, \|b\|; \\ \|a^{*}a\| &= \|a\|^2. \end{aligned}$$

The two main classes of C\*-algebras are:

• The space *C*<sub>0</sub>(*X*) of all continuous functions *f* : *X* → ℂ that vanish at infinity (i.e., for any ε > 0 the set {*x* ∈ *X* | |*f*(*x*)| ≥ ε} is compact), where *X* is some locally compact Hausdorff space, with pointwise addition and multiplication, involution

$$f^{*}(x) = \overline{f(x)},$$

and a norm

$$\|f\|_{\infty} = \sup_{x \in X} \{|f(x)|\}.$$

It is of fundamental importance for physics and mathematics that *C*<sub>0</sub>(*X*) is *commutative*. Conversely, Gelfand & Naimark (1943) proved that every commutative C\*-algebra *A* is isomorphic to *C*<sub>0</sub>(*X*) for some locally compact Hausdorff space *X*, which is determined by *A* up to homeomorphism (*X* is called the *Gelfand spectrum* of *A*). Note that *C*<sub>0</sub>(*X*) has a unit (i.e. the function 1<sub>*X*</sub> that is equal to 1 for any *x*) iff *X* is compact.

• Norm-closed subalgebras *A* of the space *B*(*H*) of all bounded operators on some Hilbert space *H* for which *a*<sup>∗</sup> ∈ *A* iff *a* ∈ *A*; this includes the case *A* = *B*(*H*). Here one uses the standard operator norm

$$||a|| = \sup\{||a\Psi||, \Psi \in H, ||\Psi|| = 1\},$$

the algebraic operations are the natural ones, and the involution is the adjoint. If dim(*H*) > 1, *B*(*H*) is a *non-commutative* C\*-algebra. An important special case is the C\*-algebra *B*<sub>0</sub>(*H*) of all *compact* operators on *H*, which has no unit whenever *H* is infinite-dimensional (whereas *B*(*H*) is always unital). In their fundamental paper, Gelfand & Naimark (1943) also proved that every C\*-algebra is isomorphic to some *A* ⊂ *B*(*H*), for some Hilbert space *H*.

These classes are related as follows: in the commutative case *A* = *C*0(*X*), take

$$H = L^2(X, \mu),$$

where the support of the measure μ is *X*, on which *C*<sub>0</sub>(*X*) acts by multiplication operators, that is, *m*<sub>*f*</sub>ψ = *f*ψ, where *f* ∈ *C*<sub>0</sub>(*X*) and ψ ∈ *L*<sup>2</sup>(*X*,μ).
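For a finite set *X*, where every function is continuous and trivially vanishes at infinity, the C\*-axioms for functions under pointwise operations can be checked directly. The following is a minimal sketch, assuming *X* = {0,1,2} and functions stored as Python lists of complex values; the helper names `mul`, `star`, and `norm` are illustrative choices, not notation from the text.

```python
import math

# Functions f : X -> C on the finite set X = {0, 1, 2}, stored as lists of values.
def mul(f, g):
    return [a * b for a, b in zip(f, g)]      # pointwise product

def star(f):
    return [a.conjugate() for a in f]         # involution: pointwise complex conjugation

def norm(f):
    return max(abs(a) for a in f)             # sup-norm

f = [3 + 4j, 1j, -2 + 0j]
g = [1 + 1j, 2 + 0j, 0.5j]

# The two norm axioms and commutativity, checked on this sample pair.
submultiplicative = norm(mul(f, g)) <= norm(f) * norm(g) + 1e-12
c_star_identity = math.isclose(norm(mul(star(f), f)), norm(f) ** 2)
commutative = mul(f, g) == mul(g, f)
```

This only tests the axioms on one pair of functions; it illustrates the definitions rather than proving anything.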

As already noted, C\*-algebras were introduced by Gelfand & Naimark (1943), generalizing the rings of operators studied by von Neumann during 1930–1949, partly in collaboration with Murray (von Neumann, 1930, 1931, 1938, 1940, 1949; Murray & von Neumann, 1936, 1937, 1943). These rings are now called *von Neumann algebras*, and arise as the special case where a C\*-algebra *A* ⊂ *B*(*H*) satisfies

$$A = A'',$$

in which for any subset *S* ⊂ *B*(*H*) the *commutant* of *S* is defined by

$$S' = \{ a \in B(H) \mid ab = ba \,\forall b \in S \},$$

in terms of which the *bicommutant* of *S* is given by *S*′′ = (*S*′)′. Equivalently, a C\*-algebra *M* is a von Neumann algebra iff it is the dual of some Banach space *M*<sub>∗</sub> (which is unique, and contains the so-called *normal states* on *M*).
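In finite dimension the commutant is the solution space of a linear system, so the condition *A* = *A*′′ can be explored by hand or by machine. The sketch below is an illustration under simplifying assumptions: matrices are *n* × *n* with rational entries, and the functions `nullspace` and `commutant` are hypothetical helpers written for this example. It compares the dimensions of *S*′ and *S*′′ for two small sets *S* of 2 × 2 matrices.

```python
from fractions import Fraction

def nullspace(rows, ncols):
    """Basis of {v : M v = 0} via Gauss-Jordan elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in rows]
    pivots, r = [], 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if piv is None:
            continue
        m[r], m[piv] = m[piv], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for fc in (c for c in range(ncols) if c not in pivots):
        v = [Fraction(0)] * ncols
        v[fc] = Fraction(1)
        for row, pc in zip(m, pivots):
            v[pc] = -row[fc]
        basis.append(v)
    return basis

def commutant(S, n):
    """Basis (as n x n matrices) of all a with ab = ba for every b in S."""
    rows = []
    for b in S:
        for i in range(n):
            for j in range(n):
                row = [0] * (n * n)
                for k in range(n):
                    row[i * n + k] += b[k][j]   # (ab)_ij = sum_k a_ik b_kj
                    row[k * n + j] -= b[i][k]   # (ba)_ij = sum_k b_ik a_kj
                rows.append(row)
    return [[v[p * n:(p + 1) * n] for p in range(n)] for v in nullspace(rows, n * n)]

diag = [[[1, 0], [0, 2]]]                         # one diagonal matrix with distinct entries
pauli = [[[0, 1], [1, 0]], [[1, 0], [0, -1]]]     # sigma_x and sigma_z

dims_diag = (len(commutant(diag, 2)), len(commutant(commutant(diag, 2), 2)))
dims_pauli = (len(commutant(pauli, 2)), len(commutant(commutant(pauli, 2), 2)))
```

In the first case both *S*′ and *S*′′ are the diagonal matrices (dimension 2); in the second, *S*′ consists of the scalar multiples of the unit, so that *S*′′ is the full matrix algebra (dimension 4).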

Generalizing von Neumann's concept of a state on *B*(*H*), a *state* on a C\*-algebra *A* (as first defined by Segal in 1947) is a linear map

$$\omega: A \to \mathbb{C}$$

that is *positive* in that

$$\omega(a^\*a) \ge 0$$

for each *a* ∈ *A*, and *normalized* in that, noting that positivity implies boundedness,

$$\|\omega\| = 1,$$

where ‖·‖ is the usual norm on the Banach dual *A*<sup>∗</sup>. If *A* has a unit 1<sub>*A*</sub>, then in the presence of positivity, the above normalization condition is equivalent to

$$
\omega(1\_A) = 1.
$$

The Riesz–Radon representation theorem in measure theory gives a bijective correspondence between states ω on *A* = *C*0(*X*) and probability measures μ on *X*, viz.

$$
\omega(f) = \int\_X d\mu \, f,
$$

for any *f* ∈ *C*0(*X*). At the other end of the operator-algebraic world, if *A* = *B*(*H*), then any density operator ρ on *H* gives a state ω on *B*(*H*) by

$$
\omega(a) = \mathrm{Tr}\left(\rho a\right),
$$

but if *H* is infinite-dimensional there are other states, which cannot be normal. Such "singular" states are the C\*-algebraic analogues of improper eigenstates for eigenvalues in the continuous spectrum of some self-adjoint operator (think of position or momentum), and hence they make perfect sense physically. Singular states play an important role also mathematically, especially in the Kadison–Singer Conjecture.
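In the finite-dimensional case *H* = C<sup>2</sup>, the two defining properties of a state can be verified for the functional given by a density operator. The following sketch assumes a particular 2 × 2 density matrix `rho` and a sample matrix `a`, both chosen arbitrarily for illustration; the matrix helpers are hypothetical and written out to keep the example self-contained.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

# An assumed density matrix: symmetric, trace 1, determinant 1/8 > 0, hence positive.
rho = [[0.75, 0.25], [0.25, 0.25]]

def omega(a):
    return trace(matmul(rho, a))      # the state a -> Tr(rho a)

identity = [[1, 0], [0, 1]]
a = [[1 + 2j, 0.5], [-1j, 3]]

normalization = omega(identity)                 # should equal 1
positivity = omega(matmul(adjoint(a), a))       # should be real and non-negative
```

Here `positivity` evaluates the state on *a*<sup>∗</sup>*a* for a single sample *a*; positivity in the sense of the definition requires this for every *a*.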

Let me close this Introduction with a small personal note on the way this book came into being. Of the three disciplines relevant to the foundations of physics, namely mathematics, physics, and philosophy, my expertise has always been located within the first two, more specifically in mathematical physics. Nonetheless, my interest in the foundations of physics was triggered already at school, notably by books like *The Dancing Wu-Li Masters* by Gary Zukav, *The Tao of Physics* by Fritjof Capra (both of which may appear suspicious in hindsight), and especially by Werner Heisenberg's fascinating (though historically unreliable) autobiography *Physics and Beyond* (called *Der Teil und das Ganze* in German). The second autobiography that made a huge impression on me at the time was Bertrand Russell's, which in particular made me want to go to Cambridge and become a so-called Apostle (i.e. a member of an elitist secret conversation society that once included such illustrious members as Moore, Keynes, Hardy, and Russell himself); the first dream was eventually realized (see below), about the second I have to remain silent.

My interest in foundations was reinforced by two books on general relativity which I read as a first-year physics student, namely *Raum* · *Zeit* · *Materie* by Weyl (1918) and *The Mathematical Theory of Relativity* by Eddington (1923). Although these were beyond my grasp at the time, they were clearly written in the spirit of Newton's *Principia*, in that they were primarily treatises in natural philosophy, for which mathematical physics just provided the technical underpinning. Nonetheless, despite an unforgettable seminar by Jan Hilgevoord on the Heisenberg uncertainty relations in 1984, reporting on his recent joint work with Jos Uffink, foundations remained dormant during my undergraduate and PhD years (1981–1989).

As a postdoc in Cambridge from 1989 onwards, I initially attended all seminars in any subject related to mathematics and/or physics I found remotely interesting, including the so-called *Sigma Club*, which at the time was organized by Michael Redhead. Michael was surrounded by a group of people I began to increasingly like, although I was and still am worried by their deification of John Bell (one speaker even asked his audience to stand whilst he was reading a passage from *Speakable and Unspeakable in Quantum Mechanics*). In any case, I was very kindly invited to speak at the Sigma Club on my recent paper on superselection rules and the measurement problem (whose approach I now eschew, since it violates Earman's Principle, see above as well as Chapter 11 below), followed by a private dinner in the posh Riverside Restaurant with Michael (who asked my opinion about David Lewis, whom I unfortunately had never heard of). Indeed, the generosity of inviting an absolute beginner in the philosophy of physics to speak in such a prestigious seminar endeared me even further to both the subject and the community.

My main business remained mathematical physics, but, reinforcing the earlier spark I had got from reading Weyl and Eddington (and later also from von Neumann as well as Newton), two people (unfortunately no longer with us) made it clear to me that the goal of this discipline may include not only mathematics and physics, but also foundations, i.e., natural philosophy. These were Rob Clifton, who was a PhD student of Redhead and Butterfield, and Rudolf Haag, in whose group I had the honour to work during my year at Hamburg (1993-1994) as an Alexander von Humboldt Fellow (this was Haag's last active year at the university, cf. Haag, 2010).


My first book in 1998, which I wrote during my last two years at Cambridge, when the prospect of having to leave Academia and hence the urge to leave a permanent record loomed large, did not yet reflect this attitude. But my lengthy article on the classical-quantum interface in the *Handbook of the Philosophy of Physics* edited by Butterfield and Earman already did, and so does the present book.

There is an inherent danger in a mathematical physics approach to foundations:

'I'm guided by the beauty of our weapons' (Leonard Cohen)

Our mathematical weapons, that is; this book is predicated on the idea that operator algebras provide the right language for quantum theory. If they don't—for example, if path integrals are really its essence, as researchers especially in quantum gravity seem to believe, and there turns out to be a difference between the two toolkits—the mathematical underpinning of Bohrification would fall. Since our conceptual program is closely linked to this mathematical language, it would presumably collapse, too. Even if operator algebras stand, once some noncommutative alien gets direct access to the quantum world in defiance of Bohr's doctrine of classical concepts, the conceptual framework behind Bohrification (and with it much of this book) would tremble. So far there has been no evidence for any of this, and as long as physics remains an empirical science I offer this book to the reader both as an introduction to modern mathematical methods in physics (in so far as these are relevant to foundational questions), and also as an alternative to various interpretations of quantum mechanics that seem to philosophize the physics of the problems away.

#### *Notes*

Each chapter is followed by a section called *Notes*, in which background and credits for the results in the given chapter are given. Such information is therefore absent in the main text (except when theorems, typically famous ones, are named after their discoverers, like Gleason, Wigner, and the like). This Introduction, which anomalously contains some references, is an exception, but we still provide some notes to it.

Since this book is not an exegesis of Bohr but rather an exposition of some mathematical ideas partly inspired by his work (with no claim to retroactive endorsement by Bohr or his followers), we hardly relied on the secondary literature on his philosophy, except, as already mentioned, on Scheibe (1973) and Beller (1999), both of which are pretty critical of Bohr. For a more balanced picture, one might consult monographs like Folse (1985), Murdoch (1987), McEvoy (2001), Brock (2003), the collection of essays edited by Faye & Folse (2017), as well as Dieks (2016a) and Zinkernagel (2016). Secondary literature on Heisenberg's philosophy of physics is scarce, but includes Camilleri (2009b). Though irrelevant to the present book, one cannot resist mentioning Landsman (2002) on Heisenberg's controversial political war record, from which he tried to escape by writing the intriguing essay *Ordnung der Wirklichkeit*, published 50 years later as Heisenberg (1994).

*A propos*, notes on von Neumann and operator algebras follow §C.25.

Strictly speaking, no previous knowledge of quantum mechanics is needed to understand this book, but it is hard to imagine readers of this book without such a background. Beyond standard undergraduate physics courses, for mathematically serious introductions to quantum mechanics—further to von Neumann (1932), which founded the subject—we recommend Bongaarts (2015), Gustafson & Sigal (2003), Hall (2013), Takhtajan (2008), and Thirring (2002). No previous acquaintance with the philosophy of quantum theory is required either, but once again it might be expected that typical readers of the present book have at least some awareness of this field. In fact, the author himself has only read a few such books from cover to cover, including Heisenberg (1958), Jammer (1966, 1974), Scheibe (1973), Earman (1986), van Fraassen (1991), Bub (1997), Beller (1999), and Wallace (2012).

From these books, apart from its obvious source Heisenberg (1958), Bohrification (at least in its 'exact' variant) is conceptually akin to the program of Bub (1997), which was based on Clifton & Bub (1996); the past tense seems appropriate here, since Bub has meanwhile abandoned this program in favour of foundations based on information theory (Bub, 2004). Anyway, given some preferred observable *a* ∈ *B*(*H*)<sub>sa</sub> and pure state *e* ∈ P<sub>1</sub>(*H*) (i.e., a one-dimensional projection on *H*), the Bub–Clifton approach looks for the largest C\*-subalgebra *A* of *B*(*H*) on which one may define something like a hidden variable compatible with the Born probabilities emanating from the *given* state *e* (the emphasis on some given *e* comes from the modal interpretation(s) of quantum mechanics). For generic states *e* and observables *a*, this typically allows *A* to be noncommutative, which blasts the conceptual framework of exact Bohrification. Requiring compatibility with quantum mechanics for *arbitrary* states *e*, on the other hand, would force *A* to be commutative. All this relates to the Kochen–Specker Theorem; see the Notes to §6.1 for further details.

Finally, though remote from Wallace (2012) in our attempt to solve (or, in the light of the first quotation below, one should say "address") the measurement problem through physics rather than philosophy, even with this polar opposite author we share the following attitude towards the foundations of quantum mechanics:

'The basic thesis of this book is that there is no quantum measurement problem (...) What I mean is that there is actually no conflict between the dynamics and ontology of (unitary) quantum theory and our empirical observations. (. . . ) [I do not] wish to be read as offering yet one more "interpretation of quantum mechanics".

This book takes an extremely conservative approach to quantum mechanics (...) quantum mechanics can be taken literally (...) there is just unitary quantum mechanics.

The way in which cats or tables exist is as structures within the underlying microphysics (. . . ) [they are] emergent objects, higher-order entities.' (Wallace, 2012, pp. 1, 2, 13, 38, 40)

But although it may indeed apply to the town of Oxford, one might take issue with:

'It is simply false that there are alternative explanatory theories to Everett-interpreted quantum mechanics which can reproduce the predictions of quantum theory (. . . ) The Everett interpretation is the only game in town.' (Wallace, 2012, p. 43)

## Part I *C*0(*X*) and *B*(*H*)

## Chapter 1 Classical physics on a finite phase space

Throughout this chapter, *X* is a *finite set*, playing the role of the configuration space of some physical system, or, equivalently (as we shall see), of its pure state space (in the continuous case, *X* will be the phase space rather than the configuration space). One should not frown upon finite sets: for example, the configuration space of *N* bits is given by *X* = <u>2</u><sup><u>*N*</u></sup>, where for arbitrary sets *Y* and *Z*, the set *Y*<sup>*Z*</sup> consists of all functions *x* : *Z* → *Y*, and for any *N* ∈ N we write <u>*N*</u> = {1,2,...,*N*} (although, following the computer scientists, <u>2</u> usually denotes {0,1}). More generally, if one has a lattice Λ ⊂ Z<sup>*d*</sup> and each site is the home of some classical object (say a "spin") that may assume *N* different configurations, then *X* = <u>*N*</u><sup>Λ</sup>, in that *x* : Λ → <u>*N*</u> describes the configuration in which the "spin" at site n ∈ Λ takes the value *x*(n) ∈ <u>*N*</u>.

Although the setting is *a priori* deterministic, in that (knowing) some point *x* ∈ *X* in its guise as a pure state at least in principle determines everything (there is to say), the mathematical language will be probabilistic. Even within the confines of classicality this allows one to do statistical physics, and as such it also sheds light on e.g. the special status of *x* as an extreme probability measure (see below). Furthermore, the use of this language may be motivated by the goal of describing classical and quantum mechanics as analogously as possible at this elementary level.

The following concepts play a central role in this chapter. Recall that the power set P(*X*) of *X* is the set of all subsets of *X* (for finite *X*, these are all measurable).

Definition 1.1. *1. An* event *is a subset U* ⊆ *X, i.e., U* ∈ P(*X*)*.*


$$P(U|V) = \frac{P(U \cap V)}{P(V)}.\tag{1.1}$$


K. Landsman, *Foundations of Quantum Theory*,

Fundamental Theories of Physics 188, DOI 10.1007/978-3-319-51777-3\_1

#### 1.1 Basic constructions of probability theory

Probability distributions *p* and probability measures *P* determine each other by

$$P(U) = \sum\_{\mathbf{x} \in U} p(\mathbf{x});\tag{1.2}$$

$$p(\mathbf{x}) = P(\{\mathbf{x}\}),\tag{1.3}$$

but this is peculiar to finite sets (in general, probability *measures* will be primary). Two special classes of probability measures and of random variables stand out:


The single most important construction in probability theory is as follows.

Theorem 1.2. *A probability distribution p on X and a random variable f* : *X* → R *jointly yield a probability distribution p*<sub>*f*</sub> *on the spectrum* σ(*f*) *by means of*

$$p\_f(\lambda) = \sum\_{\mathbf{x} \in X \mid f(\mathbf{x}) = \lambda} p(\mathbf{x}). \tag{1.4}$$

*In terms of the corresponding probability measure P on X, one has*

$$p\_f(\lambda) = P(f = \lambda),\tag{1.5}$$

*where f* = λ *denotes the event* {*x* ∈ *X* | *f*(*x*) = λ} *in X. Similarly, the probability measure P*<sub>*f*</sub> *on* σ(*f*) *corresponding to the probability distribution p*<sub>*f*</sub> *is given by*

$$P\_f(\Delta) = P(f \in \Delta),\tag{1.6}$$

*where* Δ ⊆ σ(*f*) *and f* ∈ Δ *denotes the event* {*x* ∈ *X* | *f*(*x*) ∈ Δ} *in X.*

The proof is trivial. Instead of *f* = λ, the notation *f*<sup>−1</sup>({λ}) might be used, and similarly, *f*<sup>−1</sup>(Δ) is the same as *f* ∈ Δ. If λ ∈ σ(*f*) is non-degenerate in that there is exactly one *x*<sub>λ</sub> ∈ *X* such that *f*(*x*<sub>λ</sub>) = λ, then one simply has *P*(*f* = λ) = *p*(*x*<sub>λ</sub>).
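The construction of Theorem 1.2 is easily made concrete. The sketch below assumes a four-point set *X* with an arbitrarily chosen distribution `p` and random variable `f` (both hypothetical), and computes the induced distribution on σ(*f*) as in (1.4), together with the measure *P* of (1.2).

```python
from fractions import Fraction
from collections import defaultdict

# Assumed toy data: a four-point set, a distribution p and a random variable f on it.
X = ["a", "b", "c", "d"]
p = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 8), "d": Fraction(1, 8)}
f = {"a": 0, "b": 1, "c": 1, "d": 5}

def P(U):
    """Probability measure determined by p, cf. (1.2)."""
    return sum((p[x] for x in U), Fraction(0))

def pushforward(p, f):
    """Distribution p_f on the spectrum of f, cf. (1.4)."""
    p_f = defaultdict(Fraction)
    for x, px in p.items():
        p_f[f[x]] += px
    return dict(p_f)

p_f = pushforward(p, f)
event = {x for x in X if f[x] == 1}   # the event f = 1
```

The degenerate value λ = 1 collects the weights of both points `b` and `c`, whereas the non-degenerate values 0 and 5 simply inherit the weight of their unique preimage.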

For example, combining both our special cases *P* = *P*<sub>*y*</sub> and *f* = 1<sub>*U*</sub> above yields

$$P\_\mathbf{y}(1\_U=1) = 1 \text{ and } P\_\mathbf{y}(1\_U=0) = 0 \text{ if } \mathbf{y} \in U;\tag{1.7}$$

$$P\_\mathbf{y}(1\_U=1) = 0 \text{ and } P\_\mathbf{y}(1\_U=0) = 1 \text{ if } \mathbf{y} \notin U. \tag{1.8}$$

Given some probability measure *P*, the *expectation value* *E*<sub>*P*</sub>(*f*) and the *variance* Δ<sub>*P*</sub>(*f*) of a random variable *f* with respect to *P* are defined by, respectively,

$$E\_P(f) = \sum\_{\mathbf{x} \in X} f(\mathbf{x}) p(\mathbf{x});\tag{1.9}$$

$$
\Delta\_P(f) = E\_P(f^2) - E\_P(f)^2. \tag{1.10}
$$

A simple calculation shows that *EP* may be written directly in terms of *P* itself as

$$E\_P(f) = \sum\_{\lambda \in \sigma(f)} P(f = \lambda) \cdot \lambda. \tag{1.11}$$

Note that Δ<sub>*P*</sub>(*f*) ≥ 0. The special role of the point measures *P*<sub>*y*</sub> may now be clarified:

Proposition 1.3. *A probability measure P takes the form P* = *P*<sub>*y*</sub> *for some y* ∈ *X iff* Δ<sub>*P*</sub>(*f*) = 0 *for all random variables f* : *X* → R*.*

*Proof.* For "⇒", we compute *E*<sub>*P*<sub>*y*</sub></sub>(*f*) = *f*(*y*), and hence *E*<sub>*P*<sub>*y*</sub></sub>(*f*<sup>2</sup>) = *f*(*y*)<sup>2</sup>, so that Δ<sub>*P*<sub>*y*</sub></sub>(*f*) = 0. In the opposite direction, take *f* = *p*<sub>*y*</sub>, so that *f*<sup>2</sup> = *f* and hence Δ<sub>*P*</sub>(*f*) = *p*(*y*) − *p*(*y*)<sup>2</sup>. The assumption Δ<sub>*P*</sub>(*f*) = 0 for each *f* implies that either *p*(*y*) = 0 or *p*(*y*) = 1 for each *y* ∈ *X*. Definition 1.1.2 then implies that *p*(*y*) = 1 for exactly one *y* ∈ *X*. □
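The dichotomy in Proposition 1.3 may be illustrated on a three-point set. In the sketch below, `point`, `delta`, and `uniform` are names introduced only for this example; the random variables tested are the indicator functions of single points, which are the ones used in the proof.

```python
from fractions import Fraction

X = [0, 1, 2]

def expectation(p, f):
    return sum(f(x) * p[x] for x in X)            # (1.9)

def variance(p, f):
    return expectation(p, lambda x: f(x) ** 2) - expectation(p, f) ** 2   # (1.10)

def point(y):
    """Point distribution concentrated at y."""
    return {x: Fraction(int(x == y)) for x in X}

def delta(y):
    """Indicator function of the single point y."""
    return lambda x: int(x == y)

uniform = {x: Fraction(1, 3) for x in X}

variances_point = [variance(point(1), delta(y)) for y in X]
variances_uniform = [variance(uniform, delta(y)) for y in X]
```

For the point distribution all variances vanish, whereas for the uniform distribution each indicator function has variance 1/3 − 1/9 = 2/9.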

More generally, a collection *f*<sub>1</sub>,...,*f*<sub>*n*</sub> of *n* random variables and a (single) probability distribution *p* on *X* jointly define a probability distribution *p*<sub>*f*<sub>1</sub>,...,*f*<sub>*n*</sub></sub> on the product σ(*f*<sub>1</sub>)×···×σ(*f*<sub>*n*</sub>) of the individual spectra by

$$p\_{f\_1\dots f\_n}(\lambda\_1,\dots,\lambda\_n) = \sum\_{\mathbf{x}\in X|f\_1(\mathbf{x})=\lambda\_1,\dots,f\_n(\mathbf{x})=\lambda\_n} p(\mathbf{x}).\tag{1.12}$$

Once again, this may be rewritten as

$$p\_{f\_1\ldots f\_n}(\lambda\_1,\ldots,\lambda\_n) = P(f\_1 = \lambda\_1,\ldots,f\_n = \lambda\_n),\tag{1.13}$$

where the argument of *P* denotes the intersection ∩<sup>*n*</sup><sub>*k*=1</sub>(*f*<sub>*k*</sub> = λ<sub>*k*</sub>), i.e.,

$$(f\_1 = \lambda\_1, \dots, f\_n = \lambda\_n) = \{ x \in X \mid f\_1(x) = \lambda\_1, \dots, f\_n(x) = \lambda\_n \}. \tag{1.14}$$

Simple calculations then yield results for the so-called *marginal distributions*, like

$$\sum\_{\lambda\_{l+1}\in\sigma(f\_{l+1}),\ldots,\lambda\_n\in\sigma(f\_n)} P(f\_1=\lambda\_1,\ldots,f\_n=\lambda\_n) = P(f\_1=\lambda\_1,\ldots,f\_l=\lambda\_l),\tag{1.15}$$

where 1 ≤ *l* < *n*. The above constructions also apply to the corresponding conditional probabilities: given *m* additional random variables *a*1,...,*am*, one has

$$\sum\_{\lambda\_{l+1}\in\sigma(f\_{l+1}),\ldots,\lambda\_n\in\sigma(f\_n)} P(f\_1=\lambda\_1,\ldots,f\_n=\lambda\_n|a\_1=\alpha\_1,\ldots,a\_m=\alpha\_m) \tag{1.16}$$

$$= P(f\_1 = \lambda\_1, \dots, f\_l = \lambda\_l | a\_1 = \alpha\_1, \dots, a\_m = \alpha\_m). \tag{1.17}$$
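The marginalization formula (1.15) can be checked on a small example. The sketch below assumes a uniform distribution on a six-point set and two random variables given by residues modulo 2 and 3; these choices, like the helper `joint`, are hypothetical and serve only to illustrate the summation over the spectrum of the second variable.

```python
from fractions import Fraction
from collections import defaultdict

X = range(6)
p = {x: Fraction(1, 6) for x in X}      # assumed: uniform distribution on six points

def f1(x):
    return x % 2

def f2(x):
    return x % 3

def joint(p, fs):
    """Joint distribution of the random variables fs, cf. (1.12)."""
    out = defaultdict(Fraction)
    for x, px in p.items():
        out[tuple(f(x) for f in fs)] += px
    return dict(out)

p12 = joint(p, [f1, f2])
p1 = joint(p, [f1])

# Sum the joint distribution over the values of f2, cf. (1.15) with n = 2 and l = 1.
marginal = defaultdict(Fraction)
for (l1, l2), value in p12.items():
    marginal[(l1,)] += value
marginal = dict(marginal)
```

Pairs of values that never occur jointly are simply absent from `p12`, which amounts to assigning them probability zero.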

#### 1.2 Classical observables and states

Given a finite set *X*, we may form the set *C*(*X*) of all complex-valued functions on *X*, enriched with the structure of a complex vector space under pointwise operations:

$$(\lambda \cdot f)(\mathbf{x}) = \lambda f(\mathbf{x}) \ (\lambda \in \mathbb{C});\tag{1.18}$$

$$(f+g)(\mathbf{x}) = f(\mathbf{x}) + \mathbf{g}(\mathbf{x}).\tag{1.19}$$

We use the notation *C*(*X*) with some foresight, anticipating the case where *X* is no longer finite, but in any case, since for the moment it is, every function is continuous. Moreover, the vector space structure on *C*(*X*) may be extended to that of a commutative algebra (where, by convention, all our algebras are associative and are defined over the complex scalars) by defining multiplication pointwisely, too:

$$(f \cdot \mathbf{g})(\mathbf{x}) = f(\mathbf{x})\mathbf{g}(\mathbf{x}).\tag{1.20}$$

Note that this algebra has a unit 1*<sup>X</sup>* , i.e., the function identically equal to 1.

For finite *X*, this structure suffices for *X* to be recovered from *C*(*X*), as follows.

Definition 1.4. *The* Gelfand spectrum Σ(*A*) *of a (complex) algebra A is the set of all nonzero linear maps* ω : *A* → C *that satisfy* ω(*f g*) = ω(*f*)ω(*g*)*.*

These are, of course, precisely the nonzero algebra homomorphisms from *A* to C.

Proposition 1.5. *The Gelfand spectrum* Σ(*C*(*X*)) *is isomorphic (as a set) to X.*

*Proof.* Each *x* ∈ *X* defines a map ω<sub>*x*</sub> : *C*(*X*) → C by ω<sub>*x*</sub>(*f*) = *f*(*x*). One obviously has ω<sub>*x*</sub> ∈ Σ(*C*(*X*)), so we have a map *X* → Σ(*C*(*X*)), *x* ↦ ω<sub>*x*</sub>. We show that this map is a bijection. Injectivity is easy: if ω<sub>*x*</sub> = ω<sub>*y*</sub>, then *f*(*x*) = *f*(*y*) for each *f* ∈ *C*(*X*), so taking *f* = δ<sub>*z*</sub> for each *z* ∈ *X* gives *x* = *y* (here δ<sub>*z*</sub>(*x*) = δ<sub>*xz*</sub>). To prove surjectivity, we note that since *C*(*X*) is finite-dimensional as a vector space, with basis (δ<sub>*y*</sub>)<sub>*y*∈*X*</sub>, each linear functional ω : *C*(*X*) → C takes the form

$$\omega(f) = \sum\_{x} \mu(x) f(x),\tag{1.21}$$

for some function μ : *X* → C. For ω ∈ Σ(*C*(*X*)), find some *z* ∈ *X* for which μ(*z*) ≠ 0 (this has to exist, as ω ≠ 0). For arbitrary *w* ∈ *X*, imposing ω(δ<sub>*w*</sub>δ<sub>*z*</sub>) = ω(δ<sub>*w*</sub>)ω(δ<sub>*z*</sub>) enforces μ = δ<sub>*z*</sub> (which also shows that *z* is unique), and hence ω = ω<sub>*z*</sub>. □
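The surjectivity argument can be replayed by brute force on a three-point set. Multiplicativity on the basis functions forces μ(*z*)<sup>2</sup> = μ(*z*), so it suffices to search over weights μ with values in {0,1}; the sketch below does exactly that. The function names are hypothetical, and the restriction to {0,1}-valued μ is a simplification justified by the preceding remark.

```python
from itertools import product

X = [0, 1, 2]

def delta(z):
    return [int(x == z) for x in X]               # basis function delta_z as a list of values

def omega(mu, f):
    return sum(m * v for m, v in zip(mu, f))      # linear functional as in (1.21)

def is_multiplicative(mu):
    # By bilinearity it suffices to test products of basis functions.
    for w in X:
        for z in X:
            product_fn = [a * b for a, b in zip(delta(w), delta(z))]
            if omega(mu, product_fn) != omega(mu, delta(w)) * omega(mu, delta(z)):
                return False
    return True

# Nonzero {0,1}-valued weights that define multiplicative functionals.
characters = [mu for mu in product([0, 1], repeat=len(X))
              if any(mu) and is_multiplicative(mu)]
```

The search returns exactly the three weights δ<sub>*z*</sub>, one for each point *z* of *X*, in agreement with Proposition 1.5.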

The physically relevant set *R*(*X*) of all real-valued functions on *X* is obviously a real vector space inside *C*(*X*). To recover it algebraically, we equip *C*(*X*) with an *involution*, which on an arbitrary (not necessarily commutative) algebra *A* is defined as an anti-linear anti-homomorphism that squares to id<sub>*A*</sub>, i.e., a real-linear map ∗ : *A* → *A* (written *a* ↦ *a*<sup>∗</sup>) that satisfies (λ*a*)<sup>∗</sup> = λ̄*a*<sup>∗</sup>, (*ab*)<sup>∗</sup> = *b*<sup>∗</sup>*a*<sup>∗</sup>, and *a*<sup>∗∗</sup> = *a*. In our case *A* = *C*(*X*), which is commutative, the second property simply becomes (*f g*)<sup>∗</sup> = *f*<sup>∗</sup>*g*<sup>∗</sup>. In any case, we define this involution by pointwise complex conjugation, i.e.,

$$f^\*(\mathbf{x}) = \overline{f(\mathbf{x})}.\tag{1.22}$$

We evidently recover the real-valued functions in the involutive algebra *C*(*X*) as

$$R(X) \equiv \mathcal{C}(X)\_{\text{sa}} = \{ f \in \mathcal{C}(X) \mid f^\* = f \}. \tag{1.23}$$

Finally, although we do not need this yet, we note that *C*(*X*) has a natural *norm*

$$\|f\|\_{\infty} = \sup\_{x \in X} \{|f(x)|\}. \tag{1.24}$$

These structures turn *C*(*X*) into a *commutative C\*-algebra* (cf. Definition C.1).

Definition 1.6. *The* algebra of observables *of the physical system described by the phase space X is C*(*X*)*, seen as a (commutative)* C\*-algebra *in the above way.*

Thence elements of *C*(*X*) are called *observables* (a term that really should be applied only to its self-adjoint elements, i.e., those satisfying *f*<sup>∗</sup> = *f*).

We have thus equipped the *random variables* on *X* with enough structure to recover *X* itself, and now turn to the other side of the coin, viz. the *probability measures* on *X*. Here the relevant mathematical structure is that of a *compact convex set*, a concept we only need to define in the context of an ambient (real) vector space.

Definition 1.7. *A subset K of a (real or complex) vector space V is called* convex *if the straight line segment between any two points on K lies in K. Expressed formally, this means that whenever v*,*w* ∈ *K and t* ∈ (0,1)*, one has tv*+ (1−*t*)*w* ∈ *K.*

The following probabilistic reformulation of this notion is very useful.

Proposition 1.8. *A set K* ⊂ *V is convex iff for any k, given k probabilities* (*t*<sub>1</sub>,...,*t*<sub>*k*</sub>) *(i.e., t*<sub>*i*</sub> ≥ 0 *and* ∑<sub>*i*</sub> *t*<sub>*i*</sub> = 1*) and k points* (*v*<sub>1</sub>,...,*v*<sub>*k*</sub>) *in K, one has* ∑<sup>*k*</sup><sub>*i*=1</sub> *t*<sub>*i*</sub> · *v*<sub>*i*</sub> ∈ *K.*

*Proof.* Taking *k* = 2 recovers Definition 1.7 from its probabilistic version. Conversely, one uses induction on *k*, using the identity (assuming 0 < *t*<sub>*k*</sub> < 1):

$$t\_1 v\_1 + \dots + t\_k v\_k = (1 - t\_k) \left( \frac{t\_1}{1 - t\_k} v\_1 + \dots + \frac{t\_{k-1}}{1 - t\_k} v\_{k-1} \right) + t\_k v\_k. \qquad \Box$$

Any linear subspace of *V* is trivially convex, as is any translate thereof (i.e., any *affine* subspace of *V*). Another, much more important example is the *convex hull* co(*S*) of any subset *S* ⊂ *V*; noting that the intersection of any family of convex sets is again convex, co(*S*) may be defined as the intersection of all convex subsets of *V* that contain *S*, or, equivalently, as the smallest convex subset of *V* that contains *S* (whose existence is guaranteed by the previous remark). Proposition 1.8 then yields

$$\text{co}(S) = \left\{ \sum\_{i=1}^{k} t\_i \cdot v\_i \mid k \in \mathbb{N}, (v\_1, \dots, v\_k) \in S^k, t\_i \ge 0, \sum\_{i} t\_i = 1 \right\}. \tag{1.25}$$

In particular, if *S* = {*v*1,..., *vk*} is a finite set, then one simply has

$$\text{co}(\{v\_1, \dots, v\_k\}) = \left\{ \sum\_{i=1}^k t\_i \cdot v\_i \mid t\_i \ge 0, \sum\_i t\_i = 1 \right\}. \tag{1.26}$$

The convex hull of any finite set of points in R<sup>*n*+1</sup> is called a *convex polytope*. Such convex sets are closed and bounded (since none of the *t*<sub>*i*</sub> ≥ 0 can walk away too far without violating the condition ∑<sub>*i*</sub> *t*<sub>*i*</sub> = 1), and hence are compact. In particular,

$$\Delta\_n = \{ x \in \mathbb{R}^{n+1} \mid x\_i \ge 0, \sum\_{i} x\_i = 1 \} \tag{1.27}$$

is a convex polytope called a *simplex*. For example, Δ<sub>1</sub> is the line segment from (0,1) to (1,0) in R<sup>2</sup>. We would like to say that Δ<sub>1</sub> is "isomorphic" to the unit interval [0,1], so we define two convex sets *K*<sub>1</sub>, *K*<sub>2</sub> to be *isomorphic* (as such) if there is a bijection *f* : *K*<sub>1</sub> → *K*<sub>2</sub> that is *affine*, in that for *t* ∈ (0,1) and *v*<sub>1</sub>, *v*<sub>2</sub> ∈ *K*<sub>1</sub>, we have

$$f(t v\_1 + (1-t) v\_2) = t f(v\_1) + (1-t) f(v\_2). \tag{1.28}$$

Then the function *f* : Δ<sub>1</sub> → [0,1] given by *f*(λ,1−λ) = λ, where λ ∈ [0,1], will do. Similarly, Δ<sub>2</sub> ⊂ R<sup>3</sup> is isomorphic to any equilateral triangle in R<sup>2</sup> with sides of unit length, whereas Δ<sub>3</sub> is just the tetrahedron (which is one of the five Platonic solids).

There are many other convex polytopes (cf. §B.11), but simplices are of prime importance for us, since Δ<sub>*n*</sub> is isomorphic to the set Pr(*X*) of all probability distributions on a set *X* = {0,...,*n*} with *n*+1 points; the identification Pr(*X*) ∋ *p* ↔ *x* ∈ Δ<sub>*n*</sub> is given by *x*<sub>*i*</sub> = *p*(*i* − 1) for *i* = 1,...,*n*+1. In particular, we see that for any finite set *X*, Pr(*X*) is a compact convex set. This is also clear from Definitions 1.1 and 1.7 (and will even be true for general compact phase spaces *X*, cf. Corollary B.17 and §C.25).
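That the simplex is closed under convex sums, as in Proposition 1.8, is readily verified in exact arithmetic. The sketch below assumes three sample points of Δ<sub>2</sub> and a set of weights, all chosen arbitrarily; `in_simplex` and `mix` are illustrative helper names.

```python
from fractions import Fraction

def in_simplex(x):
    """Membership test for the simplex: non-negative entries summing to one."""
    return all(xi >= 0 for xi in x) and sum(x) == 1

def mix(weights, points):
    """Convex sum of the given points with the given probabilities."""
    assert in_simplex(weights)
    dim = len(points[0])
    return [sum(t * v[i] for t, v in zip(weights, points)) for i in range(dim)]

# Three assumed probability distributions on a three-point set, i.e. points of the 2-simplex.
p = [Fraction(1, 2), Fraction(1, 2), Fraction(0)]
q = [Fraction(0), Fraction(1, 3), Fraction(2, 3)]
r = [Fraction(1), Fraction(0), Fraction(0)]

weights = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
m = mix(weights, [p, q, r])
```

The mixture `m` again has non-negative entries summing to one, so that it lies in the simplex as well.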

Definition 1.9. *The* state space *of the physical system described by a (finite) space X is the set* Pr(*X*) *of all probability measures on X (or, equivalently, of all probability distributions on X ), seen as a* compact convex set*.*

Thus a probability measure (or distribution) on *X* is often called a *state* (of the physical system described by *X*). The operation of passing from states *P*,*Q* ∈ Pr(*X*) to a new state *tP* + (1−*t*)*Q* ∈ Pr(*X*), where *t* ∈ (0,1) as usual, or, more generally, from a (finite) family of states (*P*<sub>*i*</sub>) and a set (*t*<sub>*i*</sub>) of probabilities (i.e., *t*<sub>*i*</sub> ≥ 0 and ∑<sub>*i*</sub> *t*<sub>*i*</sub> = 1) to the convex sum ∑<sub>*i*</sub> *t*<sub>*i*</sub>*P*<sub>*i*</sub>, is called *mixing*.

It is possible to recover *X* from its associated state space Pr(*X*), as follows.

Definition 1.10. *The* (extreme) boundary ∂<sub>*e*</sub>*K of a convex set K consists of all points v* ∈ *K satisfying the following condition:*

$$\text{if } v = tw + (1 - t)x \text{ for certain } w, x \in K \text{ and } t \in (0, 1), \text{ then } v = w = x.$$

*Elements v* ∈ ∂<sub>*e*</sub>*K of the boundary are called* extreme points *of K.*

We will now compute the boundary of Pr(*X*). The result may be expressed by

$$
\partial\_{e} \Delta\_n = \{ \mathbf{e}\_1, \dots, \mathbf{e}\_{n+1} \}, \tag{1.29}
$$

where (e<sub>1</sub>,...,e<sub>*n*+1</sub>) is the standard basis of R<sup>*n*+1</sup> (i.e., e<sub>1</sub> = (1,0,...,0), etc.). However, we will give a direct probabilistic proof. We already noted the special probability measures *P*<sub>*x*</sub>, *x* ∈ *X*. The association *x* ↦ *P*<sub>*x*</sub> defines a map from *X* to Pr(*X*).

Proposition 1.11. *The set X is isomorphic to the boundary* ∂<sub>*e*</sub>Pr(*X*) *through x* ↦ *P*<sub>*x*</sub>*.*

*Proof.* It is convenient to work with probability distributions *p* rather than probability measures *P*. First, *x* ↦ *p*<sub>*x*</sub> is trivially injective from *X* to Pr(*X*): if *x* ≠ *y* then *p*<sub>*x*</sub>(*x*) = 1 whereas *p*<sub>*y*</sub>(*x*) = 0, so *p*<sub>*x*</sub> ≠ *p*<sub>*y*</sub>. Second, *p*<sub>*x*</sub> ∈ ∂<sub>*e*</sub>Pr(*X*). For suppose one has *p*<sub>*x*</sub> = *tp* + (1 − *t*)*q* for some *p*,*q* ∈ Pr(*X*) and *t* ∈ (0,1). Hence *p*<sub>*x*</sub>(*y*) = *tp*(*y*) + (1−*t*)*q*(*y*). Taking *y* ≠ *x* yields *p*(*y*) = *q*(*y*) = 0, so that *p* = *q* = *p*<sub>*x*</sub>. Consequently, *X* ⊆ ∂<sub>*e*</sub>Pr(*X*).

The converse inclusion is (contrapositively) equivalent to the property that for any *p* ≠ *p*<sub>*x*</sub> (for all *x*), there are *q* and *r*, *q* ≠ *r*, and *t* ∈ (0,1), with *p* = *tq* + (1−*t*)*r*. Indeed, if *p* ≠ *p*<sub>*x*</sub> for all *x*, there is some *x*<sub>0</sub> ∈ *X* with 0 < *p*(*x*<sub>0</sub>) < 1. Now define *q*, *r*, and *t* by *q*(*x*<sub>0</sub>) = 1 and *q*(*x*) = 0 for all *x* ≠ *x*<sub>0</sub>, *r*(*x*<sub>0</sub>) = 0 and *r*(*x*) = *p*(*x*)/(1 − *p*(*x*<sub>0</sub>)) for all *x* ≠ *x*<sub>0</sub>, and finally *t* = *p*(*x*<sub>0</sub>). Then *p* = *tq* + (1−*t*)*r* and *q* ≠ *r*. □
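The construction in the second half of the proof is explicit enough to be carried out mechanically. The sketch below assumes a sample distribution `p` on a three-point set that is not a point distribution; the function `split` is a hypothetical helper that follows the recipe for *q*, *r*, and *t* given in the proof.

```python
from fractions import Fraction

def split(p):
    """Write a non-point distribution p as t*q + (1-t)*r with q != r."""
    x0 = next(x for x, px in p.items() if 0 < px < 1)
    t = p[x0]
    q = {x: Fraction(int(x == x0)) for x in p}                    # point distribution at x0
    r = {x: Fraction(0) if x == x0 else px / (1 - t)              # p conditioned away from x0
         for x, px in p.items()}
    return t, q, r

p = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
t, q, r = split(p)
recombined = {x: t * q[x] + (1 - t) * r[x] for x in p}
```

Here `q` and `r` are distinct probability distributions whose mixture with weight `t` reproduces `p`, so that `p` is not an extreme point.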

The simplest example would be *X* = {0,1}, so that Pr(*X*) ≅ [0,1] by mapping the distribution *p* ∈ Pr(*X*) to *p*(1). Since one may directly verify that ∂<sub>*e*</sub>[0,1] = {0,1}, under the above isomorphism one therefore has ∂<sub>*e*</sub>Pr(*X*) ≅ {0,1}. Analogously, ∂<sub>*e*</sub>(0,1) = ∅, so that the boundary of a convex set may apparently be empty. Hence we see that one remarkable ingredient of Proposition 1.11 lies in the claim that the convex set Pr(*X*) actually *has* a (nonempty) boundary! This is no accident: by the Krein–Milman Theorem (cf. §B.10), this is true for any *compact* convex set (which is consistent with the counterexample just given). For example, in quantum mechanics we will encounter the case of *K* = *B*<sup>3</sup> (i.e. the closed unit ball in R<sup>3</sup>) as the state space of a qubit, whose (extreme) boundary is the two-sphere *S*<sup>2</sup>, cf. Proposition 2.9. Something similar is true in any dimension, but beware of surprises: if *K* = Δ<sub>2</sub> is an equilateral triangle in the plane, then its *extreme* boundary ∂<sub>*e*</sub>*K* consists of the *vertices* of *K* (whereas its *faces* form the *geometric* boundary of the triangle).

The general problem arises whether some point *v* ∈ *K* of a compact convex set *K* may be written as a convex sum (or, more generally, an integral) of extreme points of *K*, and if so, to what extent this *extremal decomposition*

$$v = \sum\_{i \in I} t\_i v\_i, \ t\_i \ge 0, \ \sum\_i t\_i = 1, \ v\_i \in \partial\_e K,\tag{1.30}$$

which for simplicity has been assumed to be a finite sum here, is unique. Without proof, we state a general result of convexity theory, called *Carathéodory's Theorem*:

Theorem 1.12. *If K is a nonempty compact convex subset of* ℝ<sup>*n*</sup>*, then* ∂<sub>*e*</sub>*K* ≠ ∅*, and each point of K is a convex sum of at most n* + 1 *points in* ∂<sub>*e*</sub>*K.*

If *K* = Δ*n*, then this sum generically has *n*+1 points and is unique. Probabilistically:

Proposition 1.13. *If X is finite, then any probability measure P* ∈ Pr(*X*) *may be written in a unique way as a* finite *mixture of extreme probability measures, viz.*

$$P = \sum\_{x \in X} t\_{x} P\_{x}. \tag{1.31}$$

*Proof.* Take *t<sub>x</sub>* = *P*({*x*}) in the sense of Definition 1.1, or, equivalently, *t<sub>x</sub>* = *E<sub>P</sub>*(δ<sub>*x*</sub>) in the sense of (1.9). To see that this decomposition is unique, use Proposition 1.11, i.e. ∂<sub>*e*</sub>Pr(*X*) ≅ *X*, in (1.30) to force *I* = *X* and apply both sides of (1.31) to δ<sub>*x*</sub>. □

The state space and the algebra of observables may also be defined in terms of each other. We start with the (re)construction of states from observables, where the following definition and proposition may leave a hybrid impression. The rationale behind our approach is that for many purposes it is easier to work with the *complex* algebra *C*(*X*), but on the other hand, compact convex sets are most naturally defined in terms of *real* vector spaces. Fortunately, it is easy to switch between the two: we already know how to obtain the real part *R*(*X*) from *C*(*X*), see (1.23), and conversely, *C*(*X*) is simply the complexification of the real vector space *R*(*X*).

Definition 1.14. *A* state *on C*(*X*) *is a linear map* ω : *C*(*X*) → ℂ *that satisfies:*

*1.* ω(*f*<sup>2</sup>) ≥ 0 *for each f* ∈ *C*(*X*) *with f*<sup>∗</sup> = *f (*positivity*);*

*2.* ω(1<sub>*X*</sub>) = 1 *(*normalization*).*

The first condition obviously comes down to ω(*f*) ≥ 0 whenever *f* ≥ 0 pointwise.

Equivalently, we may define a state on *R*(*X*) as a real-linear map ω<sub>ℝ</sub> : *R*(*X*) → ℝ that satisfies the very same conditions. Indeed, a state ω<sub>ℝ</sub> on *R*(*X*) defines a complex-linear map ω : *C*(*X*) → ℂ by ω(*f* + *ig*) = ω<sub>ℝ</sub>(*f*) + *i*ω<sub>ℝ</sub>(*g*), where *f*, *g* ∈ *R*(*X*). This map satisfies the same conditions of positivity and normalization. Conversely, ω may be restricted to the real part *R*(*X*) of *C*(*X*), so that there is no real (sic) difference between ω and ω<sub>ℝ</sub>. Hence we will use these interchangeably, often even dropping the suffix ℝ on ω. One advantage of this ability to switch is that a state ω on *C*(*X*) may be regarded as an element of the *real* vector space *R*(*X*)<sup>∗</sup>. Doing so shows that the terminology of Definitions 1.9 and 1.14 is consistent:

Theorem 1.15. *There is a bijective correspondence between states* ω *on C*(*X*) *and probability measures P on X, given by* ω ↔ *E<sub>P</sub>, cf.* (1.9) *and* (1.11)*. Therefore, as a subset of the (real) vector space R*(*X*)<sup>∗</sup> *of all (real-) linear maps from R*(*X*) *to* ℝ*, the set S*(*C*(*X*)) *of all states on C*(*X*) *coincides with the set* Pr(*X*) *of all probability measures on X. In particular, the state space S*(*C*(*X*)) *of C*(*X*) *is a compact convex set in R*(*X*)<sup>∗</sup> *(as a finite-dimensional vector space with its usual topology).*

*Proof.* Given a state ω, define a function *p* : *X* → ℝ by *p*(*x*) = ω(δ<sub>*x*</sub>). Since δ<sub>*x*</sub> ≥ 0 pointwise, positivity of ω yields *p*(*x*) ≥ 0. Noting that 1<sub>*X*</sub> = ∑<sub>*x*</sub> δ<sub>*x*</sub>, normalization then forces ∑<sub>*x*</sub> *p*(*x*) = 1, so that *p* is a probability distribution on *X*. Hence *P* ∈ Pr(*X*), where *P* is the probability measure corresponding to *p*. Conversely, *P* ∈ Pr(*X*) defines a map *E<sub>P</sub>* : *R*(*X*) → ℝ by (1.9), which is positive and normalized. Note that compactness and convexity of the set *S*(*C*(*X*)) in *R*(*X*)<sup>∗</sup> follow directly from its definition, i.e., even without knowing that it equals Pr(*X*). □
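The correspondence ω ↔ *E<sub>P</sub>* used in this proof can be illustrated by a small Python sketch. The three-point set `X`, the dictionary encoding of functions, and the helper names are assumptions made purely for illustration.

```python
from fractions import Fraction

X = ["a", "b", "c"]                      # hypothetical finite set

def expectation(p):
    """The state E_P: f -> sum_x p(x) f(x), with f given as a dict on X."""
    return lambda f: sum(p[x] * f[x] for x in X)

def delta(x):
    """The function delta_x in C(X)."""
    return {y: int(y == x) for y in X}

def distribution_of(omega):
    """Recover p(x) = omega(delta_x) from a state omega."""
    return {x: omega(delta(x)) for x in X}

p = {"a": Fraction(1, 4), "b": Fraction(3, 4), "c": Fraction(0)}
omega = expectation(p)                   # state defined by p
one_X = {x: 1 for x in X}                # the unit function 1_X
recovered = distribution_of(omega)       # should give back p
```

Passing from `p` to `omega` and back to `recovered` returns the original distribution, and `omega(one_X)` equals 1, in line with normalization.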

Consequently, we may refer to *S*(*C*(*X*)) as *the state space* of *C*(*X*) without any ambiguity, and we will always regard state spaces of (unital) C\*-algebras *A* (cf. Appendix C) as compact convex sets *S*(*A*), where in the present case *A* = *C*(*X*).

#### 1.3 Pure states and transition probabilities

For any C\*-algebra *A* (with unit), and hence in particular for *A* = *C*(*X*), elements of the boundary ∂*eS*(*A*) are called *pure states*, and we call

$$P(A) \equiv \partial\_e S(A) \tag{1.32}$$

the *pure state space* of *A*. States that are not pure are called *mixed*.

Theorem 1.16. *One has P*(*C*(*X*)) ∼= *X, in that the following map is an isomorphism:*

$$X \to P(C(X)), \ x \mapsto \omega\_{x}, \ \omega\_{x}(f) = f(x). \tag{1.33}$$

*Proof.* Combine Proposition 1.11 and Theorem 1.15. □

For finite *X* this isomorphism is merely meant as a bijection between sets (and for general compact Hausdorff spaces *X* it will be a homeomorphism of topological spaces), but we will now introduce some additional structure on pure state spaces that will enrich Theorem 1.16 to an isomorphism of so-called *sets with a transition probability*. This will be necessary in order to reconstruct the observables from the pure states, but it also clarifies the general probabilistic structure of physics (note that the following definition is unusual in probability theory!).

Definition 1.17. *1. A* transition probability *on a set X is a function*

$$
\tau: X \times X \to [0, 1] \tag{1.34}
$$

*that satisfies* τ(*x*, *y*) = 1 *iff x* = *y and* τ(*x*, *y*) = τ(*y*, *x*) *(symmetry).*

The simplest example of a transition probability (on any set *X*) is obviously

$$
\tau(x, y) = \delta\_{xy}.\tag{1.35}
$$

The point is that this transition probability may be derived from the classical C\* algebra of observables *C*(*X*) by the following formula (assuming *X* finite):

$$\delta\_{xy} = \inf \{ f(x) \mid f \in C(X), 0 \le f \le 1\_X, f(y) = 1 \}. \tag{1.36}$$

Indeed, for *x* = *y* this is a tautology, whereas for *x* ≠ *y* the infimum (which is zero) is attained by *f* = δ<sub>*y*</sub>. In terms of the pure state space *P*(*C*(*X*)), which is *isomorphic* to but not *equal* to *X*, cf. Theorem 1.16, this formula may be written as

$$\delta\_{xy} = \inf \{ \omega\_{x}(f) \mid f \in C(X), 0 \le f \le 1\_{C(X)}, \omega\_{y}(f) = 1 \}. \tag{1.37}$$
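As a sanity check of (1.36), one may evaluate the infimum by brute force over a finite grid of functions. This is only a sketch under an assumption of our own: the functions *f* are restricted to the values 0, 1/2, 1. Since this grid contains δ<sub>*y*</sub>, which attains the infimum, the restricted minimum coincides with the true infimum.

```python
from fractions import Fraction
from itertools import product

X = [0, 1, 2]                                   # hypothetical finite set
GRID = [Fraction(0), Fraction(1, 2), Fraction(1)]

def transition_probability(x, y):
    """Minimum of f(x) over grid-valued f with 0 <= f <= 1 and f(y) = 1."""
    candidates = (dict(zip(X, values))
                  for values in product(GRID, repeat=len(X)))
    return min(f[x] for f in candidates if f[y] == 1)

# Table of all values tau(x, y); it should be the Kronecker delta.
table = {(x, y): transition_probability(x, y) for x in X for y in X}
```

The resulting `table` has the entry 1 on the diagonal and 0 elsewhere.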

Furthermore (and this is the *real* point, so that we already have to mention it here, ahead of a more detailed treatment in the context of quantum mechanics), the right-hand side of (1.37) may be generalized to any finite-dimensional C\*-algebra *A* by

$$\tau^A(\omega,\omega') = \inf\{\omega(a) \mid a \in A, 0 \le a \le 1\_A, \omega'(a) = 1\},\tag{1.38}$$

where ω, ω′ ∈ *P*(*A*). Since (1.38) clearly generalizes (1.37), for *A* = *C*(*X*) we have

$$
\tau^{C(X)}(\omega\_{x}, \omega\_{y}) = \delta\_{xy}.\tag{1.39}
$$

Note that the symmetry property in Definition 1.17 is not obvious from (1.38), but in the classical case *A* = *C*(*X*) it is true by computation, and the same will hold in quantum theory. To motivate these definitions, we recall that *f* in (1.37), and likewise *a* in (1.38), are yes-no questions to the system, so that the transition probability τ<sup>*A*</sup>(ω, ω′) monitors to what extent the states ω and ω′ may be sharply distinguished by asking such questions. If they can, there should be some question *a* for which ω′(*a*) = 1 and ω(*a*) = 0, so that τ<sup>*A*</sup>(ω, ω′) = 0 (if ω ≠ ω′, of course). As we have seen, in the classical case this can always be done. However, we shall see this is no longer the case in quantum mechanics, where pure states may be thus distinguished iff they correspond to orthogonal unit vectors in Hilbert space. Further motivation for the expression (1.38) is *post hoc*, as it turns out to allow a reconstruction of the vector space of observables *A*, supplemented by the part of its algebraic structure that determines its logical and probabilistic structure (viz. the ability to form squares, *a* ↦ *a*<sup>2</sup>) from *P*(*A*) *with its associated transition probability*. See Theorem C.179.

First, we develop some theory that puts both classical and quantum mechanics into a more general setting. Notwithstanding the formal incorporation of the former, the underlying Hilbert space thinking will be obvious throughout.

#### Definition 1.18. *Let* (*X*, τ) *be a set with a transition probability.*


$$\sum\_{u \in B} \tau(x, u) = 1. \tag{1.40}$$

*A basis of a subset S* ⊂ *X is an orthonormal family B* ⊂ *S such that* (1.40) *holds for each x* ∈ *S. Relative to such a basis B of S, we define* τ<sub>*S*</sub> : *X* → ℝ *by*

$$\tau\_{S}(x) = \sum\_{u \in B} \tau(x, u). \tag{1.41}$$

*As a special case, for S* = {*u*} *we write* τ<sub>{*u*}</sub> ≡ τ<sub>*u*</sub>*, so that*

$$
\tau\_u(x) = \tau(x, u).\tag{1.42}
$$

*3. The* orthocomplement *S*<sup>⊥</sup> *of some subset S* ⊂ *X is defined as*

$$S^{\perp} = \{ y \in X \mid \tau(x, y) = 0 \ \forall x \in S \}. \tag{1.43}$$


*6. An* observable *for the pair* (*X*, τ) *is a bounded function f* : *X* → ℝ *of the form*

$$f = \sum\_{i} c\_{i} \cdot \tau\_{y\_{i}}, \ c\_{i} \in \mathbb{R}, \ y\_{i} \in X. \tag{1.44}$$

*The real vector space of such observables is called* ℓ<sup>∞</sup>(*X*, τ)*.*

*7. A* spectral resolution *of an observable f* ∈ ℓ<sup>∞</sup>(*X*, τ) *is a decomposition*

$$f = \sum\_{\lambda} \lambda \cdot \tau\_{S\_{\lambda}},\tag{1.45}$$

*where* (*S*<sub>λ</sub>)<sub>λ</sub> *is a resolution of the identity and each* λ ∈ ℝ *occurs at most once.*

In the present section *X* is finite, whilst in the following section on quantum mechanics on finite-dimensional Hilbert spaces at least all bases will be finite, so that there are no convergence issues. In general, *B* may be infinite, in which case the sum in (1.40) is defined as the least upper bound of all finite partial sums, and all sums in Definition 1.18 are defined pointwise (i.e., in *x*). In that case, eq. (1.45) may need to be adapted through limit constructions. Furthermore, one may worry about the basis-dependence of τ<sub>*S*</sub> in (1.41), but fortunately it turns out that in all sets with a transition probability that arise as pure state spaces defined by C\*-algebras according to (1.38), the function τ<sub>*S*</sub> is independent of the basis *B* whenever *S* is orthoclosed. In that case, spectral resolutions exist and are unique, and one may turn the real vector space ℓ<sup>∞</sup>(*X*, τ) of part 6 into a *Jordan algebra* by defining a product ◦ through

$$f^2 = \sum\_{\lambda} \lambda^2 \cdot \tau\_{S\_{\lambda}};\tag{1.46}$$

$$f \circ \mathbf{g} = \frac{1}{4}((f+\mathbf{g})^2 - (f-\mathbf{g})^2). \tag{1.47}$$

In the classical case this yields the pointwise product (1.20), whereas in quantum mechanics it recovers the anti-commutator (with a factor 1/2, i.e., *a* ◦ *b* = (*ab* + *ba*)/2). Both are examples of *Jordan products* (cf. §C.25), i.e., commutative products ◦ satisfying the curious axiom (C.619).
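For the classical case, the recipe (1.46) - (1.47) can be checked directly in a short Python sketch. We assume here that τ<sub>*S*</sub> is the indicator function of *S*, so that a function *f* on a finite set has the spectral resolution *f* = ∑<sub>λ</sub> λ · 1<sub>{*f* = λ}</sub>; the set `X` and the sample functions are hypothetical.

```python
from fractions import Fraction

X = ["a", "b", "c"]                        # hypothetical finite set

def square(f):
    """f^2 computed from the spectral resolution, as in (1.46)."""
    spectrum = set(f.values())
    # At each x exactly one indicator 1_{f = lam} is nonzero.
    return {x: sum(lam ** 2 for lam in spectrum if f[x] == lam) for x in X}

def jordan(f, g):
    """f o g = ((f + g)^2 - (f - g)^2) / 4, as in (1.47)."""
    plus = square({x: f[x] + g[x] for x in X})
    minus = square({x: f[x] - g[x] for x in X})
    return {x: Fraction(plus[x] - minus[x], 4) for x in X}

f = {"a": 2, "b": -1, "c": 0}
g = {"a": 3, "b": 5, "c": 7}
pointwise = {x: f[x] * g[x] for x in X}    # the pointwise product
```

For these sample functions `jordan(f, g)` agrees with `pointwise`, as the polarization identity in (1.47) predicts.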

All this trivializes if τ = τ<sup>*C*(*X*)</sup> is given by (1.35), where *X* need not even be finite:


$$\ell^{\infty}(X,\tau) = R(X) \equiv C(X,\mathbb{R});\tag{1.48}$$

7. The spectral resolution (1.45) of *f* is given (analogously to operator theory) by

$$f = \sum\_{\lambda \in \sigma(f)} \lambda \cdot \tau\_{f=\lambda},\tag{1.49}$$

cf. Definition 1.1.5. In particular, spectral resolutions in (1.48) are unique.

#### 1.4 The logic of classical mechanics

Whatever one's route to *C*(*X*, ℝ) as the algebra of observables, i.e. either as a starting point or as a derived concept as in (1.48), it determines the logical structure of classical mechanics (we here restrict ourselves to propositional logic). According to the general scheme reviewed in §D.2, apart from the usual logical connectives ¬, ∧, ∨, and → for *not*, *and*, *or*, and *implies*, a propositional theory needs a set Σ<sub>*X*</sub> of *atomic propositions*. These are provided by *C*(*X*, ℝ), and Σ<sub>*X*</sub> consists of all expressions *f* ∈ Δ (we expect no confusion between this notation for both *propositions* in logic and *events* in probability theory), where *f* : *X* → ℝ is a function, and Δ is some subset of ℝ. As we shall see, *f* ∈ Δ is always false if Δ ∩ σ(*f*) = ∅, so we might as well assume that Δ ⊆ σ(*f*). We write *f* = λ for *f* ∈ {λ}. From these elementary propositions, propositions are constructed inductively using the iterative rules of propositional logic (see §D.2). This produces a set *B<sub>X</sub>* ≡ *B*<sub>Σ<sub>*X*</sub></sub> of propositions.

Of course, there are logical relations between our atomic propositions (and hence between elements of *B<sub>X</sub>*). For example, if Δ ⊂ Δ′, then *f* ∈ Δ should imply *f* ∈ Δ′. Such relations may be formulated as axioms of some propositional theory T<sub>*X*</sub> describing the logic of classical mechanics. These axioms take the following form:

$$(f \in \Gamma) \to (g \in \Delta) \text{ iff } f^{-1}(\Gamma) \subseteq g^{-1}(\Delta). \tag{1.50}$$

This may also be formulated through the notion of *semantic entailment*. For each *x* ∈ *X*, we define a valuation *V<sub>x</sub>* : Σ<sub>*X*</sub> → {0,1} (cf. §D.2) by

$$V\_{x}(f \in \Delta) = 1 \text{ iff } f(x) \in \Delta,\tag{1.51}$$

extended to a map *V<sub>x</sub>* : *B<sub>X</sub>* → {0,1} through the recursive use of truth tables. Defining the semantic entailment relation ⊨<sub>*X*</sub> on *B<sub>X</sub>* by α ⊨<sub>*X*</sub> β iff *V<sub>x</sub>*(α) = 1 implies *V<sub>x</sub>*(β) = 1 for all *x* ∈ *X*, it is easy to see that α → β as defined in (1.50) iff α ⊨<sub>*X*</sub> β.
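For atomic propositions, the agreement between (1.50) and semantic entailment may be checked mechanically. The following Python sketch does so for a hypothetical six-point set *X* and two sample functions; all names are illustrative choices of ours.

```python
X = range(6)                                   # hypothetical finite phase space

def preimage(f, subset):
    """f^{-1}(subset) as a set of points of X."""
    return {x for x in X if f(x) in subset}

def valuation(x, f, subset):
    """V_x(f in subset) as in (1.51)."""
    return 1 if f(x) in subset else 0

def entails(f, gamma, g, delta):
    """(f in gamma) |=_X (g in delta): truth of the first forces the second."""
    return all(valuation(x, g, delta) == 1
               for x in X if valuation(x, f, gamma) == 1)

f = lambda x: x % 3          # sample observables
g = lambda x: x // 2

# Each pair (gamma, delta) is tested both semantically and via preimages.
cases = [({0}, {0, 1}), ({1}, {0, 1}), ({0, 1, 2}, {0, 1, 2}), ({2}, {2})]
agreement = [entails(f, gamma, g, delta)
             == (preimage(f, gamma) <= preimage(g, delta))
             for gamma, delta in cases]
```

Every entry of `agreement` is `True`: entailment holds exactly when the inclusion of preimages in (1.50) does.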

In order to compute the ensuing Lindenbaum algebra *L<sub>X</sub>* ≡ *L*<sub>Σ<sub>*X*</sub></sub>, we note that

$$(f \in \Gamma) \leftrightarrow (g \in \Delta) \text{ iff } f^{-1}(\Gamma) = g^{-1}(\Delta). \tag{1.52}$$

Writing ∼<sub>*X*</sub> for ∼<sub>T<sub>*X*</sub></sub> (which is the equivalence relation given by ⊨<sub>*X*</sub>, too), we find

$$(f \in \Delta) \sim\_X (1\_{f^{-1}(\Delta)} = 1),\tag{1.53}$$

where we recall that 1<sub>*A*</sub> is the characteristic (or indicator) function of *A*. Using the truth tables for ∧ and for ¬, we also obtain (in terms of the complement Δ<sup>*c*</sup> = ℝ\Δ):

$$(f \in \Gamma) \land (g \in \Delta) \sim\_X (1\_{f^{-1}(\Gamma) \cap g^{-1}(\Delta)} = 1);\tag{1.54}$$

$$(\neg f \in \Delta) \sim\_X (f \in \Delta^c) \sim\_X (1\_{f^{-1}(\Delta^c)} = 1). \tag{1.55}$$

Finally, the truth tables yield logical (and hence semantic) equivalences like

$$
\alpha \lor \beta \sim\_X \neg(\neg \alpha \land \neg \beta). \tag{1.56}
$$

Combining the specific and the general equivalences (1.53) - (1.56), we have:

Lemma 1.19. *Any proposition in B<sub>X</sub> is logically (and semantically) equivalent (relative to X) to one of the form* 1<sub>*U*</sub> = 1*, for some event U* ⊂ *X. Furthermore,*

$$(\neg 1\_U = 1) \sim\_X (1\_{U^c} = 1);\tag{1.57}$$

$$(1\_U = 1) \land (1\_V = 1) \sim\_X (1\_{U \cap V} = 1);\tag{1.58}$$

$$(1\_U = 1) \vee (1\_V = 1) \sim\_X (1\_{U \cup V} = 1). \tag{1.59}$$

Theorem 1.20. *The Lindenbaum algebra L<sub>X</sub> is isomorphic (as a Boolean algebra) to the power set* P(*X*) *of X under the map* ϕ : *L<sub>X</sub>* → P(*X*) *induced by*

$$\varphi([f \in \Delta]\_X) = f^{-1}(\Delta). \tag{1.60}$$

*In particular, the logical connectives* ¬*,* ∧ *and* ∨ *(descended to L<sub>X</sub>) turn into set-theoretic complementation* (−)<sup>*c*</sup>*, intersection* ∩*, and union* ∪*, respectively, in that*

$$\varphi([\neg \alpha]\_{X}) = \varphi([\alpha]\_{X})^c;\tag{1.61}$$

$$\varphi([\alpha \wedge \beta]\_X) = \varphi([\alpha]\_X) \cap \varphi([\beta]\_X);\tag{1.62}$$

$$\varphi([\alpha \vee \beta]\_X) = \varphi([\alpha]\_X) \cup \varphi([\beta]\_X),\tag{1.63}$$

*and* ϕ *maps the partial order* ≤ *on L<sub>X</sub> into set-theoretic inclusion* ⊆*, i.e.,*

$$
[\alpha]\_X \le [\beta]\_X \text{ iff } \varphi([\alpha]\_X) \subseteq \varphi([\beta]\_X). \tag{1.64}
$$

This is immediate from Lemma 1.19. Interestingly, the *Boolean* algebra structure just derived as the governor of the (propositional) logic of classical mechanics may be reformulated in terms of the *Jordan* algebraic structure (1.46) - (1.47) of ℓ<sup>∞</sup>(*X*, τ), or, when *X* is finite, of the C\*-algebra of observables *C*(*X*) itself:


$$e \le f \text{ iff } e \circ f = e;\tag{1.65}$$

$$
\neg e = 1\_X - e;\tag{1.66}
$$

$$e \wedge f = e \circ f;\tag{1.67}$$

$$e \lor f = e + f - e \circ f. \tag{1.68}$$

Indeed, in this case ◦ is pointwise multiplication (1.20). Using 1<sub>*U*</sub> · 1<sub>*V*</sub> = 1<sub>*U*∩*V*</sub> yields (1.67), (1.65) comes down to *U* ⊆ *V* iff *U* ∩ *V* = *U*, (1.66) is 1<sub>*X*</sub> − 1<sub>*U*</sub> = 1<sub>*U*<sup>*c*</sup></sub>, and (1.68) follows by writing its right-hand side as 1<sub>*X*</sub> − (1<sub>*X*</sub> − *e*) ∧ (1<sub>*X*</sub> − *f*).
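The identities (1.65) - (1.68) are easily tested on indicator functions. In the following Python sketch, indicator functions on a hypothetical five-point set are stored as tuples of zeros and ones; the function names are our own.

```python
X = range(5)                                   # hypothetical finite set

def indicator(U):
    """The indicator function 1_U as a tuple indexed by X."""
    return tuple(1 if x in U else 0 for x in X)

def meet(e, f):      # (1.67): e AND f is the pointwise product
    return tuple(a * b for a, b in zip(e, f))

def neg(e):          # (1.66): NOT e is 1_X - e
    return tuple(1 - a for a in e)

def join(e, f):      # (1.68): e OR f is e + f - e*f
    return tuple(a + b - a * b for a, b in zip(e, f))

def leq(e, f):       # (1.65): e <= f iff e*f = e
    return meet(e, f) == e

U, V = {0, 1, 2}, {2, 3}
e, f = indicator(U), indicator(V)
```

With these definitions `meet(e, f)`, `join(e, f)` and `neg(e)` are the indicator functions of *U* ∩ *V*, *U* ∪ *V* and *U*<sup>*c*</sup>, respectively.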

#### 1.5 The GNS-construction for *C*(*X*)

As a bridge from classical to quantum mechanics (as well as a good exercise), we finally inject some Hilbert space theory into classical physics by discussing the *GNS-construction* of C\*-algebra theory for the special case of *C*(*X*), where *X* remains finite. In general, for each state ω on a C\*-algebra *A*, the GNS-construction canonically yields a *Hilbert space H*<sub>ω</sub> (which is finite-dimensional for *A* = *C*(*X*) with finite *X*) and a *representation* of *A* on *H*<sub>ω</sub>, in the sense of a (complex) linear map

$$
\pi\_{\omega} : A \to B(H\_{\omega}) \tag{1.69}
$$

that satisfies

$$
\pi\_{\omega}(ab) = \pi\_{\omega}(a)\pi\_{\omega}(b);\tag{1.70}
$$

$$
\pi\_{\omega}(a^\*) = \pi\_{\omega}(a)^\*.\tag{1.71}
$$

Furthermore, *H*<sub>ω</sub> contains a special *unit vector* Ω<sub>ω</sub> that is *cyclic* for π<sub>ω</sub> in that

$$
\pi\_{\omega}(A)\Omega\_{\omega} \equiv \{\pi\_{\omega}(a)\Omega\_{\omega}, a \in A\} = H\_{\omega},\tag{1.72}
$$

at least in the relevant case where dim(*H*<sub>ω</sub>) < ∞; otherwise, the left-hand side is merely dense in *H*<sub>ω</sub> and one needs to take the (norm) closure to obtain *H*<sub>ω</sub>. Furthermore, Ω<sub>ω</sub> realizes the state ω as a quantum-mechanical expectation value by

$$
\omega(a) = \langle \Omega\_{\omega}, \pi\_{\omega}(a) \Omega\_{\omega} \rangle\_{H\_{\omega}}.\tag{1.73}
$$

Given ω ∈ *S*(*A*), the GNS-construction starts with the vector spaces

$$N\_{\omega} = \{ a \in A \mid \omega(a^\*a) = 0 \};\tag{1.74}$$

$$H\_{\omega} = A / N\_{\omega}.\tag{1.75}$$

Now, if *b* ∈ *N*<sub>ω</sub> and *a* ∈ *A*, then *ab* ∈ *N*<sub>ω</sub>, because of the important inequality

$$
\omega(b^\*a^\*ab) \le \|a\|^2 \omega(b^\*b). \tag{1.76}
$$

This is true for any C\*-algebra *A*, but below we prove it only for our example. Assuming (1.76) for the moment, the action of *A* on itself by left multiplication descends to a well-defined action on *H*<sub>ω</sub>, which we call π<sub>ω</sub>. In other words, if *b*<sub>ω</sub> ∈ *H*<sub>ω</sub> is the image of *b* ∈ *A* under the canonical projection *A* → *A*/*N*<sub>ω</sub>, then

$$
\pi\_{\omega}(a)b\_{\omega} = (ab)\_{\omega}.\tag{1.77}
$$

Crucially, this vector space *H*<sub>ω</sub> is equipped with a canonical inner product

$$
\langle a\_{\omega}, b\_{\omega} \rangle = \omega(a^\*b). \tag{1.78}
$$

Indeed, this form is well defined, and is positive definite because ω is a state.


In general, *H*<sup>ω</sup> as defined by (1.75) with inner product (1.78) is merely a pre-Hilbert space, which needs to be completed in the associated norm, and it takes some effort to check that the operators defined by (1.77) are bounded. In our example, on the other hand, *H*<sup>ω</sup> is finite-dimensional and hence complete. In any case, it is easy to verify the properties (1.70) - (1.73), whilst (1.72) holds with the unit 1 = 1*H*.

We now prove (1.76) for *A* = *C*(*X*). From Theorem 1.15 we have ω = *E<sub>P</sub>*, and by (1.9) and (1.24), the inequality (1.76) comes down to the obviously correct result

$$\sum\_{\mathbf{x}} |f(\mathbf{x})g(\mathbf{x})|^2 \le \|f\|\_{\infty}^2 \sum\_{\mathbf{x}} |g(\mathbf{x})|^2. \tag{1.79}$$

Writing *NEP* ≡ *NP*, we may also check directly that if *g* ∈ *NP* and *f* ∈ *C*(*X*), then *f g* ∈ *NP*. Indeed, in terms of the set supp(*P*) ⊆ *X* defined by

$$\text{supp}\,(P) = \{x \in X \mid p(x) > 0\},\tag{1.80}$$

we have

$$N\_P = \{ f \in C(X) \mid f(\mathbf{x}) = \mathbf{0} \,\forall \mathbf{x} \in \text{supp}\,(P) \},\tag{1.81}$$

and clearly *g* = 0 on supp(*P*) implies *f g* = 0 on supp(*P*). We now compute *HP* and π*P*. From (1.81) we have *f* − *g* ∈ *NP* and hence *f* ∼ *g* iff *f*(*x*) = *g*(*x*) for all *x* ∈ supp(*P*), where ∼ is the equivalence relation whose equivalence classes *fP* define elements of *HP* = *C*(*X*)/*NP*. Hence *fP* is simply the restriction of *f* to supp(*P*), and

$$H\_P = \ell^2(X, P) \tag{1.82}$$

is the Hilbert space that consists of these restrictions, with inner product

$$
\langle f\_P, \mathbf{g}\_P \rangle = \sum\_{\mathbf{x} \in \text{supp}(P)} p(\mathbf{x}) \overline{f(\mathbf{x})} \mathbf{g}(\mathbf{x}). \tag{1.83}
$$

The representation (1.77) then trivially gives

$$
\pi\_P(f)g\_P = f\_P g\_P,\tag{1.84}
$$

so that π<sub>*P*</sub>(*f*) is the *multiplication operator* defined by *f* on ℓ<sup>2</sup>(*X*, *P*). In functional analysis one often denotes elements *g<sub>P</sub>* ∈ ℓ<sup>2</sup>(*X*, *P*) by the functions *g* themselves, and similarly writes π<sub>*P*</sub>(*f*) as *f*, so that (1.84) simply reads π<sub>*P*</sub>(*f*)*g* = *fg*.

The operator norm of π<sub>*P*</sub>(*f*) is easily computed to be

$$\|\pi\_P(f)\| = \sup\{|f(x)|, x \in \text{supp}(P)\} = \|f\_{|\text{supp}(P)}\|\_{\infty}.\tag{1.85}$$

Indeed, the bound ‖π<sub>*P*</sub>(*f*)‖ ≤ ‖*f*<sub>|supp(*P*)</sub>‖<sub>∞</sub> is immediate from the definition

$$\|\pi\_P(f)\| = \sup\{ \|\pi\_P(f)g\_P\|, g\_P \in H\_P, \|g\_P\| = 1 \},\tag{1.86}$$

and equality in this bound follows from applying the operator π<sub>*P*</sub>(*f*) to the function *g* = 1<sub>*U*</sub>, where *U* ⊆ supp(*P*) is any nonempty set on which |*f*| attains its maximum ‖*f*<sub>|supp(*P*)</sub>‖<sub>∞</sub>.
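The ingredients of the GNS-construction for *C*(*X*) may be made concrete in a brief Python sketch. The three-point set, the distribution `p` (which vanishes at one point), and all function names are hypothetical choices of ours; functions are stored as dictionaries on *X*.

```python
from fractions import Fraction

X = ["a", "b", "c"]                                    # hypothetical finite set
p = {"a": Fraction(1, 3), "b": Fraction(2, 3), "c": Fraction(0)}
support = [x for x in X if p[x] > 0]                   # supp(P), cf. (1.80)

def omega(f):
    """The state E_P on C(X)."""
    return sum(p[x] * f[x] for x in X)

def in_null_space(f):
    """f lies in N_P iff omega(f* f) = 0, cf. (1.74)."""
    return omega({x: abs(f[x]) ** 2 for x in X}) == 0

def restrict(f):
    """The class f_P, realized as the restriction of f to supp(P)."""
    return {x: f[x] for x in support}

def inner(f, g):
    """Inner product (1.83) on H_P."""
    return sum(p[x] * f[x].conjugate() * g[x] for x in support)

def operator_norm(f):
    """Norm (1.85) of the multiplication operator pi_P(f)."""
    return max(abs(f[x]) for x in support)

f = {"a": 1, "b": -2, "c": 100}
n = {"a": 0, "b": 0, "c": 5}                           # vanishes on supp(P)
shifted = {x: f[x] + n[x] for x in X}                  # same class as f
```

Here `n` lies in *N<sub>P</sub>*, so `f` and `shifted` define the same vector of *H<sub>P</sub>*, and the value of `f` outside the support (100 in this example) has no influence on `operator_norm(f)`.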

#### Notes

#### §1.1. Basic constructions of probability theory

#### §1.2. Classical observables and states

For (advanced) treatments of convexity theory and probability theory in contexts relevant to mathematical physics we recommend Israel (1979), Alfsen & Shultz (2001), and Simon (2001).

#### §1.3. Pure states and transition probabilities

Transition probabilities (in the abstract sense meant here) were introduced by von Neumann, but his manuscript from 1937 was only published in 1981 as von Neumann (1981/1937). This remarkable paper has remained largely unused (or even unknown) in both mathematical physics and operator algebras; Mielnik (1968), Shultz (1982), and Landsman (1996, 1997) are exceptions. An extensive discussion with further references may be found in Landsman (1998a).

#### §1.4. The logic of classical mechanics

Unless one counts Boole (1847), it seems that the logical analysis of classical mechanics was initiated by the famous paper of Birkhoff & von Neumann (1936), which was primarily concerned with quantum logic (cf. §2.10). Our use of semantic implication (also in the quantum case) was inspired by Rédei (1998).

#### §1.5. The GNS-construction for *C*(*X*)

See §C.12 for the GNS-construction in general.

## Chapter 2 Quantum mechanics on a finite-dimensional Hilbert space

The quantum analogue of a finite set *X* (in its role as a configuration space in classical mechanics) is the finite-dimensional Hilbert space ℓ<sup>2</sup>(*X*), by which we mean the vector space of functions ψ : *X* → ℂ, equipped with the inner product

$$
\langle \psi, \varphi \rangle = \sum\_{x \in X} \overline{\psi(x)} \varphi(x). \tag{2.1}
$$

There is no issue of convergence here, but later on we will use the same notation for infinite sets *X*, where ℓ<sup>2</sup>(*X*) is restricted to those functions (i.e. sequences) for which ∑<sub>*x*∈*X*</sub> |ψ(*x*)|<sup>2</sup> < ∞ (which also guarantees convergence of the sum in (2.1)).

If *X* ≅ <u>*n*</u> as sets (i.e., |*X*| = *n*), we have a unitary isomorphism of Hilbert spaces

$$\ell^2(\underline{n}) \cong \mathbb{C}^n,\tag{2.2}$$

through the map ψ ↦ (ψ(1),...,ψ(*n*)), where ℂ<sup>*n*</sup> has the standard inner product ⟨*w*, *z*⟩ = ∑<sub>*i*</sub> *w̄<sub>i</sub>z<sub>i</sub>*. In particular, the function δ<sub>*k*</sub> ∈ ℓ<sup>2</sup>(<u>*n*</u>), defined by δ<sub>*k*</sub>(*l*) = δ<sub>*kl*</sub>, is mapped to the *k*'th standard basis vector *u<sub>k</sub>* ≡ |*k*⟩ of ℂ<sup>*n*</sup>, i.e., *u*<sub>1</sub> = (1,0,...,0), etc. In the special case *X* = <u>*N*</u><sup>Λ</sup> considered in Chapter 1, we have |*X*| = *N*<sup>|Λ|</sup> and hence

$$\ell^2(\underline{N}^{\Lambda}) \cong \mathbb{C}^{N^{|\Lambda|}} \cong \left(\mathbb{C}^{N}\right)^{\otimes |\Lambda|} = \bigotimes\_{\mathsf{n} \in \Lambda} \mathbb{C}^{N}\_{\mathsf{n}} \equiv \bigotimes\_{\Lambda} \mathbb{C}^{N},\tag{2.3}$$

where ℂ<sup>*N*</sup><sub>n</sub> = ℂ<sup>*N*</sup> for each n ∈ Λ, so that the suffix n merely labels which copy of ℂ<sup>*N*</sup> is meant (see §C.13 for tensor products of Hilbert spaces). Explicitly, a canonical unitary isomorphism ℓ<sup>2</sup>(<u>*N*</u><sup>Λ</sup>) → ⊗<sub>Λ</sub> ℂ<sup>*N*</sup> is given by linear extension of the map

$$
\delta\_x \mapsto \bigotimes\_{\mathsf{n} \in \Lambda} u\_{x(\mathsf{n})}\,,\tag{2.4}
$$

where *x* : Λ → <u>*N*</u> and hence *u*<sub>*x*(n)</sub> ∈ ℂ<sup>*N*</sup>. Thus elements of the tensor product ⊗<sub>Λ</sub> ℂ<sup>*N*</sup> may be seen as wave-functions on spin configuration space (and *vice versa*). In particular, elementary tensor products of basis vectors in ⊗<sub>Λ</sub> ℂ<sup>*N*</sup> correspond to wave-functions in ℓ<sup>2</sup>(<u>*N*</u><sup>Λ</sup>) that are δ-peaked at some 'classical' spin configuration.
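The map (2.4) can be written out in coordinates. The following Python sketch assumes three sites and two spin values, labels spin values from 0 rather than from 1, and realizes the tensor product of coordinate vectors as a Kronecker product; these conventions and all names are our own.

```python
from itertools import product

N = 2                     # hypothetical number of spin values per site
SITES = 3                 # hypothetical number of sites in Lambda

def basis_vector(k, dim):
    """Standard basis vector u_k of C^dim, with k counted from 0."""
    return [1 if i == k else 0 for i in range(dim)]

def kron(v, w):
    """Kronecker product of two coordinate vectors."""
    return [a * b for a in v for b in w]

def image_of_delta(x):
    """Image of delta_x under (2.4): the tensor product of the u_{x(n)}."""
    vec = [1]
    for value in x:
        vec = kron(vec, basis_vector(value, N))
    return vec

configs = list(product(range(N), repeat=SITES))      # all configurations x
images = [image_of_delta(x) for x in configs]
```

Each configuration is sent to a distinct standard basis vector of ℂ<sup>8</sup>, so the linear extension of the map carries an orthonormal basis to an orthonormal basis, as a unitary isomorphism should.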

#### 2.1 Quantum probability theory and the Born rule

In preparation for this chapter, the reader would do well to review Appendix A.

The probabilistic setting of quantum mechanics is given by the following counterpart of Definition 1.1 (from which conditional probabilities are lacking, though).

Definition 2.1. *Let H be a finite-dimensional Hilbert space.*


$$\text{Tr}(\rho) = 1.\tag{2.5}$$

*We denote the set of all density operators on H by* D(*H*)*.*


Being positive, a density matrix ρ is self-adjoint, so by Theorem A.10, notably (A.40), and Definition 2.1.2 we have

$$\rho = \sum\_{i} p\_{i} |\upsilon\_{i}\rangle\langle\upsilon\_{i}|, \ p\_{i} > 0, \ \sum\_{i} p\_{i} = 1,\tag{2.6}$$

where the (υ<sub>*i*</sub>) form an orthonormal set in *H* and |υ<sub>*i*</sub>⟩⟨υ<sub>*i*</sub>| is the (orthogonal) projection on the one-dimensional subspace ℂ·υ<sub>*i*</sub>. As in the classical case, one special class of density operators and one special class of random variables stand out:

• Each *unit vector* ψ ∈ *H* defines a density operator

$$
\rho\_{\psi} \equiv e\_{\psi} = |\psi\rangle\langle\psi|,\tag{2.7}
$$

i.e., the (orthogonal) projection *e*<sub>ψ</sub> on the one-dimensional subspace ℂ·ψ. A basis (which by convention always means an *orthonormal* basis) of eigenvectors of ρ<sub>ψ</sub> consists of υ<sub>1</sub> = ψ itself, supplemented by any basis (υ<sub>2</sub>,...,υ<sub>dim(*H*)</sub>) of the orthogonal complement of ℂ·ψ. The corresponding probabilities in (2.6) are evidently *p*<sub>1</sub> = 1 and *p<sub>i</sub>* = 0 for all *i* > 1.

• Each *quantum event L* ⊂ *H* defines the corresponding projection *e<sub>L</sub>* (which is self-adjoint, i.e. a random variable): if (υ<sub>*j*</sub>) is a basis of *L*, then *e<sub>L</sub>* = ∑<sub>*j*</sub> |υ<sub>*j*</sub>⟩⟨υ<sub>*j*</sub>|. If *L* = *H* then *e<sub>L</sub>* = 1 with σ(*e<sub>L</sub>*) = {1}. If *L* = {0} then *e<sub>L</sub>* = 0 with σ(*e<sub>L</sub>*) = {0}. In all other cases, i.e. for proper nonzero subspaces *L*, one has σ(*e<sub>L</sub>*) = {0,1}.

Conversely, any self-adjoint operator *a* with spectrum σ(*a*) ⊆ {0,1} is given by *a* = *e<sub>L</sub>* for some subspace *L* ⊆ *H*; just take *L* = {ψ ∈ *H* | *a*ψ = ψ}. Such operators correspond to yes-no questions to the system and lie at the basis of the logical interpretation of quantum theory due to Birkhoff and von Neumann; see §2.10.

The following quantum analogue of Theorem 1.2 is based on Theorem A.10.

Theorem 2.2. *A density operator* ρ *on H and a self-adjoint operator a* : *H* → *H jointly yield a probability distribution pa on the spectrum* σ(*a*) *by the* Born rule

$$p\_a(\lambda) = \text{Tr}\,(\rho e\_{\lambda}).\tag{2.8}$$

*The associated probability measure Pa is given at* Δ ⊆ σ(*a*) *by (cf.* (A.42)*)*

$$P\_a(\Delta) = \text{Tr}\,(\rho e\_\Delta). \tag{2.9}$$

*Proof.* Positivity of the numbers *p<sub>a</sub>*(λ) follows by taking the trace over a basis of eigenvectors υ<sub>*i*</sub> of ρ, with corresponding eigenvalues *p<sub>i</sub>* ≥ 0. This yields

$$\operatorname{Tr}(\rho e\_{\lambda}) = \sum\_{i} p\_i \left\| e\_{\lambda} \upsilon\_i \right\|^2 \ge 0.$$

Eqs. (A.38) and (2.5) then give ∑<sub>λ</sub> *p<sub>a</sub>*(λ) = 1. Eq. (2.9) follows from the equality *P<sub>a</sub>*(Δ) = ∑<sub>λ∈Δ</sub> *p<sub>a</sub>*(λ), cf. (1.2), and (A.42). □
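A minimal numerical illustration of the Born rule (2.8) on *H* = ℂ<sup>2</sup> may be helpful. In the Python sketch below the observable is the first Pauli matrix, whose spectral projections are entered by hand, and `rho` is a sample density matrix; these choices, like the helper names, are assumptions made for illustration only.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

half = Fraction(1, 2)
sigma_x = [[0, 1], [1, 0]]                  # eigenvalues +1 and -1
projections = {
    +1: [[half, half], [half, half]],       # projection onto (1, 1)/sqrt(2)
    -1: [[half, -half], [-half, half]],     # projection onto (1, -1)/sqrt(2)
}
rho = [[Fraction(3, 4), Fraction(1, 4)],
       [Fraction(1, 4), Fraction(1, 4)]]    # positive with trace one

# Born probabilities p_a(lambda) = Tr(rho e_lambda) for a = sigma_x.
born = {lam: trace(matmul(rho, e)) for lam, e in projections.items()}
mean = sum(lam * prob for lam, prob in born.items())
```

The two Born probabilities are nonnegative and add up to one, and `mean` coincides with the trace of `rho` times `sigma_x`.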

In particular, if ρ = ρψ, writing *p* ψ *<sup>a</sup>* for the associated probability, (2.8) yields

$$p\_a^{\Psi}(\lambda) = \langle \Psi, e\_{\lambda}\Psi \rangle = ||e\_{\lambda}\Psi||^2. \tag{2.10}$$

If in addition λ ∈ σ(*a*) is non-degenerate, so that *e*<sup>λ</sup> = |υλ υλ | for some unit vector υλ with *a*υλ = λ υλ , then the Born rule (2.9) assumes its original form

$$p\_a^{\Psi}(\lambda) = |\langle \Psi, \mathfrak{v}\_{\lambda} \rangle|^2. \tag{2.11}$$

Specializing (2.10) to the random variable *a* = *eL* defined by an event *L* ⊂ *H* yields

$$p\_{e\_L}^{\Psi}(1) = \|e\_L \Psi\|^2. \tag{2.12}$$

If *L* = C·ϕ is one-dimensional, too, in which case we write *p*<sup>ψ</sup><sub>*e*<sub>ϕ</sub></sub> ≡ *p*<sup>ψ</sup><sub>ϕ</sub>, we have

$$p\_{\boldsymbol{\Phi}}^{\boldsymbol{\Psi}}(1) = |\langle \boldsymbol{\Psi}, \boldsymbol{\Phi} \rangle|^2;\tag{2.13}$$

note the following equality of probability distributions on σ(*e*ϕ) = σ(*e*ψ) = {0,1}:

$$p\_{\boldsymbol{\Phi}}^{\boldsymbol{\Psi}}(1) = p\_{\boldsymbol{\Psi}}^{\boldsymbol{\Phi}}(1). \tag{2.14}$$

Expectation values and variances may be defined as in the classical case, viz.

$$E\_{\rho}(a) = \text{Tr}(\rho a);\tag{2.15}$$

$$\Delta\_{\rho}(a) = E\_{\rho}(a^2) - E\_{\rho}(a)^2. \tag{2.16}$$

Similar to (1.11), we may also write the expectation value as

$$E\_{\rho}(a) = \sum\_{\lambda \in \sigma(a)} \lambda \cdot p\_a(\lambda). \tag{2.17}$$

The special case ρ = ρψ, for which we write *E*ρψ ≡ *E*ψ, gives the usual formula

$$E\_{\psi}(a) = \text{Tr}\,(\rho\_{\psi}a) = \langle \psi, a\psi \rangle. \tag{2.18}$$
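The formulas (2.8) and (2.15) - (2.17) can be checked numerically in the smallest nontrivial case. The following is a minimal sketch, not part of the original text: the density operator diag(2/3, 1/3), the observable σ*<sub>x</sub>* and the helper names `matmul` and `trace` are illustrative assumptions, and the spectral projections of σ*<sub>x</sub>* are entered by hand rather than computed.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Trace of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


# Illustrative choices (assumptions): rho = diag(2/3, 1/3) and a = sigma_x.
rho = [[2 / 3, 0.0], [0.0, 1 / 3]]
sigma_x = [[0.0, 1.0], [1.0, 0.0]]

# Spectral projections e_lambda of sigma_x for lambda = +1 and lambda = -1.
spectral_projections = {
    1: [[0.5, 0.5], [0.5, 0.5]],
    -1: [[0.5, -0.5], [-0.5, 0.5]],
}

# Born rule (2.8): p_a(lambda) = Tr(rho e_lambda).
born = {lam: trace(matmul(rho, e)) for lam, e in spectral_projections.items()}

# Expectation value via (2.15) and via (2.17); the two should agree.
expectation_trace = trace(matmul(rho, sigma_x))
expectation_sum = sum(lam * p for lam, p in born.items())

# Variance (2.16): E(a^2) - E(a)^2.
variance = trace(matmul(rho, matmul(sigma_x, sigma_x))) - expectation_trace ** 2

print(born, expectation_trace, expectation_sum, variance)
```

In this example both outcomes are equally likely, so the expectation value vanishes while the variance does not.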

As in the classical case one always has Δρ (*a*) ≥ 0, but a major contrast between classical and quantum mechanics lies in the following result, cf. Proposition 1.3.

Proposition 2.3. *For each density operator* ρ *there exists a self-adjoint operator b such that* Δ<sub>ρ</sub>(*b*) > 0*. On the other hand, if a*∗ = *a, then* Δ<sub>ρ</sub>(*a*) = 0 *iff the image of* ρ *lies in some fixed eigenspace of a, i.e., in terms of the spectral decomposition* (2.6) *we have a*υ*<sub>i</sub>* = λυ*<sub>i</sub> where* λ *is independent of i.*

*Proof.* We first prove the first claim for *H* = C<sup>2</sup>. By an appropriate choice of basis, we may assume that ρ is diagonal, i.e., ρ = diag(*p*<sub>1</sub>, *p*<sub>2</sub>), with *p*<sub>1</sub>, *p*<sub>2</sub> ∈ [0,1] and *p*<sub>1</sub> + *p*<sub>2</sub> = 1. Now take *b* = σ*<sub>x</sub>* (i.e., the first Pauli matrix), so that Tr(ρ*b*) = 0 and Tr(ρ*b*<sup>2</sup>) = 1. Hence Δ<sub>ρ</sub>(*b*) = 1. Secondly, for general *H* ≅ C<sup>*n*</sup>, diagonalize ρ and order the eigenvectors such that the above 2×2 case forms the upper left block, with at least one of the eigenvalues *p*<sub>1</sub>, *p*<sub>2</sub> strictly positive. Take *b* to be σ*<sub>x</sub>* in the upper left corner, and zero elsewhere. Then Tr(ρ*b*) = 0 and Tr(ρ*b*<sup>2</sup>) = *p*<sub>1</sub> + *p*<sub>2</sub>, which once again yields Δ<sub>ρ</sub>(*b*) = *p*<sub>1</sub> + *p*<sub>2</sub> > 0.

For the second claim we use (2.6), and write ρ*<sub>i</sub>* ≡ ρ<sub>υ*<sub>i</sub>*</sub>. We note the inequality

$$\Delta\_{\rho}(a) \ge \sum\_{i} p\_i \Delta\_{\rho\_i}(a),\tag{2.19}$$

with equality iff ρ*<sub>i</sub>*(*a*) = ρ*<sub>j</sub>*(*a*) for all *i*, *j*; this follows from convexity of the function *x* ↦ *x*<sup>2</sup>. We now show that for any unit vector ψ we have Δ<sub>ρ<sub>ψ</sub></sub>(*a*) = 0 iff *a*ψ = λψ. Assuming the latter gives *E*<sub>ψ</sub>(*a*) = ⟨ψ, *a*ψ⟩ = λ and likewise *E*<sub>ψ</sub>(*a*<sup>2</sup>) = λ<sup>2</sup>, hence Δ<sub>ρ<sub>ψ</sub></sub>(*a*) = 0. In the opposite direction, using *a*∗ = *a*, elementary manipulations yield

$$\Delta\_{\rho\_{\psi}}(a) = \left\|(a - \langle \psi, a\psi \rangle)\psi\right\|^2. \tag{2.20}$$

This clearly vanishes iff *a*ψ = ⟨ψ, *a*ψ⟩ψ, so *a*ψ = λψ, with λ = ⟨ψ, *a*ψ⟩.
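The elementary manipulations behind (2.20) may be spelled out as follows. This is a sketch in which the abbreviation *m* = ⟨ψ, *a*ψ⟩ is introduced here for convenience; it is real because *a*∗ = *a*, and the normalization ‖ψ‖ = 1 is used when expanding the square:

$$\begin{aligned} \left\|(a - m)\psi\right\|^2 &= \langle (a - m)\psi, (a - m)\psi \rangle = \langle \psi, (a - m)^2 \psi \rangle \\ &= \langle \psi, a^2\psi \rangle - 2m \langle \psi, a\psi \rangle + m^2 \\ &= \langle \psi, a^2\psi \rangle - \langle \psi, a\psi \rangle^2 = E\_{\psi}(a^2) - E\_{\psi}(a)^2 = \Delta\_{\rho\_{\psi}}(a). \end{aligned}$$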

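Putting ψ = υ*<sub>i</sub>* gives Δ<sub>ρ*<sub>i</sub>*</sub>(*a*) = 0 iff *a*υ*<sub>i</sub>* = λ*<sub>i</sub>*υ*<sub>i</sub>*, and then Δ<sub>∑*<sub>i</sub> p<sub>i</sub>*ρ*<sub>i</sub>*</sub>(*a*) = 0 iff in addition ρ*<sub>i</sub>*(*a*) = ρ*<sub>j</sub>*(*a*) for all *i*, *j*. Since ρ*<sub>i</sub>*(*a*) = ⟨υ*<sub>i</sub>*, *a*υ*<sub>i</sub>*⟩ = λ*<sub>i</sub>*, we obtain λ*<sub>i</sub>* = λ*<sub>j</sub>*. □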

As first recognized by von Neumann, Theorem 2.2 may be generalized to a family of self-adjoint operators *as long as they commute*. Thus we obtain the following counterpart of (1.12) - (1.13): a collection *a*1,...,*an* of *n commuting* self-adjoint operators and a (single) density operator ρ on *H* jointly define a probability distribution *pa*1,...,*an* on the product σ(*a*1)×···×σ(*an*) of the individual spectra by

$$p\_{a\_1,\ldots,a\_n}(\lambda\_1,\ldots,\lambda\_n) = \operatorname{Tr}\left(\rho e^{(1)}\_{\lambda\_1}\cdots e^{(n)}\_{\lambda\_n}\right).\tag{2.21}$$

The proof of positivity of these numbers requires the spectral projections *e*<sup>(*i*)</sup><sub>λ*<sub>i</sub>*</sub> to commute, which they do provided the *a<sub>i</sub>* commute (if the *a<sub>i</sub>* fail to commute, positivity of (2.21) is not guaranteed, although the numbers do still sum up to unity; the possibility of defining joint probabilities is strictly limited to commuting random variables).
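The failure of positivity for non-commuting operators is easy to exhibit on C<sup>2</sup>. The sketch below is an illustration under assumptions of my own choosing (the unit vector ψ = (1, −2)/√5 and the two projections `e` and `f` do not come from the text): it evaluates Tr(ρ*e<sub>i</sub>f<sub>j</sub>*) for all four combinations of a projection and its complement.

```python
import math


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Trace of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def projection(v):
    """One-dimensional projection |v><v| onto a real unit vector v."""
    return [[v[i] * v[j] for j in range(len(v))] for i in range(len(v))]


def complement(e):
    """Complementary projection 1 - e."""
    n = len(e)
    return [[(1.0 if i == j else 0.0) - e[i][j] for j in range(n)]
            for i in range(n)]


# Pure state rho = |psi><psi| (illustrative choice of psi).
psi = [1 / math.sqrt(5), -2 / math.sqrt(5)]
rho = projection(psi)

# Two projections that do not commute (illustrative choices).
e = projection([1.0, 0.0])
f = projection([1 / math.sqrt(2), 1 / math.sqrt(2)])

# Candidate "joint probabilities" Tr(rho e_i f_j) in the spirit of (2.21);
# label 1 stands for the projection itself, 0 for its complement.
candidates = {
    (1, 1): trace(matmul(rho, matmul(e, f))),
    (1, 0): trace(matmul(rho, matmul(e, complement(f)))),
    (0, 1): trace(matmul(rho, matmul(complement(e), f))),
    (0, 0): trace(matmul(rho, matmul(complement(e), complement(f)))),
}

print(candidates, sum(candidates.values()))
```

The four numbers add up to one, but the entry labelled (1, 1) is negative, so they cannot be read as a joint probability distribution.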

#### 2.2 Quantum observables and states

Given a finite-dimensional Hilbert space *H*, the set *B*(*H*) of all linear operators on *H* (which for *H* = C<sup>*n*</sup> may be identified with the set *M<sub>n</sub>*(C) of complex *n*×*n* matrices) forms an involutive algebra under the natural (pointwise) operations

$$(\lambda \cdot a)\psi = \lambda (a\psi);\tag{2.22}$$

$$(a+b)\Psi = a\Psi + b\Psi;\tag{2.23}$$

$$(ab)\psi = a(b\psi),\tag{2.24}$$

and finally with *a*∗ given by the usual operator adjoint (A.15). Compare the corresponding classical expressions (1.18) - (1.20) and (1.22). Analogous to (1.24), we also have a norm on *B*(*H*), defined by (A.18). It follows that *like* its classical counterpart *C*(*X*), the involutive algebra *B*(*H*) (or, in this case, *Mn*(C)) is a C\*-algebra, cf. Definition C.1 in Appendix C. It crucially *differs* from *C*(*X*) in that *B*(*H*) is *non-commutative*. For this reason, the Gelfand spectrum, which in the classical case allowed us to reconstruct *X* from *C*(*X*), turns out to be empty, cf. Proposition 2.10 below. Nonetheless, it makes good sense to copy Definition 1.14, *mutatis mutandis*:

Definition 2.4. *A* state *on B*(*H*) *is a complex-linear map* ω : *B*(*H*) → C *satisfying:*

*1.* ω(*a*∗*a*) ≥ 0 *for each a* ∈ *B*(*H*) *(*positivity*);*

*2.* ω(1*<sub>H</sub>*) = 1 *(*normalization*).*

*The* state space *S*(*B*(*H*)) *is the set of all states* ω : *B*(*H*) → C*.*

Physicists may not like this definition, since it involves non-observable quantities. As in the classical case, we may introduce the self-adjoint (or 'real') part of *B*(*H*):

$$B(H)\_{\text{sa}} = \{ a \in B(H) \mid a^\* = a \},\tag{2.25}$$

which is a real vector space (though not a real algebra in the usual sense, cf. §C.25).

Definition 2.5. *A* state *on B*(*H*)sa *is a real-linear map* ω : *B*(*H*)sa → R *satisfying:*

*1.* ω(*a*<sup>2</sup>) ≥ 0 *for each a* ∈ *B*(*H*) *with a*∗ = *a (*positivity*);*

*2.* ω(1) = 1 *(*normalization*).*

*The* state space *S*(*B*(*H*)sa) *is the set of all states* ω : *B*(*H*)sa → R*.*

Fortunately, there is no need for a fight over this point; the discussion is similar to the one below Definition 1.14 and is settled as follows.

Proposition 2.6. *The state spaces S*(*B*(*H*)) *and S*(*B*(*H*)<sub>sa</sub>) *may be identified: an element* ω *of the former defines an element* ω<sub>R</sub> *of the latter by restriction, whilst the unique decomposition c* = *a* + *ib (where a*∗ = *a and b*∗ = *b are given by a* = ½(*c* + *c*∗) *and b* = −½*i*(*c* − *c*∗)*, respectively) gives* ω(*c*) = ω<sub>R</sub>(*a*) + *i*ω<sub>R</sub>(*b*)*. Moreover,*

$$\|\omega\| = \|\omega\_{\mathbb{R}}\| = 1.\tag{2.26}$$

Here the norm on the dual (Banach) space *B*(*H*)<sup>∗</sup><sub>sa</sub> of *B*(*H*)<sub>sa</sub> is given by

$$\|\omega\| = \sup\{ |\omega(a)|, a \in B(H)\_{\mathrm{sa}}, \|a\| = 1 \}. \tag{2.27}$$

This lemma holds for any Hilbert space *H* (cf. Theorem C.52), but it is instructive to restrict our proof to the finite-dimensional setting in which we currently work.

*Proof.* The first few claims are immediate from Proposition A.22. To prove (2.26), it suffices to prove that for any *a* ∈ *B*(*H*) one has

$$|\omega(a)| \le \|a\|,\tag{2.28}$$

since by normalization of states the bound is saturated by *a* = 1*<sub>H</sub>*. Furthermore, even if ω is seen as an element of *B*(*H*)<sup>∗</sup> rather than *B*(*H*)<sup>∗</sup><sub>sa</sub>, eq. (2.28) needs to be shown only for self-adjoint *a*, for positivity of ω implies the Cauchy–Schwarz inequality

$$|\omega(a^\*b)|^2 \le \omega(a^\*a)\omega(b^\*b),\tag{2.29}$$

cf. (A.1), in which we may take *a* = 1*<sub>H</sub>* to find, assuming (2.28) for self-adjoint *a*,

$$|\omega(b)|^{2} \leq \omega(b^{\*}b) \leq \left\|b^{\*}b\right\| = \left\|b\right\|^{2},\tag{2.30}$$

where the last equality holds for any *b* ∈ *B*(*H*) (turning the latter into a C\*-algebra). Noting that *b*∗*b* is self-adjoint, this gives (2.28) for any *a*. To prove (2.28) for *a*∗ = *a*, then, we firstly use (A.47), and secondly use Theorem 2.7 and eq. (2.6) to obtain

$$|\omega(a)| = |\mathrm{Tr}\,(\rho a)| = \Big|\sum\_{i} p\_{i} \langle \upsilon\_{i}, a\upsilon\_{i} \rangle\Big| \le \sum\_{i} p\_{i} |\langle \upsilon\_{i}, a\upsilon\_{i} \rangle|. \tag{2.31}$$

Now let (ξ*<sub>j</sub>*) be a basis of *H* consisting of eigenvectors of *a*, say *a*ξ*<sub>j</sub>* = λ*<sub>j</sub>*ξ*<sub>j</sub>*, so that

$$\langle \upsilon\_{i}, a\upsilon\_{i} \rangle = \sum\_{j} |\langle \upsilon\_{i}, \xi\_{j} \rangle|^{2} \lambda\_{j}, \qquad \sum\_{j} |\langle \upsilon\_{i}, \xi\_{j} \rangle|^{2} = 1.$$

Since |λ*<sub>j</sub>*| ≤ ‖*a*‖ and ∑*<sub>i</sub> p<sub>i</sub>* = 1, the bound (2.28) follows from the estimate

$$\sum\_{i} p\_i |\langle \upsilon\_i, a\upsilon\_i \rangle| \le \sum\_{i} p\_i \sum\_{j} |\langle \upsilon\_i, \xi\_j \rangle|^2 |\lambda\_j| \le \sum\_{i} p\_i \sum\_{j} |\langle \upsilon\_i, \xi\_j \rangle|^2 \|a\| = \|a\|. \tag{2.32}$$

Finally, combining (2.31) and (2.32) gives (2.28) for self-adjoint *a*. □

In view of this, we may work with either *S*(*B*(*H*)sa) or *S*(*B*(*H*)); denoting states simply by ω, the context will usually show if it is defined on *B*(*H*)sa or on *B*(*H*).

Despite its easy proof, the following result is of fundamental importance.

Theorem 2.7. *If H is finite-dimensional, there is a bijective correspondence between states* ω *on B*(*H*) *or B*(*H*)sa *and density operators* ρ *on H, given by*

$$\omega(a) = \text{Tr}\,(\rho a). \tag{2.33}$$

*Proof.* First note that linear algebra already yields (2.33) as a bijective correspondence between complex-linear maps ω and operators ρ, for example, because

$$
\langle a, b \rangle = \text{Tr} \left( a^\* b \right) \tag{2.34}
$$

defines an inner product on *B*(*H*). Positivity and normalization of ω then translate to the corresponding properties of ρ. □
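The bijection (2.33) can be made concrete with matrix units: evaluating ω on the matrix *E<sub>ji</sub>* (a single 1 in row *j*, column *i*) returns the entry ρ*<sub>ij</sub>*. The sketch below illustrates this under assumptions of my own: the function names `state_from_density`, `density_from_state` and `matrix_unit` are hypothetical helpers, and the sample density matrix is an arbitrary choice.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Trace of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def matrix_unit(n, row, col):
    """The n x n matrix with a single 1 in the given row and column."""
    return [[1.0 if (r, c) == (row, col) else 0.0 for c in range(n)]
            for r in range(n)]


def state_from_density(rho):
    """The functional omega(a) = Tr(rho a) of (2.33)."""
    def omega(a):
        return trace(matmul(rho, a))
    return omega


def density_from_state(omega, n):
    """Recover rho from omega via rho_ij = omega(E_ji)."""
    return [[omega(matrix_unit(n, j, i)) for j in range(n)] for i in range(n)]


# Sample density matrix (illustrative): self-adjoint, positive, trace one.
rho = [[0.5, 0.25 - 0.25j], [0.25 + 0.25j, 0.5]]
omega = state_from_density(rho)
recovered = density_from_state(omega, 2)

identity = [[1.0, 0.0], [0.0, 1.0]]
print(recovered, omega(identity))
```

Normalization of ω corresponds to Tr(ρ) = 1, which is what the last value printed checks.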

The quantum analogue of Theorem 1.15, then, is as follows.

Theorem 2.8. *The state space S*(*B*(*H*)<sub>sa</sub>) = *S*(*B*(*H*)) *forms a compact convex set in the (real) vector space B*(*H*)<sup>∗</sup><sub>sa</sub> *(in its w*∗*-topology) and, putting the corresponding topology on* D(*H*)*, eq.* (2.33) *defines an affine homeomorphism*

$$S(B(H)) \cong \mathcal{D}(H). \tag{2.35}$$

*Proof.* Convexity of *S*(*B*(*H*)) holds by Definition 2.4. For compactness, by Proposition 2.6 the state space *S*(*B*(*H*)) is contained in the closed unit ball *B*<sub>1</sub> of *B*(*H*)<sup>∗</sup><sub>sa</sub>, which is compact in the *w*∗-topology (in the case at hand this is simply because *B*(*H*)<sup>∗</sup><sub>sa</sub> is finite-dimensional). It is easy to see that a convergent sequence of states actually converges to a state, since both conditions in Definition 2.4 are clearly preserved by *w*∗-limits (in which ω*<sub>n</sub>* → ω iff ω*<sub>n</sub>*(*a*) → ω(*a*) for each *a* ∈ *B*(*H*)). □

For infinite-dimensional Hilbert spaces eq. (2.35) is false; see §4.2. At the opposite end, the case *H* = C<sup>2</sup> provides a beautiful illustration of this theorem (and more).

Proposition 2.9. *The state space S*(*M*<sub>2</sub>(C)) *of the* 2×2 *matrices is isomorphic (as a compact convex set) to the closed unit ball B*<sup>3</sup> = {(*x*, *y*, *z*) ∈ R<sup>3</sup> | *x*<sup>2</sup> + *y*<sup>2</sup> + *z*<sup>2</sup> ≤ 1}*. Under this isomorphism, the extreme boundary (cf. Definition 1.10)*

$$\partial\_{e} B^3 = S^2 = \{ (x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1 \} \tag{2.36}$$

*corresponds to the set of all density matrices* ρ = ρ<sub>ψ</sub>*, where* ψ ∈ C<sup>2</sup> *with* ‖ψ‖ = 1*.*

*Proof.* Any self-adjoint 2×2 matrix may be parametrized by (*t*, *x*, *y*, *z*) ∈ R<sup>4</sup> as

$$\rho(t, x, y, z) = \frac{1}{2} \begin{pmatrix} t+z & x-iy \\ x+iy & t-z \end{pmatrix} . \tag{2.37}$$

The eigenvalues λ<sub>±</sub> of ρ(*t*, *x*, *y*, *z*), computed from its characteristic polynomial, are

$$\lambda\_{\pm} = \frac{1}{2} (t \pm \sqrt{x^2 + y^2 + z^2}). \tag{2.38}$$

Condition (2.5) yields *t* = 1. Positivity of ρ(1, *x*, *y*, *z*) is equivalent to positivity of its eigenvalues λ<sub>±</sub>, which gives *x*<sup>2</sup> + *y*<sup>2</sup> + *z*<sup>2</sup> ≤ 1. For the second claim, note that the ρ<sub>ψ</sub> are just the one-dimensional projections, which in turn are the density matrices satisfying ρ<sup>2</sup> = ρ (or require λ<sub>+</sub> = 1, λ<sub>−</sub> = 0), so *x*<sup>2</sup> + *y*<sup>2</sup> + *z*<sup>2</sup> = 1. Finally, since convex sums *t***v** + (1 − *t*)**w** in *B*<sup>3</sup> (0 ≤ *t* ≤ 1) are given by straight line segments connecting **w** and **v** in R<sup>3</sup>, it immediately follows geometrically that ∂*<sub>e</sub>B*<sup>3</sup> = *S*<sup>2</sup>. □
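The parametrization (2.37) with *t* = 1 and the eigenvalue formula (2.38) translate directly into code. The following is a small sketch; the function names (`density_from_bloch`, `bloch_from_density`, `eigenvalues`, `is_state`, `is_pure`) and the tolerance are assumptions introduced for illustration.

```python
import math


def density_from_bloch(x, y, z):
    """Density matrix rho(1, x, y, z) of (2.37)."""
    return [[(1 + z) / 2, complex(x, -y) / 2],
            [complex(x, y) / 2, (1 - z) / 2]]


def bloch_from_density(rho):
    """Inverse map: read off (x, y, z) from a 2x2 density matrix."""
    x = 2 * rho[1][0].real
    y = 2 * rho[1][0].imag
    z = (rho[0][0] - rho[1][1]).real
    return (x, y, z)


def eigenvalues(x, y, z):
    """Eigenvalues (2.38) for t = 1, returned as (lambda_plus, lambda_minus)."""
    r = math.sqrt(x * x + y * y + z * z)
    return ((1 + r) / 2, (1 - r) / 2)


def is_state(x, y, z):
    """Positivity of rho(1, x, y, z): the point lies in the closed unit ball."""
    return x * x + y * y + z * z <= 1


def is_pure(x, y, z, tol=1e-12):
    """Purity: the point lies on the unit sphere (up to a tolerance)."""
    return abs(x * x + y * y + z * z - 1) <= tol


# The mixed state diag(2/3, 1/3) sits on the z-axis inside the ball.
point = bloch_from_density([[2 / 3, 0.0], [0.0, 1 / 3]])
spectrum = eigenvalues(*point)
print(point, spectrum, is_state(*point), is_pure(*point))
```

The north pole (0, 0, 1) corresponds to the projection onto the first basis vector, for which `is_pure` returns `True`.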

#### 2.3 Pure states in quantum mechanics

In classical physics, the phase space *X* arose both as the Gelfand spectrum Σ(*C*(*X*)) of the C\*-algebra of observables *C*(*X*), cf. Definition 1.4 and Proposition 1.5, and as the pure state space *P*(*C*(*X*)) of *C*(*X*), see Definition 1.10 and Theorem 1.16. In particular, Σ(*C*(*X*)) ≅ *P*(*C*(*X*)) at least as sets. Because of this, any pure state ω ∈ *P*(*C*(*X*)) is dispersion-free, since as an element of Σ(*C*(*X*)) it satisfies ω(*f*<sup>2</sup>) = ω(*f*)<sup>2</sup> for any *f* ∈ *C*(*X*). These two definitionally different (but classically coinciding) guises of *X* will fall apart in quantum mechanics; cf. Proposition 2.3.

Proposition 2.10. *If* dim(*H*) > 1*, the Gelfand spectrum* Σ(*B*(*H*)) *of B*(*H*) *is empty, i.e., there are no nonzero linear maps* ω : *B*(*H*) → C *that satisfy* ω(*ab*) = ω(*a*)ω(*b*)*.*

*In particular, there are no nonzero linear maps* ω : *B*(*H*) → C *that are* dispersion-free*, i.e., satisfy* Δ<sub>ω</sub>(*a*) = 0*, with* Δ<sub>ω</sub>(*a*) = ω(*a*<sup>2</sup>) − ω(*a*)<sup>2</sup>*.*

*Proof.* Suppose ω ∈ Σ(*B*(*H*)). Multiplicativity for *b* = *a* = *a*∗ implies that ω is positive, whereas for *b* = 1*<sub>H</sub>* it implies that ω is normalized. Hence ω must be a state. Now use Theorem 2.7 and use multiplicativity for *b* = *a* = *a*∗, implying that Δ<sub>ρ</sub>(*a*) = 0. This contradicts Proposition 2.3. □

On the other hand, the pure state space of *B*(*H*) is by no means empty, and despite Proposition 2.10, we will see that the special density operators ρψ ≡ *e*<sup>ψ</sup> in (2.7) to some extent do play the role of the points *x* ∈ *X*. Let us write

$$\mathcal{P}\_{1}(H) = \{ e \in B(H) \mid e^{2} = e^{\*} = e, \ \text{Tr}\,(e) = 1 \}\tag{2.39}$$

for the set of all one-dimensional projections on *H*; note that Tr(*e*) = dim(*eH*) for *e* ∈ P(*H*). Each *e* ∈ P1(*H*) takes the form *e* = *e*<sup>ψ</sup> for some unit vector ψ, see (2.7).

Lemma 2.11. *A density operator* ρ *is an extreme point of the convex set* D(*H*) *of all density operators on H iff* ρ = ρψ *for some unit vector* ψ ∈ *H.*

*Proof.* The argument is similar to the proof of Proposition 1.11. To show that ρ<sub>ψ</sub> ∈ ∂*<sub>e</sub>S*(*B*(*H*)), assume ρ<sub>ψ</sub> = *t*ρ<sub>1</sub> + (1 − *t*)ρ<sub>2</sub> for some *t* ∈ (0,1) and ρ<sub>1</sub>, ρ<sub>2</sub> ∈ *S*(*B*(*H*)). Evaluating this equality at *a* = |ϕ⟩⟨ϕ|, where ϕ ⊥ ψ, yields ⟨ϕ, ρ*<sub>i</sub>*ϕ⟩ = 0 for *i* = 1,2, so that ρ<sub>1</sub> = ρ<sub>2</sub> = ρ<sub>ψ</sub>. Conversely, the spectral decomposition (2.6) shows that ρ ∉ ∂*<sub>e</sub>S*(*B*(*H*)) whenever ρ is not of the form ρ<sub>ψ</sub> for some unit vector ψ ∈ *H*. □

Consequently, for the moment just as sets (and even as topological spaces), one has

$$P(\mathcal{D}(H)) = \mathcal{P}\_1(H);\tag{2.40}$$

$$P(B(H)) \cong \mathcal{P}\_1(H),\tag{2.41}$$

where the second isomorphism is given by (2.33). Defining a state ωψ by

$$\omega\_{\psi}(a) = \langle \psi, a\psi \rangle,\tag{2.42}$$

cf. (2.18), the isomorphism (2.41) is the correspondence ωψ ↔ *e*ψ, cf. (2.7).

This isomorphism becomes more interesting if we note that both spaces are naturally equipped with *transition probabilities*. For *P*(*B*(*H*)) we canonically have

$$\tau^{B(H)}(\omega\_{\psi}, \omega\_{\phi}) = \inf\{\omega\_{\psi}(a) \mid a \in B(H), \ 0 \le a \le 1\_H, \ \omega\_{\phi}(a) = 1\},\tag{2.43}$$

as in (1.38) for *A* = *B*(*H*). Furthermore, on P<sub>1</sub>(*H*) we define (with some foresight)

$$\tau^{\mathcal{P}\_1(H)}(e, f) = \text{Tr}\,(ef). \tag{2.44}$$

Theorem 2.12. *The pairs* (*P*(*B*(*H*)), τ<sup>*B*(*H*)</sup>) *and* (P<sub>1</sub>(*H*), τ<sup>P<sub>1</sub>(*H*)</sup>) *are isomorphic as sets with a transition probability. In particular, we have, cf.* (2.13)*,*

$$\tau^{B(H)}(\omega\_{\psi}, \omega\_{\phi}) = |\langle \psi, \phi \rangle|^2 = \text{Tr}\,(e\_{\psi}e\_{\phi}) = \tau^{\mathcal{P}\_1(H)}(e\_{\psi}, e\_{\phi}).\tag{2.45}$$

*Proof.* The last equality is a simple computation. The first follows if we can show that the infimum in (2.43) is reached at *a* = *e*<sub>ϕ</sub>. To this end, we prove that for any 0 ≤ *a* ≤ 1*<sub>H</sub>* with ω<sub>ϕ</sub>(*a*) = 1 we must have ⟨ψ, *a*ψ⟩ ≥ |⟨ϕ, ψ⟩|<sup>2</sup>. Indeed, the condition ω<sub>ϕ</sub>(*a*) = ⟨ϕ, *a*ϕ⟩ = 1 with ‖*a*‖ ≤ 1 (which follows from 0 ≤ *a* ≤ 1*<sub>H</sub>*) and ‖ϕ‖ = 1 imply, by Cauchy–Schwarz, that *a*ϕ = ϕ. Since *a*∗ = *a* (by positivity of *a*), we also have *a* : (C·ϕ)<sup>⊥</sup> → (C·ϕ)<sup>⊥</sup>, so we may write *a* = *e*<sub>ϕ</sub> + *a*′, with *a*′ϕ = 0 and *a*′ mapping (C·ϕ)<sup>⊥</sup> to itself. Then *a* ≥ 0 implies *a*′ ≥ 0. If ⟨ψ, *a*ψ⟩ < |⟨ϕ, ψ⟩|<sup>2</sup>, then ⟨ψ, *a*′ψ⟩ < 0, which contradicts positivity of *a*′ (and hence of *a*). □
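For completeness, the simple computation behind the last equality in (2.45) may be written out as follows. This is a sketch in which the trace is taken over an orthonormal basis (υ*<sub>i</sub>*) chosen, as an assumption made here, such that υ<sub>1</sub> = ψ, so that only the first term survives:

$$\text{Tr}\,(e\_{\psi}e\_{\phi}) = \sum\_{i} \langle \upsilon\_i, e\_{\psi}e\_{\phi}\upsilon\_i \rangle = \langle \psi, e\_{\phi}\psi \rangle = \langle \psi, \phi \rangle \langle \phi, \psi \rangle = |\langle \psi, \phi \rangle|^2.$$

The same expression shows that *a* = *e*<sub>ϕ</sub> is an admissible element in (2.43), since 0 ≤ *e*<sub>ϕ</sub> ≤ 1*<sub>H</sub>* and ω<sub>ϕ</sub>(*e*<sub>ϕ</sub>) = 1, with ω<sub>ψ</sub>(*e*<sub>ϕ</sub>) = |⟨ψ, ϕ⟩|<sup>2</sup>.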

The theory of observables and spectral resolutions of the kind (1.45) may be worked out completely for the "quantum" transition probabilities in this theorem:

Proposition 2.13. *1. There is a bijective correspondence between self-adjoint operators a* ∈ *B*(*H*) *and observables f on* (P<sub>1</sub>(*H*), τ<sup>P<sub>1</sub>(*H*)</sup>) *à la Definition 1.18.6:*

• *Given a self-adjoint operator a, define an observable fa at e*<sup>ψ</sup> ∈ P1(*H*) *by*

$$f\_a(e\_\Psi) = \text{Tr}(e\_\Psi a) = \langle \Psi, a\Psi \rangle;\tag{2.46}$$

• *Given an observable f* = ∑*<sub>i</sub> c<sub>i</sub>*τ<sup>P<sub>1</sub>(*H*)</sup><sub>*e<sub>i</sub>*</sub>*, define an operator a<sub>f</sub> by*

$$a\_f = \sum\_i c\_i e\_i. \tag{2.47}$$

*2. Each such observable f* = *f<sub>a</sub> has a unique spectral resolution as in* (1.45)*, i.e.,*

$$f\_a = \sum\_{\lambda \in \sigma(a)} \lambda \cdot \tau\_{S\_{\lambda}},\tag{2.48}$$

*where S*<sub>λ</sub> *is the (automatically orthoclosed) subset of* P<sub>1</sub>(*H*) *whose elements e satisfy eH* ⊆ *H*<sub>λ</sub>*, where H*<sub>λ</sub> ⊆ *H is the eigenspace for the eigenvalue* λ ∈ σ(*a*)*.*

*3. The product defined by* (1.46) *-* (1.47) *is equal to*

$$f\_a^2 = f\_{a^2};\tag{2.49}$$

$$f\_a \circ f\_b = f\_{(ab+ba)/2}.\tag{2.50}$$

*Proof.* Any spectral decomposition *a* = ∑*<sub>i</sub>* λ*<sub>i</sub>*|υ*<sub>i</sub>*⟩⟨υ*<sub>i</sub>*| puts *f<sub>a</sub>* as defined in (2.46) in the general form (1.44), with *c<sub>i</sub>* = λ*<sub>i</sub>* and *y<sub>i</sub>* = *e*<sub>υ*<sub>i</sub>*</sub>. The rest should be clear. □

We now turn to the quantum counterpart of Proposition 1.13. The main difference is that although extremal decompositions of mixed states into pure ones always exist, they are no longer unique. For example, for *H* = C2, we have

$$\rho \equiv \text{diag}(2/3, 1/3) = \tfrac{2}{3}\rho\_{u\_1} + \tfrac{1}{3}\rho\_{u\_2} = \tfrac{1}{2}(\rho\_{\xi\_1} + \rho\_{\xi\_2}),$$

where (*u*<sub>1</sub>, *u*<sub>2</sub>) is the standard basis of C<sup>2</sup>, and

$$\xi\_1 = (\sqrt{2/3}, \sqrt{1/3}), \ \xi\_2 = (\sqrt{2/3}, -\sqrt{1/3}).$$

More generally, take any basis (*w<sub>i</sub>*) of *H* ≅ C<sup>*n*</sup>, assume (2.6), and for each *i* for which √ρ*w<sub>i</sub>* ≠ 0 (where √ρ = ∑*<sub>i</sub>* √*p<sub>i</sub>* |υ*<sub>i</sub>*⟩⟨υ*<sub>i</sub>*|), define *t<sub>i</sub>* = ‖√ρ*w<sub>i</sub>*‖<sup>2</sup>, as well as the unit vector ξ*<sub>i</sub>* = √ρ*w<sub>i</sub>*/‖√ρ*w<sub>i</sub>*‖. Then ρ = ∑*<sub>i</sub> t<sub>i</sub>*ρ<sub>ξ*<sub>i</sub>*</sub> is an extremal decomposition of ρ. The above example corresponds to the special case *t*<sub>1</sub> = *t*<sub>2</sub> = 1/2, with

$$n = 2, \, p\_1 = 2/3, \, p\_2 = 1/3, \, w\_1 = (1/\sqrt{2}, 1/\sqrt{2}), \, w\_2 = (1/\sqrt{2}, -1/\sqrt{2}).$$

One might require the ξ*<sub>i</sub>* to be mutually orthogonal, but even that does not imply uniqueness of the extremal decomposition: take, for example, ρ = (1/*n*)·1*<sub>n</sub>*, where 1*<sub>n</sub>* is the *n*×*n* unit matrix on *H* = C<sup>*n*</sup>. Then any basis induces (2.6).
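The general recipe for extremal decompositions can be tried out on the example with ρ = diag(2/3, 1/3). The sketch below makes simplifying assumptions that are not in the text: ρ is taken to be diagonal (so that √ρ acts entrywise), all vectors are real, and the helper names `decompose` and `recombine` are hypothetical.

```python
import math


def decompose(p, basis):
    """Pairs (t_i, xi_i) with t_i = ||sqrt(rho) w_i||^2 and xi_i the normalized
    vector sqrt(rho) w_i, for rho = diag(p) and real basis vectors w_i."""
    pieces = []
    for w in basis:
        v = [math.sqrt(p_k) * w_k for p_k, w_k in zip(p, w)]  # sqrt(rho) w
        t = sum(c * c for c in v)                             # squared norm
        if t > 0:
            pieces.append((t, [c / math.sqrt(t) for c in v]))
    return pieces


def recombine(pieces, n):
    """The sum of t_i |xi_i><xi_i| as an n x n matrix."""
    rho = [[0.0] * n for _ in range(n)]
    for t, xi in pieces:
        for i in range(n):
            for j in range(n):
                rho[i][j] += t * xi[i] * xi[j]
    return rho


p = (2 / 3, 1 / 3)
s = 1 / math.sqrt(2)
basis = [[s, s], [s, -s]]  # the vectors w_1, w_2 of the example

pieces = decompose(p, basis)
rho_again = recombine(pieces, 2)
print(pieces, rho_again)
```

Feeding in the standard basis instead of (*w*<sub>1</sub>, *w*<sub>2</sub>) returns the spectral decomposition itself, which illustrates the non-uniqueness.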

Nonetheless, under appropriate assumptions uniqueness does follow.

Proposition 2.14. *1. Any density operator* ρ *on H has an extremal decomposition*

$$\rho = \sum\_{i=1}^{m} p\_i \rho\_{\psi\_i},\tag{2.51}$$

*where m* ≤ dim(*H*)*, the pi are probabilities, and the* ψ*<sup>i</sup> are distinct unit vectors.*

*2. This decomposition can be chosen such that the* ψ*<sup>i</sup> are mutually orthogonal, in which case it is unique iff each of the non-zero eigenvalues of* ρ *is simple.*

*Proof.* The existence of the *extremal* decomposition (2.51) of ρ follows from its *spectral* decomposition (2.6), which also proves claim 2. If ρ has some degenerate non-zero eigenvalue, the example just given yields non-uniqueness of (2.51). For the converse direction, use uniqueness of the decomposition (2.6) under the condition that each of the non-zero eigenvalues of ρ is simple. □

In the light of Theorem 2.7, it would be interesting to reformulate Proposition 2.14 directly in terms of the states on *B*(*H*); note our standing assumption dim(*H*) < ∞!

Proposition 2.15. *1. Any state* ω *on B*(*H*) *has an extremal decomposition*

$$\omega = \sum\_{i=1}^{m} p\_i \omega\_i,\tag{2.52}$$

*into distinct pure states* ω*<sup>i</sup>* ∈ *P*(*B*(*H*))*, where m* ≤ dim(*H*)*, pi* > 0*, and* ∑*<sup>i</sup> pi* = 1*.*


$$\|\omega\_i - \omega\_j\| = 2 \ (i \neq j). \tag{2.53}$$

*3. Extremal decompositions* (2.52) *satisfying* (2.53) *exist and correspond bijectively to orthogonal families* (*e<sub>i</sub>*) *of one-dimensional projections on H (i.e., e<sub>i</sub>e<sub>j</sub>* = δ*<sub>ij</sub>e<sub>i</sub> and* Tr(*e<sub>i</sub>*) = 1*, respectively) for which* ω(*e<sub>i</sub>*) > 0*,* ∑*<sub>i</sub>*ω(*e<sub>i</sub>*) = 1*, and*

$$\omega(ae\_i) = \omega(e\_i a), \ a \in B(H). \tag{2.54}$$

*In terms of such a family, the decomposition* (2.52) *is given by*

$$p\_i = \omega(e\_i);\tag{2.55}$$

$$\omega\_i(a) = \frac{\omega(ae\_i)}{\omega(e\_i)}.\tag{2.56}$$

*Hence an extremal decomposition* (2.52) *with all* ω*<sup>i</sup> mutually orthogonal in the sense of* (2.53) *is unique iff the family* (*ei*) *with the above properties is.*

*Proof.* Claim 1 clearly follows from no. 3. To prove (2.53), assume (2.42), so that

$$\|\omega\_i - \omega\_j\| = \sup\{ |\langle \psi\_i, a\psi\_i \rangle - \langle \psi\_j, a\psi\_j \rangle|, a \in B(H), \|a\| = 1 \}.\tag{2.57}$$

Clearly, |⟨ψ, *a*ψ⟩| ≤ 1 when ‖*a*‖ = ‖ψ‖ = 1, hence |⟨ψ*<sub>i</sub>*, *a*ψ*<sub>i</sub>*⟩ − ⟨ψ*<sub>j</sub>*, *a*ψ*<sub>j</sub>*⟩| ≤ 2, and the upper bound ‖ω*<sub>i</sub>* − ω*<sub>j</sub>*‖ = 2 in (2.57) is reached iff |⟨ψ<sub>1</sub>, *a*ψ<sub>1</sub>⟩| = 1 and ⟨ψ<sub>2</sub>, *a*ψ<sub>2</sub>⟩ = −⟨ψ<sub>1</sub>, *a*ψ<sub>1</sub>⟩. By Cauchy–Schwarz, this holds iff *a*ψ<sub>1</sub> = λψ<sub>1</sub> as well as *a*ψ<sub>2</sub> = −λψ<sub>2</sub> for some λ ∈ T. If ψ*<sub>i</sub>* ⊥ ψ*<sub>j</sub>*, then this is accomplished by the operator *a* = |ψ*<sub>i</sub>*⟩⟨ψ*<sub>i</sub>*| − |ψ*<sub>j</sub>*⟩⟨ψ*<sub>j</sub>*|; note that σ(*a*) = {−1,1} for dim(*H*) = 2 and σ(*a*) = {−1,0,1} for dim(*H*) > 2, so indeed ‖*a*‖ = 1 by (A.47). If, on the other hand, ⟨ψ*<sub>i</sub>*, ψ*<sub>j</sub>*⟩ ≠ 0, then no *a* with ‖*a*‖ = 1 can meet these eigenvalue equations. One way to see this is to reduce to *H* = C<sup>2</sup>, since *a* in (2.57) can be replaced by *eae*, where *e* is the projection onto the linear span of ψ*<sub>i</sub>* and ψ*<sub>j</sub>*. Picking a basis of C<sup>2</sup> (with say υ<sub>1</sub> = ψ<sub>1</sub>), the two eigenvalue equations for *a* yield a matrix representation of *a*, from which ‖*a*‖<sup>2</sup> = ‖*a*∗*a*‖ may be computed by calculating the eigenvalues of *a*∗*a* and using (A.47). This gives ‖*a*‖ > 1 unless ⟨ψ*<sub>i</sub>*, ψ*<sub>j</sub>*⟩ = 0.
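For real unit vectors in C<sup>2</sup> the norm in (2.57) can also be evaluated in closed form, which gives an independent check of the orthogonality criterion. The following worked example rests on assumptions made here for illustration: ψ*<sub>i</sub>* = (1, 0), ψ*<sub>j</sub>* = (cos θ, sin θ), and *d* denotes the difference ρ<sub>ψ*<sub>i</sub>*</sub> − ρ<sub>ψ*<sub>j</sub>*</sub>, so that ω*<sub>i</sub>*(*a*) − ω*<sub>j</sub>*(*a*) = Tr(*da*).

$$d = \begin{pmatrix} \sin^2\theta & -\sin\theta\cos\theta \\ -\sin\theta\cos\theta & -\sin^2\theta \end{pmatrix}, \qquad \text{Tr}\,(d) = 0, \quad \det(d) = -\sin^2\theta, \qquad \sigma(d) = \{\pm|\sin\theta|\}.$$

Estimating |Tr(*da*)| in an eigenbasis of *d*, in the same way as in (2.31) - (2.32), bounds it by the sum of the absolute values of the eigenvalues of *d* whenever ‖*a*‖ = 1, and for sin θ ≠ 0 this bound is attained at the difference of the two spectral projections of *d*. Hence

$$\|\omega\_i - \omega\_j\| = 2|\sin\theta| = 2\sqrt{1 - |\langle \psi\_i, \psi\_j \rangle|^2},$$

which equals 2 iff ⟨ψ*<sub>i</sub>*, ψ*<sub>j</sub>*⟩ = cos θ = 0.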

One direction of the proof of the third claim easily follows from Theorem 2.7: any spectral decomposition (2.6) of ρ provides the projections

$$e\_i = |\upsilon\_i\rangle\langle\upsilon\_i|\tag{2.58}$$

of the proposition. For example, eq. (2.54) comes down to [ρ, *ei*] = 0, which is the case iff *ei* commutes with all spectral projections of ρ, which clearly holds for (2.58). Uniqueness of the *ei* then corresponds to uniqueness of (2.6) and hence to non-degeneracy of the non-zero eigenvalues *pi* of ρ, as in Proposition 2.14.

The opposite direction, i.e., proving that (2.58) exhausts all possibilities for (2.53) - (2.54), is based on the GNS-construction and requires an entire subsection.

#### 2.4 The GNS-construction for matrices

The proof of Proposition 2.15 may be completed on the basis of the GNS-construction begun in §1.5, which in this subsection we develop for *A* = *B*(*H*), where, as usual, dim(*H*) < ∞. In that case, we may use Theorem 2.7 to simplify matters.

First, to prove (1.76) we use (2.33) and cyclicity of the trace, compute the trace by summing over a basis (υ*i*) of eigenvectors of *a*∗*a*, say *a*∗*a*υ*<sup>i</sup>* = μ*i*υ*i*, where μ*<sup>i</sup>* ≥ 0 by positivity of *a*∗*a*, and use (A.47) (for *a*∗*a* rather than *a*) to obtain:

$$\begin{aligned} \omega(b^\*a^\*ab) &= \mathrm{Tr}\left(\rho b^\*a^\*ab\right) = \sum\_i \langle \upsilon\_i, b\rho b^\*a^\*a\upsilon\_i \rangle = \sum\_i \mu\_i \langle \upsilon\_i, b\rho b^\*\upsilon\_i \rangle \\ &\le \|a^\*a\| \sum\_i \langle \upsilon\_i, b\rho b^\*\upsilon\_i \rangle = \|a\|^2 \mathrm{Tr}\left(\rho b^\*b\right) = \|a\|^2 \omega(b^\*b), \end{aligned}$$

where we used ⟨υ*<sub>i</sub>*, *b*ρ*b*∗υ*<sub>i</sub>*⟩ = ⟨*b*∗υ*<sub>i</sub>*, ρ*b*∗υ*<sub>i</sub>*⟩ ≥ 0 to justify the inequality.

We now explain all cases of interest, paying special attention to the *commutant*

$$\pi\_{\omega}(A)' = \{ B \in B(H\_{\omega}) \mid \pi\_{\omega}(a)B = B\pi\_{\omega}(a) \,\forall a \in A \};\tag{2.59}$$

to distinguish operators on *H* from operators on *H*ω, we write the latter in capitals. For simplicity we also put *H* = C*<sup>n</sup>* (with the standard inner product), so that

$$B(H) = M\_n(\mathbb{C}),\tag{2.60}$$

and all operators are matrices. Performing a suitable unitary transformation or change of basis if necessary, we also assume that the unit vectors υ*<sup>i</sup>* in the spectral decomposition (2.6) of ρ form (all or part of) the standard basis (υ1,...,υ*n*) of C*n*. As in (1.74), we denote the null space by

$$N\_{\rho} = \{ a \in B(H) \mid \text{Tr}(\rho a^\* a) = 0 \}. \tag{2.61}$$

• If ρ = |υ*<sub>j</sub>*⟩⟨υ*<sub>j</sub>*|, the corresponding pure state (2.42) is ω(*a*) = ⟨υ*<sub>j</sub>*, *a*υ*<sub>j</sub>*⟩, with

$$N\_{\rho} = \{ a \in A \mid a \upsilon\_{j} = 0 \}. \tag{2.62}$$

Hence *a* ∈ *N*<sub>ρ</sub> iff the *j*'th column *C<sub>j</sub>*(*a*) of *a* vanishes, so we have *a* − *b* ∈ *N*<sub>ρ</sub> iff *C<sub>j</sub>*(*a*) = *C<sub>j</sub>*(*b*). Thus the equivalence class *a*<sub>ρ</sub> ∈ *M<sub>n</sub>*(C)/*N*<sub>ρ</sub> may be identified with *C<sub>j</sub>*(*a*). Consequently, we obtain

$$H\_{\rho} = M\_n(\mathbb{C}) / N\_{\rho} \cong \mathbb{C}^n,\tag{2.63}$$

under the unitary isomorphism *u* : *H*<sub>ρ</sub> → C<sup>*n*</sup>, *a*<sub>ρ</sub> ↦ *C<sub>j</sub>*(*a*), with inverse *u*<sup>−1</sup> : *z* ↦ *a*<sub>ρ</sub>, *z* ∈ C<sup>*n*</sup>, where *a* is the matrix with *C<sub>j</sub>*(*a*) = *z* and zeros elsewhere (i.e., *a<sub>ij</sub>* = *z<sub>i</sub>* and *a<sub>ik</sub>* = 0 for all *i* and *k* ≠ *j*). We likewise write *u*<sup>−1</sup>*w* = *b*<sub>ρ</sub>, with *b<sub>ij</sub>* = *w<sub>i</sub>* and *b<sub>ik</sub>* = 0 for all *i* and *k* ≠ *j*. With *ua*<sub>ρ</sub> = *z* and *ub*<sub>ρ</sub> = *w*, we obtain (beware: no sum over *j*!):

$$\langle a\_{\rho}, b\_{\rho} \rangle = \text{Tr} \,(\rho a^\* b) = \sum\_{i} \overline{a\_{ij}} b\_{ij} = \sum\_{i} \overline{z}\_i w\_i = \langle z, w \rangle\_{\mathbb{C}^n} = \langle u a\_{\rho}, u b\_{\rho} \rangle\_{\mathbb{C}^n}.$$

The GNS-representation π<sub>ρ</sub>, originally given on *H*<sub>ρ</sub> by (1.77), is accordingly transformed to *u*π<sub>ρ</sub>(*a*)*u*<sup>−1</sup> ≡ π̂<sub>ρ</sub>(*a*) on C<sup>*n*</sup>, which is given by

$$\hat{\pi}\_{\rho}(a)w = u \pi\_{\rho}(a)b\_{\rho} = u(ab)\_{\rho} = C\_j(ab) = aw,$$

and the cyclic vector *u*Ω<sub>ρ</sub> ∈ C<sup>*n*</sup> is just the basis vector υ*<sub>j</sub>* from which we started. More generally, for a pure state (2.42) the GNS-representation π<sub>ω<sub>ψ</sub></sub>(*M<sub>n</sub>*(C)) is equivalent to the defining representation on C<sup>*n*</sup>, with canonical cyclic vector ψ. Finally, since only multiples of the unit matrix commute with all matrices, it follows that

$$\pi\_{\omega\_{\psi}}(M\_n(\mathbb{C}))' \cong \mathbb{C}.\tag{2.64}$$

• The 'opposite' case occurs when ρ is *invertible*, in other words, when the sum over *i* in (2.6) has *n* nonzero terms. Hence

$$\operatorname{Tr}\left(\rho a^\* a\right) = \sum\_{i=1}^n p\_i \|a\upsilon\_i\|^2\tag{2.65}$$

vanishes iff *a*υ*<sub>i</sub>* = 0 for each *i*, i.e., *a* = 0, so that *N*<sub>ρ</sub> = {0} and hence

$$H\_{\rho} = M\_n(\mathbb{C}).\tag{2.66}$$

The GNS-constructed inner product on *Mn*(C), cf. (1.78), given by

$$
\langle a\_{\rho}, b\_{\rho} \rangle = \text{Tr} \,(\rho a^\* b), \tag{2.67}
$$

may be transformed into the usual one (2.34) by the following linear map:

$$u: M\_n(\mathbb{C}) \to M\_n(\mathbb{C});\tag{2.68}$$

$$ua\_{\rho} = a\_{\rho} \rho^{1/2}.\tag{2.69}$$

This map is unitary from the Hilbert space (*M<sub>n</sub>*(C), ⟨·,·⟩<sub>ρ</sub>) to the Hilbert space (*M<sub>n</sub>*(C), ⟨·,·⟩), for it is invertible, with inverse *u*<sup>−1</sup>*a* = *a*<sub>ρ</sub>ρ<sup>−1/2</sup>, as well as isometric:

$$\langle u a\_{\rho}, u b\_{\rho} \rangle = \text{Tr}\left(\rho^{1/2} a^\* b \rho^{1/2}\right) = \text{Tr}\left(\rho a^\* b\right) = \langle a\_{\rho}, b\_{\rho} \rangle.$$

The transformed representation π̂<sub>ρ</sub>(*a*) = *u*π<sub>ρ</sub>(*a*)*u*<sup>−1</sup> on *M<sub>n</sub>*(C) is simply given by

$$\hat{\pi}\_{\rho}(a)b = ab,\tag{2.70}$$

and the cyclic vector *u*Ωρ in *Mn*(C) becomes ρ1/2, so that, as in (1.73),

$$<\langle \rho^{1/2}, \hat{\pi}\_{\mathcal{P}}(a)\rho^{1/2}\rangle = \text{Tr}\,(\rho a). \tag{2.71}$$

In this case, the commutant is easily computed to be

$$\hat{\pi}\_{\rho}(M\_n(\mathbb{C}))' \cong M\_n(\mathbb{C}),\tag{2.72}$$

since any linear map *C* : *M<sub>n</sub>*(C) → *M<sub>n</sub>*(C) that satisfies *C*(*ab*) = *aC*(*b*) for each *a*, *b* ∈ *M<sub>n</sub>*(C) is of the form *C*(*a*) = *ac* ≡ *R<sub>c</sub>*(*a*) for some *c* ∈ *M<sub>n</sub>*(C), namely *c* = *C*(1); to see this, just take *b* = 1. Since this involves *right* multiplication *R<sub>c</sub>* by *c*, which reverses the order of products in that *R<sub>c</sub>R<sub>d</sub>* = *R<sub>dc</sub>*, one has a choice in implementing the isomorphism (2.72) either as a *linear anti-homomorphism* (of algebras) *c* ↦ *R<sub>c</sub>*, or as an *anti-linear homomorphism* *c* ↦ *R*<sub>*c*∗</sub> (see also Theorem C.159).

Further insight into the structure of this representation comes from the realization

$$M\_n(\mathbb{C}) \cong \mathbb{C}^n \otimes \mathbb{C}^n,\tag{2.73}$$

as Hilbert spaces under the unitary map *v* : *a* ↦ ∑<sub>*i*,*j*</sub> *a<sub>ij</sub>*υ*<sub>i</sub>* ⊗ υ*<sub>j</sub>*. This yields

$$v\hat{\pi}\_{\rho}(a)v^\* = a \otimes 1\_n,\tag{2.74}$$

as an operator on C*<sup>n</sup>* ⊗ C*<sup>n</sup>*, and indeed for any Hilbert spaces *H*<sub>1</sub>, *H*<sub>2</sub> one has

$$(B(H\_1)\bigotimes \mathbb{C} \cdot \mathbf{1}\_{H\_2})' = \mathbb{C} \cdot \mathbf{1}\_{H\_1} \bigotimes B(H\_2). \tag{2.75}$$

• Finally, in the 'intermediate' case the sum in the spectral decomposition (2.6) has 1 < *m* < *n* nonzero terms. Using the ensuing (partial) basis (υ1,...,υ*m*) of C*<sup>m</sup>* (viz. C*n*), analogously to (2.66) with (2.73) we obtain, up to unitary equivalence,

$$H\_{\rho} \cong \mathbb{C}^n \otimes \mathbb{C}^m;\tag{2.76}$$

$$
\pi\_{\rho}(a) \cong a \otimes 1\_{m};\tag{2.77}
$$

$$\Omega\_{\rho} \cong \sum\_{i=1}^{m} \sqrt{p\_i} \,\upsilon\_i \otimes \upsilon\_i;\tag{2.78}$$

$$\pi\_{\rho}(M\_n(\mathbb{C}))' \cong M\_m(\mathbb{C}).\tag{2.79}$$
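The invertible case lends itself to a quick numerical sanity check. The following is a minimal sketch in plain Python for *n* = 2; the diagonal density matrix and the test matrices `a`, `b`, `c` are arbitrary choices made for this illustration and do not come from the text. It checks that the map (2.69) is isometric, that the cyclic vector ρ<sup>1/2</sup> reproduces the state as in (2.71), and that right multiplication commutes with left multiplication, as used for (2.72).

```python
from math import sqrt

def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(x):
    """Conjugate transpose."""
    n = len(x)
    return [[complex(x[j][i]).conjugate() for j in range(n)] for i in range(n)]

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

def max_gap(x, y):
    """Largest entrywise distance between two matrices."""
    n = len(x)
    return max(abs(x[i][j] - y[i][j]) for i in range(n) for j in range(n))

# Illustrative data (assumed): invertible density matrix rho = diag(0.7, 0.3).
p = [0.7, 0.3]
rho = [[p[0], 0.0], [0.0, p[1]]]
rho_half = [[sqrt(p[0]), 0.0], [0.0, sqrt(p[1])]]
a = [[1 + 2j, 0.5], [-1j, 3.0]]
b = [[0.0, 1j], [2.0, 1 - 1j]]
c = [[1.0, 1.0], [0.0, 2j]]

# (2.69): u sends a to a rho^{1/2}; isometry means
# Tr((a rho^{1/2})* (b rho^{1/2})) = Tr(rho a* b), the inner product (2.67).
ua, ub = matmul(a, rho_half), matmul(b, rho_half)
isometry_gap = abs(trace(matmul(dagger(ua), ub))
                   - trace(matmul(rho, matmul(dagger(a), b))))

# (2.71): <rho^{1/2}, a rho^{1/2}> = Tr(rho a).
state_gap = abs(trace(matmul(dagger(rho_half), matmul(a, rho_half)))
                - trace(matmul(rho, a)))

# (2.72): right multiplication R_c commutes with left multiplication by a.
commutant_gap = max_gap(matmul(matmul(a, b), c), matmul(a, matmul(b, c)))

print(isometry_gap, state_gap, commutant_gap)  # all three at rounding level
```

A diagonal ρ is used only so that ρ<sup>1/2</sup> can be written down without a matrix square-root routine; the identities themselves do not depend on that choice.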

The relevance of all this to the decomposition of states on *B*(*H*) is as follows.

Proposition 2.16. *Let* ω *be a state on B*(*H*) ∼= *Mn*(C)*. Then each decomposition*

$$\omega = \sum\_{i} p\_i \omega\_i,\tag{2.80}$$

*where the p<sub>i</sub> are probabilities (but the states* ω*<sub>i</sub> are not necessarily pure) is induced by a family* (*A<sub>i</sub>*) *of nonzero operators in the commutant* π<sub>ω</sub>(*B*(*H*))′ *that satisfy:*

$$0 \le A\_i \le 1;\tag{2.81}$$

$$\sum\_{i} A\_{i} = 1.\tag{2.82}$$

*Namely, given such a family of operators Ai, the decomposition* (2.80) *is given by:*

$$p\_i = \langle \Omega\_{\omega}, A\_i \Omega\_{\omega} \rangle;\tag{2.83}$$

$$\omega\_i(a) = \frac{\langle \Omega\_{\omega}, \pi\_{\omega}(a) A\_i \Omega\_{\omega} \rangle}{\langle \Omega\_{\omega}, A\_i \Omega\_{\omega} \rangle}. \tag{2.84}$$

*Proof.* The claim that such a family yields (2.80) is trivial, except for the remark that automatically *p<sub>i</sub>* > 0, since ⟨Ω<sub>ω</sub>, *A<sub>i</sub>*Ω<sub>ω</sub>⟩ = 0 would imply √*A<sub>i</sub>* Ω<sub>ω</sub> = 0 and hence

$$
\sqrt{A\_i}a\_{\omega} = \sqrt{A\_i}\pi\_{\omega}(a)\Omega\_{\omega} = \pi\_{\omega}(a)\sqrt{A\_i}\Omega\_{\omega} = 0
$$

for any *a* ∈ *B*(*H*); by (1.72) this gives √*A<sub>i</sub>* = 0 and therefore *A<sub>i</sub>* = (√*A<sub>i</sub>*)<sup>2</sup> = 0.

Conversely, each state ω*<sub>i</sub>* in (2.80) defines a sesquilinear form *Q<sub>i</sub>* on *H*<sub>ω</sub> by *Q<sub>i</sub>*(*a*<sub>ω</sub>, *b*<sub>ω</sub>) = ω*<sub>i</sub>*(*a*∗*b*), which is well defined by ω*<sub>i</sub>*(*a*∗*a*) ≤ ω(*a*∗*a*) and (A.1), and is positive because ω*<sub>i</sub>* is a state. Proposition A.23 then provides us with a positive operator *A<sub>i</sub>* for which *Q<sub>i</sub>*(*a*<sub>ω</sub>, *b*<sub>ω</sub>) = ⟨*a*<sub>ω</sub>, *A<sub>i</sub>b*<sub>ω</sub>⟩, hence ω*<sub>i</sub>*(*a*∗*b*) = ⟨*a*<sub>ω</sub>, *A<sub>i</sub>b*<sub>ω</sub>⟩. Next,

$$\langle a\_{\omega}, A\_i \pi\_{\omega}(c) b\_{\omega} \rangle = \langle a\_{\omega}, A\_i (cb)\_{\omega} \rangle = \omega\_i(a^\* c b) = \langle (c^\* a)\_{\omega}, A\_i b\_{\omega} \rangle = \langle a\_{\omega}, \pi\_{\omega}(c) A\_i b\_{\omega} \rangle,$$

so *A<sub>i</sub>* ∈ π<sub>ω</sub>(*B*(*H*))′. Finally, the bound (2.81) corresponds to 0 ≤ *p<sub>i</sub>* ≤ 1 in (2.80), whilst ω(1) = 1, or equivalently ∑*<sub>i</sub> p<sub>i</sub>* = 1, yields (2.82). □

We now complete the proof of Proposition 2.15. We assume (2.33), where we initially take ρ to be invertible. We omit the hat in (2.70) as well as the suffix ω or ρ on vectors. As noted, we then have Ω<sub>ρ</sub> = ρ<sup>1/2</sup>, and we also know that *A<sub>i</sub>* is given by *A<sub>i</sub>b* = *ba<sub>i</sub>* for some *a<sub>i</sub>* ∈ *M<sub>n</sub>*(C), viz. *a<sub>i</sub>* = *A<sub>i</sub>*1*<sub>n</sub>* (where 1*<sub>n</sub>* = 1*<sub>H</sub>* is to be distinguished from Ω<sub>ρ</sub> = ρ<sup>1/2</sup>). In this case, (2.81) means 0 ≤ Tr(*b*∗*ba<sub>i</sub>*) ≤ 1 for each *b* with Tr(*b*∗*b*) = 1, which is true iff 0 ≤ *a<sub>i</sub>* ≤ 1, whereas (2.82) immediately yields ∑*<sub>i</sub> a<sub>i</sub>* = 1. In terms of such a family (*a<sub>i</sub>*) in *M<sub>n</sub>*(C) itself, the decomposition (2.80) of ω = Tr(ρ ·) into *arbitrary* states ω*<sub>i</sub>* follows from (2.83) - (2.84) as

$$p\_i = \text{Tr}(\rho a\_i);\tag{2.85}$$

$$\omega\_i(a) = \text{Tr}(\rho\_i a);\tag{2.86}$$

$$\rho\_i = \frac{\rho^{1/2} a\_i \rho^{1/2}}{\text{Tr}(\rho a\_i)}.\tag{2.87}$$
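The recipe (2.85) - (2.87) can be tried out on a small example. The sketch below, in plain Python, assumes a diagonal ρ and a two-element family (`a1`, `a2`) with 0 ≤ *a<sub>i</sub>* ≤ 1 and *a*<sub>1</sub> + *a*<sub>2</sub> = 1; these matrices are illustrative choices, not taken from the text. It computes the weights and component density matrices and recombines them.

```python
from math import sqrt

def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

# Illustrative data (assumed): rho = diag(0.7, 0.3), hence rho^{1/2} is diagonal too.
rho = [[0.7, 0.0], [0.0, 0.3]]
rho_half = [[sqrt(0.7), 0.0], [0.0, sqrt(0.3)]]

# Assumed family (a_i): both have eigenvalues 0.25 and 0.75, and a1 + a2 = 1.
a1 = [[0.5, 0.25], [0.25, 0.5]]
a2 = [[0.5, -0.25], [-0.25, 0.5]]

weights, components = [], []
for a_i in (a1, a2):
    w = trace(matmul(rho, a_i))                         # p_i, eq. (2.85)
    sandwich = matmul(rho_half, matmul(a_i, rho_half))  # rho^{1/2} a_i rho^{1/2}
    components.append([[entry / w for entry in row] for row in sandwich])  # (2.87)
    weights.append(w)

# Recombining sum_i p_i rho_i should give back rho, cf. (2.80).
recombined = [[sum(w * r[i][j] for w, r in zip(weights, components))
               for j in range(2)] for i in range(2)]
print(weights, recombined)
```

In this example neither ρ<sub>1</sub> nor ρ<sub>2</sub> is a one-dimensional projection, which illustrates that the decomposition is into arbitrary (not necessarily pure) states unless the *a<sub>i</sub>* are chosen as in the argument that follows.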

To obtain *pure and orthogonal* states ω*<sub>i</sub>*, we subsequently ask when the new density matrices ρ*<sub>i</sub>* are mutually orthogonal one-dimensional projections ρ*<sub>i</sub>* = |υ*<sub>i</sub>*⟩⟨υ*<sub>i</sub>*|.

To answer this, we use the spectral theorem (A.37) - (A.38) applied to ρ, which gives ρ = ∑*<sub>j</sub> p<sub>j</sub>e<sub>j</sub>* and hence ρ<sup>1/2</sup> = ∑*<sub>j</sub>* √*p<sub>j</sub>* *e<sub>j</sub>*, so that

$$
\rho^{1/2} a\_i \rho^{1/2} = \sum\_{j,k} \sqrt{p\_j p\_k} e\_j a\_i e\_k. \tag{2.88}
$$

This can only be proportional to a one-dimensional projection if each *a<sub>i</sub>* is a one-dimensional projection that commutes with all spectral projections *e<sub>j</sub>* of ρ (and hence also commutes with ρ itself), and all further constraints on the *a<sub>i</sub>* may then only be satisfied if *a<sub>i</sub>* = |υ*<sub>i</sub>*⟩⟨υ*<sub>i</sub>*|, for some basis (υ*<sub>i</sub>*) of eigenvectors υ*<sub>i</sub>* of ρ.

A similar analysis applies to non-invertible ρ, the only new point being that projections *e<sub>i</sub>* orthogonal to the range of ρ fall into the null space *N*<sub>ρ</sub>, cf. (2.76) - (2.79), and hence do not contribute to (2.52), so that they may be ignored. □

#### 2.5 The Born rule from Bohrification

The Bohrification approach to quantum mechanics studies noncommutative algebras of observables like *B*(*H*) through their commutative subalgebras. In this section we show how the Born rule (2.8) emerges from that perspective. Our discussion is based on the interplay between the three kinds of (finite-dimensional) C\*-algebras:


Each of these is *unital*, since *C*(*X*) has a unit 1*<sub>X</sub>* (i.e. the function *x* ↦ 1), *B*(*H*) has a unit 1*<sub>H</sub>* (i.e. the operator ψ ↦ ψ), and *C*∗(*a*) shares the unit 1*<sub>H</sub>*. The first two classes overlap just in case dim(*H*) = 1 and *X* is a singleton (in which case *B*(C) = *C*(∗) = C); otherwise, the fundamental difference between the two is that *C*(*X*) is *commutative* in that *fg* = *gf* for all *f*, *g*, whereas *B*(*H*) is *non-commutative*. However, the system of C\*-algebras *C*∗(*a*) within *B*(*H*), where *a* ∈ *B*(*H*)<sub>sa</sub> varies, to some extent bridges the gap between the commutative and the non-commutative worlds. This relatively simple situation goes to the heart of exact Bohrification.

Theorem 2.17. *Let a*<sup>∗</sup> = *a* ∈ *B*(*H*)*, where H is a finite-dimensional Hilbert space.*


$$f(a) = \sum\_{\lambda \in \sigma(a)} f(\lambda) \cdot e\_{\lambda}. \tag{2.89}$$

*gives a (necessarily unital) isomorphism of commutative C\*-algebras*

$$C(\sigma(a)) \cong \mathbb{C}^{|\sigma(a)|} \cong C^\*(a). \tag{2.90}$$

*Proof.* Noting that any function on the finite subset σ(*a*) of R is continuous, this is a restatement of Theorem A.15 for finite-dimensional Hilbert spaces. □

We now come to the main point. States on unital C\*-algebras *A* may be defined just as in Definitions 1.14 and 2.5, i.e. as positive linear functionals ω : *A* → C that satisfy ω(1*A*) = 1 (cf. Proposition C.5). Recall Theorem 1.15 and Theorem 2.7.

Theorem 2.18. *Let* ω *be a state on B*(*H*)*, represented by a density operator* ρ *via* (2.33)*, and let a* ∈ *B*(*H*) *be a self-adjoint operator. Then the restriction of* ω *to C*∗(*a*) ⊂ *B*(*H*) *is a state, which also induces a state* ω<sub>|*C*(σ(*a*))</sub> *on C*(σ(*a*)) *through* (2.89) *-* (2.90)*, i.e.,* ω<sub>|*C*(σ(*a*))</sub>(*f*) = ω(*f*(*a*))*. The probability measure on* σ(*a*) *that corresponds to the state* ω<sub>|*C*(σ(*a*))</sub> *on C*(σ(*a*))*, then, is given by the Born rule* (2.9)*.*

*Proof.* First, the restriction of a state on a given unital C\*-algebra to a unital C\*-subalgebra remains a state. Second, isomorphisms of unital C\*-algebras pull back to state spaces in that, if ϕ : *A* → *B* is an isomorphism, and ω is a state on *B*, then ϕ∗ω : *A* → C is a state on *A*, where ϕ∗ω(*a*) = ω(ϕ(*a*)). We now compute

$$\begin{split} \omega\_{|C(\sigma(a))}(f) &= \omega(f(a)) = \mathrm{Tr}(\rho f(a)) \\ &= \sum\_{\lambda \in \sigma(a)} \mathrm{Tr}\,(\rho e\_{\lambda})f(\lambda) = \sum\_{\lambda \in \sigma(a)} p\_a(\lambda)f(\lambda) \\ &= E\_{P\_a}(f), \end{split} \tag{2.91}$$

where, from left to right, the first equality is just the definition of ω<sub>|*C*(σ(*a*))</sub>, whereas the others in turn follow from (2.33), (2.89), (2.8), and (1.9), respectively. □

Note that Theorem 2.18 implies Theorem 2.2. The simplest nontrivial illustration is:

$$H = \mathbb{C}^n;\tag{2.92}$$

$$
\omega = \omega\_{\psi};\tag{2.93}
$$

$$\psi = \sum\_{i=1}^{n} c\_i u\_i;\tag{2.94}$$

$$a = \text{diag}(\lambda\_1, \dots, \lambda\_n) = \sum\_{i=1}^n \lambda\_i |u\_i\rangle\langle u\_i|,\tag{2.95}$$

with respect to the standard basis (*u<sub>i</sub>*) of C*<sup>n</sup>*, with all λ*<sub>i</sub>* ∈ R different, cf. (2.42). The C\*-algebra *C*∗(*a*) ≅ C*<sup>n</sup>* then consists of all diagonal matrices

$$b = \text{diag}(b\_1, \dots, b\_n). \tag{2.96}$$

Since obviously

$$\sigma(a) = \{\lambda\_1, \dots, \lambda\_n\},\tag{2.97}$$

the isomorphism (2.90) is given by

$$f \mapsto \text{diag}(f(\lambda\_1), \dots, f(\lambda\_n)). \tag{2.98}$$

The computation (2.91) in the proof of Theorem 2.18 then becomes

$$\begin{split} \omega\_{\psi|C(\sigma(a))}(f) &= \langle \psi, \text{diag}(f(\lambda\_1), \dots, f(\lambda\_n))\psi \rangle = \sum\_{i=1}^n |c\_i|^2 f(\lambda\_i) \\ &= \sum\_{i=1}^n p\_a(\lambda\_i) f(\lambda\_i), \end{split}\tag{2.99}$$

from which the Born probabilities *p<sub>a</sub>* may be read off as the familiar expressions

$$p\_a(\lambda\_i) = |c\_i|^2. \tag{2.100}$$
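The computation (2.99) is easy to reproduce numerically. The following minimal sketch in plain Python assumes *n* = 3, a particular unit vector ψ, three distinct eigenvalues and a test function `f`; all of these are illustrative choices, not taken from the text. It compares the expectation ⟨ψ, *f*(*a*)ψ⟩ with the average of *f* under the Born probabilities (2.100).

```python
# Assumed data: coefficients c_i of a unit vector psi in the standard basis,
# and the distinct eigenvalues lambda_i of a = diag(lambda_1, ..., lambda_n).
c = [0.6, 0.8j, 0.0]
lambdas = [-1.0, 0.5, 2.0]

def f(x):
    """An arbitrary test function on the spectrum sigma(a)."""
    return x * x + 1.0

# Born probabilities, eq. (2.100).
born = [abs(ci) ** 2 for ci in c]

# <psi, f(a) psi>, where f(a) = diag(f(lambda_1), ..., f(lambda_n)) as in (2.98).
lhs = sum(complex(ci).conjugate() * f(lam) * ci for ci, lam in zip(c, lambdas))

# Expectation of f with respect to the Born probabilities, last line of (2.99).
rhs = sum(prob * f(lam) for prob, lam in zip(born, lambdas))

print(born, lhs, rhs)
```

The phase of each *c<sub>i</sub>* drops out of both sides, in line with the fact that only |*c<sub>i</sub>*|<sup>2</sup> enters (2.100).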

For an analogous treatment of the generalized Born rule (2.21), we first refer to Definition A.16 for the pertinent definitions, especially of the joint spectrum

$$
\sigma(\underline{a}) \subseteq \sigma(a\_1) \times \dots \times \sigma(a\_n) \subset \mathbb{R}^n
$$

of a family *a* = (*a*1,...,*an*) of commuting self-adjoint operators. As in the case of a single operator, we define *C*∗(*a*) as the smallest unital C\*-subalgebra of *B*(*H*) that contains each *ai*. Generalizing Theorem A.15, we have:

Theorem 2.19. *Let a* = (*a*1,...,*an*) *be commuting self-adjoint operators on H. Then C*∗(*a*) *is commutative, and there is a unique isomorphism of C\*-algebras*

$$\mathcal{C}^\*(\underline{a}) \cong \mathcal{C}(\sigma(\underline{a})),\tag{2.101}$$

*under which* 1*<sup>H</sup>* ∈ *C*∗(*a*) *corresponds to the unit function* 1σ(*a*) : λ → 1 *in C*(σ(*a*))*, and ai* ∈ *C*∗(*a*) *corresponds to the projection* π*<sup>i</sup>* : λ → λ*<sup>i</sup> in C*(σ(*a*))*.*

For further discussion, see Appendix A, Theorem A.17.

Theorem 2.18 may then be generalized in the following way, with similar proof.

Theorem 2.20. *Let* ω *be a state on B*(*H*)*, represented by a density operator* ρ*, and let a* = (*a*1,...,*an*) *be commuting self-adjoint operators on H. Then the restriction of* ω *to C*∗(*a*) ⊂ *B*(*H*) *is a state, which induces a state* ω|*C*(σ(*a*)) *on C*(σ(*a*)) *through the isomorphism* (2.101)*. Then the probability measure on the joint spectrum* σ(*a*) *that corresponds to* ω|*C*(σ(*a*)) *is given by the generalized Born rule* (2.21)*, i.e.,*

$$p\_{\underline{a}}(\underline{\lambda}) = \text{Tr}(\rho e\_{\underline{\lambda}}).\tag{2.102}$$

Strictly speaking, in the present context one should restrict (2.21) to λ ∈ σ(*a*), but the claim is correct even if one does not, for the (Born) probability assigned to values λ ∈ σ(*a*1)×···×σ(*an*) that do not lie in σ(*a*) is simply zero.

As shown in Proposition A.19 in Appendix A, the multi-operator case is a special case of the single-operator case, in that *C*∗(*a*<sub>1</sub>,...,*a<sub>n</sub>*) = *C*∗(*a*) for a suitable single self-adjoint operator *a*. Since the converse is obvious, Theorems 2.18 and 2.20 are equivalent. Corollary A.20 in Appendix A even shows that *any* unital commutative C\*-algebra *C* in *B*(*H*) takes the form *C* = *C*∗(*a*) for some self-adjoint operator *a* ∈ *B*(*H*). Comparing the restrictions of a state ω on *B*(*H*) to *C* as the latter varies therefore comes down to asking how the various Born probability distributions *p<sub>a</sub>* on *C*∗(*a*) are related to each other as *a* varies. It is clear from (2.8) that if *p<sub>a</sub>* and *p<sub>b</sub>* come from the same density operator ρ (as the notation indicates), then for λ ∈ σ(*a*) and μ ∈ σ(*b*),

$$e\_{\lambda}^{(a)} = e\_{\mu}^{(b)} \Rightarrow p\_a(\lambda) = p\_b(\mu). \tag{2.103}$$

Indeed, this is the only compatibility condition between *p<sub>a</sub>* and *p<sub>b</sub>*, showing that *p<sub>a</sub>*(λ) only depends on *a* and λ through the associated spectral projection *e*<sub>λ</sub><sup>(*a*)</sup>. Condition (2.103) is a version of a general property of quantum mechanics called *noncontextuality*, which in this case means that, given its spectral projection *e*<sub>λ</sub><sup>(*a*)</sup>, the 'context' operator *a* is otherwise irrelevant for the Born probability *p<sub>a</sub>*(λ).
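A small worked illustration of (2.103) may help; the operators below are chosen here purely for illustration. Take *H* = C<sup>3</sup> with standard basis (*u<sub>i</sub>*) and the two commuting self-adjoint operators *a* = diag(1, 2, 3) and *b* = diag(5, 7, 7). They share the spectral projection onto *u*<sub>1</sub>, so that

$$e^{(a)}\_{1} = |u\_1\rangle\langle u\_1| = e^{(b)}\_{5} \quad\Longrightarrow\quad p\_a(1) = \text{Tr}\left(\rho e^{(a)}\_{1}\right) = \text{Tr}\left(\rho e^{(b)}\_{5}\right) = p\_b(5) = \rho\_{11},$$

whatever the remaining eigenvalues of *a* and *b* are. By contrast, the projection for the degenerate eigenvalue 7 of *b* is not a spectral projection of *a*, and (2.103) by itself imposes no relation there; the Born rule (2.8) nevertheless gives

$$e^{(b)}\_{7} = e^{(a)}\_{2} + e^{(a)}\_{3} \quad\Longrightarrow\quad p\_b(7) = p\_a(2) + p\_a(3) = \rho\_{22} + \rho\_{33}.$$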

#### 2.6 The Kadison–Singer Problem

It should be clear from the example in the previous section that *pure* states ω<sub>ψ</sub> on *B*(*H*) may well give rise to *mixed* states on *C*∗(*a*); referring to (2.94) and (2.100), this is the case whenever *c<sub>i</sub>* ≠ 0 for more than one value of the index *i*. If, on the other hand, *c<sub>i</sub>* ≠ 0 for just a single value *i* = *j*, then ψ = *u<sub>j</sub>* (up to a phase), or, equivalently, ω<sub>ψ</sub>(*a*) = ⟨*u<sub>j</sub>*, *au<sub>j</sub>*⟩. In that case, the given state ω<sub>ψ</sub> is pure both on *B*(*H*) and on *C*∗(*a*), and the associated probability measure ω<sub>ψ|*C*(σ(*a*))</sub> on the spectrum σ(*a*) is supported by a single point, namely λ*<sub>j</sub>* ∈ σ(*a*).

This example suggests a general problem (first posed in the non-trivial case where *H* is infinite-dimensional by Kadison and Singer in 1959) that is of great relevance for the Bohrification program. Namely, let *A* be a maximal commutative unital C\*-algebra in *B*(*H*) and let ω*<sup>A</sup>* be a pure state on *A*. We may then ask:


If dim(*H*) < ∞, all these questions are easy to answer at one stroke:

Theorem 2.21. *Let* dim(*H*) < ∞ *and let* ω*<sup>A</sup> be a pure state on a maximal commutative unital C\*-algebra A in B*(*H*)*. Then* ω*<sup>A</sup> has a unique extension to a state* ω *on B*(*H*)*, which is necessarily pure.*

*Proof.* As explained after the proof of Corollary A.20 in Appendix A, we may simply assume that *H* = C*<sup>n</sup>* and that *A* consists of all diagonal matrices; call this collection *Dn*(C) (for every other case is unitarily equivalent to this one). Clearly,

$$D\_n(\mathbb{C}) \cong \mathbb{C}^n,\tag{2.104}$$

from which we see that if ω*<sup>A</sup>* is pure, then it must be given on *b* ∈ *Dn*(C) by

$$\omega\_{A}(b) = b\_{j},\tag{2.105}$$

for some *j*, cf. (2.96). If ω exists, it is given by (2.33). Using (2.6), condition (2.105) then enforces the following constraint on the *p<sub>i</sub>* and υ*<sub>i</sub>* (where (*u<sub>i</sub>*) is the standard basis of C*<sup>n</sup>* and (υ*<sub>i</sub>*) is an orthonormal set diagonalizing the density operator ρ):

$$\sum\_{i} p\_i |\langle u\_j, \upsilon\_i \rangle|^2 = 1. \tag{2.106}$$

Since ∑*<sub>i</sub> p<sub>i</sub>* = 1 and |⟨*u<sub>j</sub>*, υ*<sub>i</sub>*⟩| ≤ 1, eq. (2.106) can only hold, for given *j*, if

$$|\langle u\_j, \upsilon\_i \rangle| = 1 \tag{2.107}$$

for all *i* with *p<sub>i</sub>* > 0. Since *u<sub>j</sub>* is a unit vector whilst the (υ*<sub>i</sub>*) are an orthonormal set, (2.107) can only be true if there is a single *i* for which *p<sub>i</sub>* > 0, namely *i* = *j* (and hence *p<sub>j</sub>* = 1), in which case υ*<sub>j</sub>* must equal *u<sub>j</sub>* up to a phase. Hence ρ = |*u<sub>j</sub>*⟩⟨*u<sub>j</sub>*|, which shows that ρ exists, is unique, and is pure. □

At least in operational interpretations of quantum mechanics, this theorem implies that a *pure* quantum state (i.e., on *B*(*H*)) is completely determined by the outcome of a measurement of some maximal observable *a*, whose outcome, after all, gives one of the eigenvalues λ*<sub>j</sub>* in (2.95) and hence fixes the post-measurement state to be the one given by (2.105). This is, indeed, a typical way of preparing a state.
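The constraint (2.106) in the proof of Theorem 2.21 can be probed numerically. The sketch below, in plain Python for *n* = 2, evaluates its left-hand side for three candidate density operators; the rotated orthonormal bases and the probabilities are illustrative assumptions, and the helper names are hypothetical.

```python
from math import cos, sin

def diagonal_weight(j, probs, vectors):
    """Left-hand side of (2.106): sum_i p_i |<u_j, v_i>|^2, where u_j is the
    j-th standard basis vector, so that <u_j, v_i> is the j-th entry of v_i."""
    return sum(p * abs(v[j]) ** 2 for p, v in zip(probs, vectors))

def rotated_basis(theta):
    """An orthonormal basis of C^2 with real entries, rotated by theta."""
    return [(cos(theta), sin(theta)), (-sin(theta), cos(theta))]

j = 0  # corresponds to omega_A(b) = b_1 in (2.105); indices are zero-based here

# rho = |u_1><u_1|: the unique extension found in the proof.
pure_aligned = diagonal_weight(j, [1.0, 0.0], rotated_basis(0.0))
# A pure rho whose eigenvector is not u_1 (up to a phase).
pure_rotated = diagonal_weight(j, [1.0, 0.0], rotated_basis(0.3))
# A mixed rho with two nonzero probabilities.
mixed = diagonal_weight(j, [0.5, 0.5], rotated_basis(0.3))

print(pure_aligned, pure_rotated, mixed)
```

Only the first candidate reaches the value 1 demanded by (2.106); the other two fall short, in line with the uniqueness argument above.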

As one might expect, this is no longer true if *A* = *C*∗(*a*) fails to be maximal (in which case a measurement of *a* would not provide enough information about the quantum state). Namely, suppose *a* = ∑λ∈σ(*a*) λ · *e*<sup>λ</sup> , as in (A.37); the maximal case occurs iff Tr(*e*<sup>λ</sup> ) = dim(*H*<sup>λ</sup> ) = 1 for all λ ∈ σ(*a*) (equivalently, all eigenvalues λ*<sup>i</sup>* in (A.37) are different). If not, suppose dim(*H*<sup>λ</sup> ) > 1 for some λ. Then any unit vector ψ ∈ *H*<sup>λ</sup> gives rise to a pure state ωψ on *B*(*H*), which remains pure on *A* (it is given by ωψ|*A*(*a*) = λ and hence induces the Dirac probability measure δλ on σ(*a*)).

Dropping the purity condition on ω*<sup>A</sup>* loses uniqueness of the extension ω, too, even if *A* is maximal: take *b* = diag(*b*1,...,*bn*) ∈ *A* = *Dn*(C), and assume that

$$\omega\_A(b) = \sum\_i p\_i b\_i \tag{2.108}$$

has more than one term (with *pi* > 0 and ∑*<sup>i</sup> pi* = 1 as always), cf. (2.105). Then:


Further insight into the state extension problem comes from the following result.

Proposition 2.22. *Let A be any unital C\*-algebra in B*(*H*) *(i.e., A is not necessarily commutative) and let* ω*<sub>A</sub> be a pure state on A. Then the set*

$$S\_A = \{ \omega \in S(B(H)) \mid \omega\_{|A} = \omega\_{A} \}\tag{2.109}$$

*of all states on B*(*H*) *whose restriction* ω<sub>|*A*</sub> *to A is the given state* ω*<sub>A</sub>, is a compact convex subspace of the total state space S*(*B*(*H*)) *of B*(*H*)*, whose extreme boundary* ∂*<sub>e</sub>S<sub>A</sub> consists of pure states on B*(*H*)*, i.e.,* ∂*<sub>e</sub>S<sub>A</sub>* ⊂ *P*(*B*(*H*))*. Consequently,* ω*<sub>A</sub> has a unique extension to a state on B*(*H*) *iff it has a unique pure extension.*

*Proof.* Convexity and (*w*∗) compactness are obvious. Let ω ∈ ∂*<sub>e</sub>S<sub>A</sub>* and suppose ω = *t*ω<sub>1</sub> + (1 − *t*)ω<sub>2</sub> for some *t* ∈ (0,1) and ω<sub>1</sub>, ω<sub>2</sub> ∈ *S*(*B*(*H*)). By assumption, ω*<sub>A</sub>* = ω<sub>|*A*</sub> = *t*ω<sub>1|*A*</sub> + (1 − *t*)ω<sub>2|*A*</sub> is pure on *A*, so ω<sub>1|*A*</sub> = ω<sub>2|*A*</sub> = ω*<sub>A</sub>*, hence ω<sub>1</sub>, ω<sub>2</sub> ∈ *S<sub>A</sub>*. Since ω ∈ ∂*<sub>e</sub>S<sub>A</sub>*, this implies ω<sub>1</sub> = ω<sub>2</sub> = ω. Hence ω is pure on *B*(*H*).

Finally, *S<sub>A</sub>* is a singleton iff its boundary ∂*<sub>e</sub>S<sub>A</sub>* is (since any state in *S<sub>A</sub>* has a convex decomposition in terms of states in its boundary), yielding the last claim. □

This proposition remains true for infinite-dimensional *H* (and even for arbitrary C\*-algebras), but Theorem 2.21 becomes much more complicated. As we shall see, maximal commutative unital C\*-subalgebras of *B*(*H*) are no longer unique up to unitary equivalence, and the validity of the claim depends on which type of maximal subalgebra is considered. Also, the proof of what then is called the *Kadison–Singer Conjecture* becomes extremely difficult (with questionable relevance to physics).

#### 2.7 Gleason's Theorem

Gleason's Theorem answers the following question in the positive: given probability distributions *p<sub>a</sub>* on σ(*a*), for each self-adjoint operator *a* ∈ *B*(*H*), satisfying (2.103), is there a single state ω on *B*(*H*) inducing these probabilities through the Born rule? This question is closely related to various others that involve equivalent structures, cf. Definition 1.1. We denote the unit sphere in *H* by *H*<sub>1</sub> = {ψ ∈ *H* | ‖ψ‖ = 1}, and write P(*H*) = {*e* ∈ *B*(*H*) | *e*<sup>2</sup> = *e*∗ = *e*} for the set of all projections on *H*.

Definition 2.23. *Let H be a finite-dimensional Hilbert space, with unit sphere H*<sub>1</sub>*.*

*1. A* probability distribution *on* P(*H*) *is a map p* : *H*<sub>1</sub> → [0,1] *that satisfies*

$$\sum\_{i=1}^{\dim H} p(\upsilon\_i) = 1,\text{ for any basis } (\upsilon\_i) \text{ of } H. \tag{2.110}$$

*2. A* probability measure *on* P(*H*) *is a map P* : P(*H*) → [0,1] *that satisfies:*

$$P(e+f) = P(e) + P(f)\text{ whenever }e f = 0 \Leftrightarrow e H \perp f H;\tag{2.111}$$

$$P(1\_H) = 1.\tag{2.112}$$

Note that *p* is really defined on P1(*H*), for we have *p*(*z*υ) = *p*(υ) for all *z* ∈ T and υ ∈ *H*1; to see this, extend *z*υ and υ to a basis of *H* in the same way and use (2.110).

As in Definition 1.1, these notions of probability are equivalent, cf. (A.28):

• Given a probability measure *P*, one obtains a probability distribution *p* by

$$p(\upsilon) = P(e\_{\upsilon}).\tag{2.113}$$

• Given a probability distribution *p*, Lemma 2.24 below guarantees that

$$P(e) = \sum\_{i=1}^{\dim(eH)} p(\upsilon\_i),\tag{2.114}$$

where (υ*<sub>i</sub>*) is any basis of *eH*, defines a probability measure *P*.

Lemma 2.24. *If p is a probability distribution on* P(*H*) *and L* ⊂ *H is a linear subspace, with basis* (υ*<sub>i</sub>*)*, then* ∑<sub>*i*=1</sub><sup>dim(*L*)</sup> *p*(υ*<sub>i</sub>*) *is independent of this basis choice.*

*Proof.* Extend (υ*<sub>i</sub>*) to a basis of *H* by adding a basis (υ′*<sub>j</sub>*) of *L*<sup>⊥</sup>. Take another basis (υ″*<sub>i</sub>*) of *L* and complete it to a basis of *H* by using the same basis (υ′*<sub>j</sub>*) of *L*<sup>⊥</sup>. Then

$$\sum\_{i} p(\upsilon\_{i}) + \sum\_{j} p(\upsilon\_{j}^{\prime}) = \sum\_{i} p(\upsilon\_{i}^{\prime\prime}) + \sum\_{j} p(\upsilon\_{j}^{\prime}) = 1,\tag{2.115}$$

where we once again used (2.110). Hence ∑*<sub>i</sub> p*(υ*<sub>i</sub>*) = ∑*<sub>i</sub> p*(υ″*<sub>i</sub>*). □

Clearly, a state ω on *B*(*H*) induces a probability measure *P* on P(*H*) by

$$P(e) = \omega(e) = \text{Tr}(\rho e),\tag{2.116}$$

where ρ is the density operator associated to ω, as in (2.33). Therefore, it is a natural question if any probability measure on P(*H*) is induced by some state on *B*(*H*) by (2.116). This question is equivalent to the one above:

Proposition 2.25. • *A probability measure P on* P(*H*) *induces non-contextual probability distributions pa on* σ(*a*) *for each self-adjoint a* ∈ *B*(*H*) *by*

$$p\_a(\lambda) = P(e\_{\lambda}^{(a)});\tag{2.117}$$

• *Conversely, a family* (*pa*) *of non-contextual probability distributions (i.e. satisfying* (2.103)*) gives rise to a probability measure P on* P(*H*) *by*

$$P(e) = p\_e(1). \tag{2.118}$$

*Proof.* As defined by (2.117), *pa* is a probability distribution on σ(*a*): by (A.38),

$$\sum\_{\lambda \in \sigma(a)} p\_a(\lambda) = \sum\_{\lambda \in \sigma(a)} P\left(e\_{\lambda}^{(a)}\right) = P\left(\sum\_{\lambda \in \sigma(a)} e\_{\lambda}^{(a)}\right) = P(1\_H) = 1. \tag{2.119}$$

Conversely, suppose *ef* = 0. Introduce *g* = 1 − *e* − *f*, and consider the self-adjoint operator *a* = λ<sub>1</sub>*e* + λ<sub>2</sub>*f* + λ<sub>3</sub>*g*, for three different real numbers λ<sub>1</sub>, λ<sub>2</sub>, λ<sub>3</sub>. By (2.103),

$$P(e) = p\_e(1) = p\_a(\lambda\_1), \quad P(f) = p\_f(1) = p\_a(\lambda\_2), \quad P(g) = p\_g(1) = p\_a(\lambda\_3).$$

Furthermore, since σ(*a*) = {λ<sub>1</sub>, λ<sub>2</sub>, λ<sub>3</sub>}, we have *p<sub>a</sub>*(λ<sub>1</sub>) + *p<sub>a</sub>*(λ<sub>2</sub>) + *p<sub>a</sub>*(λ<sub>3</sub>) = 1 and hence *P*(*e*) + *P*(*f*) + *P*(*g*) = 1. Also, *P*(*e* + *f*) + *P*(*g*) = *P*(*e* + *f* + *g*) = *P*(1*<sub>H</sub>*) = 1. The last two equations give *P*(*e* + *f*) = *P*(*e*) + *P*(*f*). □
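For a probability measure that does come from a state via (2.116), the additivity (2.111) and the basis independence of Lemma 2.24 can be checked directly. The sketch below, in plain Python, assumes a diagonal density matrix on C<sup>3</sup> and describes each projection by an orthonormal basis of its range; the data and the helper names `p` and `P` are illustrative choices.

```python
from math import sqrt

# Assumed density matrix on C^3 (diagonal, for simplicity).
rho = [[0.5, 0.0, 0.0], [0.0, 0.3, 0.0], [0.0, 0.0, 0.2]]

def p(v):
    """p(v) = P(e_v) = Tr(rho e_v) = <v, rho v> for a unit vector v,
    combining (2.113) and (2.116)."""
    n = len(v)
    return sum((complex(v[i]).conjugate() * rho[i][j] * v[j]).real
               for i in range(n) for j in range(n))

def P(basis_of_range):
    """P(e) as in (2.114), where e projects onto the span of the given
    orthonormal vectors."""
    return sum(p(v) for v in basis_of_range)

u1, u2, u3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)
s = 1 / sqrt(2)
w1, w2 = (s, s, 0), (s, -s, 0)   # another orthonormal basis of span(u1, u2)

P_e = P([u1])                    # e projects onto u1
P_f = P([u2])                    # f projects onto u2, so e f = 0
P_e_plus_f = P([u1, u2])         # e + f projects onto span(u1, u2)
P_e_plus_f_other = P([w1, w2])   # same projection, different basis
P_one = P([u1, u2, u3])          # the unit operator

print(P_e, P_f, P_e_plus_f, P_e_plus_f_other, P_one)
```

The two evaluations of *P*(*e* + *f*) agree, which is the content of Lemma 2.24 in this example, and both equal *P*(*e*) + *P*(*f*), as (2.111) requires.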

Suppose (*e<sub>i</sub>*)<sub>*i*=1</sub><sup>*N*</sup> is a family of projections on *H* such that ∑*<sub>i</sub> e<sub>i</sub>* = 1*<sub>H</sub>* and *e<sub>i</sub>e<sub>j</sub>* = δ*<sub>ij</sub>e<sub>i</sub>*. Such a family generates a commutative unital C\*-algebra *C* = *C*∗(*e*<sub>1</sub>,..., *e<sub>N</sub>*) in *B*(*H*), which coincides with *C*∗(*a*) for *a* = ∑*<sub>i</sub>* λ*<sub>i</sub>e<sub>i</sub>*, where all λ*<sub>i</sub>* ∈ R are different, so that σ(*a*) = {λ<sub>1</sub>,...,λ*<sub>N</sub>*}. All commutative unital C\*-algebras in *B*(*H*) arise in this way, and *C* is maximally abelian iff *N* = dim(*H*), i.e., iff each *e<sub>i</sub>* is one-dimensional. The point is that a probability measure *P* on P(*H*) induces a state ω*<sub>C</sub>* on each *C* = *C*∗(*e*<sub>1</sub>,..., *e<sub>N</sub>*) (or, for *C* = *C*∗(*a*), a probability measure *P<sub>a</sub>* on σ(*a*)):

1. if *a* ∈ *C* is self-adjoint, then we have unique spectral resolutions (A.37), and put

$$\omega\_{C}(a) = \sum\_{\lambda \in \sigma(a)} \lambda P(e\_{\lambda}). \tag{2.120}$$

2. if *c* = *a* + *ib* ∈ *C* with *a* and *b* self-adjoint, we define ω*<sub>C</sub>*(*c*) = ω*<sub>C</sub>*(*a*) + *i*ω*<sub>C</sub>*(*b*).

By Lemma 2.24, the map ω*<sub>C</sub>* thus defined coincides with the linear extension of the map *e<sub>i</sub>* ↦ *P*(*e<sub>i</sub>*) to *C*, which also shows that ω*<sub>C</sub>* is linear. Clearly, ω*<sub>C</sub>* is a state on *C*.

Again by Lemma 2.24, the ensuing family of states ω*<sub>C</sub>* on all commutative unital C\*-algebras *C* ⊂ *B*(*H*) is *non-contextual* (or, one might say, *compatible*) in the sense that if *b* ∈ *C* ∩ *C*′, then ω*<sub>C</sub>*(*b*) = ω<sub>*C*′</sub>(*b*). In particular, if *C*′ ⊂ *C*, then ω<sub>*C*|*C*′</sub> = ω<sub>*C*′</sub> (where ω<sub>*C*|*C*′</sub> is the restriction of ω*<sub>C</sub>* to *C*′). It is convenient to extend this noncontextual family (ω*<sub>C</sub>*) of states to a well-defined map ω : *B*(*H*) → C by putting

$$\omega(a+ib) = \omega\_{C^\*(a)}(a) + i\omega\_{C^\*(b)}(b), \quad a, b \in B(H), \; a^\* = a, \; b^\* = b. \tag{2.121}$$

Definition 2.26. *A* quasi-state *on B*(*H*) *is a map* ω : *B*(*H*) → C *that is positive (*ω(*a*∗*a*) ≥ 0*) and normalized (*ω(1*<sub>H</sub>*) = 1*), cf. Definition 2.4, and otherwise:*

*1. satisfies* ω(*a*) = ω(*a*′) + *i*ω(*a*″)*, where a*′ = ½(*a* + *a*∗) *and a*″ = −½*i*(*a* − *a*∗)*.*
*2. is linear on each commutative unital C\*-algebra in B*(*H*)*.*

Note that *a*′ and *a*″ are self-adjoint, so that ω is fixed by its values on *B*(*H*)<sub>sa</sub>. Hence we have ω(*za*) = *z*ω(*a*), *z* ∈ C, and ω(*a* + *b*) = ω(*a*) + ω(*b*) whenever *ab* = *ba*.

Proposition 2.27. *The map* ω : *B*(*H*) → C *defined by* (2.120) *and* (2.121) *is a quasi-state on B*(*H*)*. Any quasi-state on B*(*H*) *arises in this way, giving a bijective correspondence between quasi-states on B*(*H*) *and probability measures on* P(*H*)*.*

*Proof.* The first claim holds by construction. Conversely, a quasi-state ω yields a probability measure *P* via *P*(*e*) = ω(*e*), cf. (2.116). □

Theorem 1.15 shows that each state on *C*(*X*) is induced by a probability measure (and, trivially, also the other way round). Although Theorem 2.7 is already a quantum version of Theorem 1.15, an even better parallel would involve the probability measures of Definition 2.23. This is indeed what *Gleason's Theorem* achieves, *en passant* answering all versions of our lead question:

Theorem 2.28. *Let H be a finite-dimensional Hilbert space of dimension* > 2*. Then each probability measure P on* P(*H*) *is induced by a unique state* ω *on B*(*H*) *via*

$$P(e) = \omega(e). \tag{2.122}$$

*Equivalently, each probability distribution p on* P(*H*) *is given by*

$$p(\upsilon) = \langle \upsilon, \rho\upsilon \rangle,\tag{2.123}$$

*where* ρ *is a unique density operator on H. Hence every quasi-state is a state.*

This completes the following list (of which 1–5 do not require Gleason's Theorem).

Corollary 2.29. *Let H be a finite-dimensional Hilbert space. The following notions are equivalent (i.e., there are natural bijective correspondences between):*


#### 2.8 Proof of Gleason's Theorem

The difficulty of Theorem 2.28 should already be clear from the fact that it is false if dim(*H*) = 2: as we have seen in (2.37), a state on *M*<sub>2</sub>(C) = *B*(C<sup>2</sup>) is given by three real parameters, whereas a probability measure *P* on P(C<sup>2</sup>) can assign arbitrary values *P*(*e*) to one-dimensional projections *e*, as long as *P*(1 − *e*) = 1 − *P*(*e*). Equivalently, this time from the perspective of probability distributions *p*, each unit vector in C<sup>2</sup> belongs to a unique basis (up to a phase), so that *p* can assign an arbitrary value to one of the two vectors in each basis and is unconstrained otherwise.

In higher dimensions, however, one-dimensional projections always belong to infinitely many orthogonal sets, whilst unit vectors belong to infinitely many bases. This constrains the possible values *P* or *p* may take, and these constraints turn out to be strong enough to enforce (2.116).
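The failure in dimension 2 can be made concrete by a standard counterexample, sketched here under the usual Bloch-sphere parametrization; the Pauli-matrix notation σ = (σ<sub>*x*</sub>, σ<sub>*y*</sub>, σ<sub>*z*</sub>) is an assumption of this illustration and may differ from the notation of (2.37). Every one-dimensional projection on C<sup>2</sup> can be written as *e*<sub>**n**</sub> = ½(1 + **n** · σ) for a unique unit vector **n** ∈ R<sup>3</sup>, with 1 − *e*<sub>**n**</sub> = *e*<sub>−**n**</sub>. Define

$$P(e\_{\mathbf{n}}) = \tfrac{1}{2}\left(1 + n\_z^3\right), \qquad P(0) = 0, \qquad P(1\_H) = 1.$$

This satisfies (2.111) and (2.112), because the only pairs of nonzero orthogonal projections are (*e*<sub>**n**</sub>, *e*<sub>−**n**</sub>), for which

$$P(e\_{\mathbf{n}}) + P(e\_{-\mathbf{n}}) = \tfrac{1}{2}\left(1 + n\_z^3\right) + \tfrac{1}{2}\left(1 - n\_z^3\right) = 1 = P(1\_H).$$

A state with density operator ρ = ½(1 + **r** · σ), however, would give

$$\text{Tr}(\rho e\_{\mathbf{n}}) = \tfrac{1}{2}\left(1 + \mathbf{r} \cdot \mathbf{n}\right),$$

which depends linearly on **n**. Matching the two expressions at **n** = (0,0,1), (1,0,0) and (0,1,0) forces **r** = (0,0,1), but then any **n** with *n<sub>z</sub>* = ½ would require ½ = (½)<sup>3</sup>, a contradiction. Hence this *P* is not induced by any state.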

The proof of Theorem 2.28 consists of two nontrivial parts, the second of which is notoriously difficult. By exception in quantum-mechanical reasoning, both involve R<sup>3</sup> as a *real* Hilbert space, whose elements x = (*x*,*y*,*z*) have standard inner product

$$
\langle \mathbf{x}, \mathbf{x}' \rangle = xx' + yy' + zz', \tag{2.124}
$$

with the ensuing (Pythagorean) norm and (Euclidean) notion of orthogonality.

Proposition 2.30. *If Theorem 2.28 holds for the real Hilbert space* R3*, then it holds for any complex finite-dimensional Hilbert space of dimension* > 2*.*

Proposition 2.31. *Theorem 2.28 holds for the real Hilbert space* R3*.*

Proposition 2.30 is a conjunction of two lemmas.

Lemma 2.32. *If* (2.123) *holds for* R3*, where* ρ *is some symmetric operator, then* (2.123) *holds for* C3*, where* ρ *is a self-adjoint operator.*

Neither positivity nor normalization of ρ plays a role in the argument; once we have (2.123) in this more general sense, the conclusion that ρ be a density operator trivially follows from the definition of *p*. This also applies to the second sublemma.

Lemma 2.33. *If* (2.123) *holds for* C<sup>3</sup>*, then it holds for any complex finite-dimensional Hilbert space of dimension* > 2*.*

It will be convenient to extend *p* : *H*<sub>1</sub> → [0,1] to a function *Q* : *H* → R by

$$Q(0) = 0;\tag{2.125}$$

$$Q(\psi) = \|\psi\|^2 p\left(\frac{\psi}{\|\psi\|}\right) \ (\psi \neq 0),\tag{2.126}$$

so that (2.123) is evidently equivalent to the analogous expression

$$Q(\psi) = \langle \psi, \rho\psi \rangle \ (\psi \in H). \tag{2.127}$$

Given (2.127), the minimax principle for real symmetric matrices implies that *Q* is maximized on *H*<sup>1</sup> by ψ ∈ *H*<sup>1</sup> iff ρψ = λψ, where λ is the largest eigenvalue of ρ.

*Proof of Lemma 2.32.* Suppose *p* : C<sup>3</sup><sub>1</sub> → [0,1] is a probability distribution (in the sense of Definition 2.23). The first step shows that *p* assumes a maximum on the unit sphere C<sup>3</sup><sub>1</sub> (note that C<sup>3</sup><sub>1</sub> is compact, but we do not know yet if *p* is continuous!). Since 0 ≤ *p*(υ) ≤ 1 for υ ∈ C<sup>3</sup><sub>1</sub>, *M* = sup{*p*(υ), υ ∈ C<sup>3</sup><sub>1</sub>} exists, and there is a sequence (υ<sub>*n*</sub>) in C<sup>3</sup><sub>1</sub> for which *p*(υ<sub>*n*</sub>) → *M*. Since C<sup>3</sup><sub>1</sub> is compact, this sequence has a convergent subsequence, with limit υ<sub>∞</sub> ∈ C<sup>3</sup><sub>1</sub>. Furthermore, we may assume that ⟨υ<sub>*n*</sub>, υ<sub>∞</sub>⟩ ∈ R, for if not, we change to υ′<sub>*n*</sub> = *z<sub>n</sub>*υ<sub>*n*</sub> with *z<sub>n</sub>* = ⟨υ<sub>∞</sub>, υ<sub>*n*</sub>⟩/|⟨υ<sub>*n*</sub>, υ<sub>∞</sub>⟩|.

For each fixed *n* (with υ<sub>*n*</sub> in the convergent subsequence in question), the real linear span of υ<sub>∞</sub> and υ<sub>*n*</sub> is isomorphic to R<sup>2</sup> as a Hilbert space (with standard inner product), embedded in any R<sup>3</sup> ⊂ C<sup>3</sup> one likes (where, once again, R<sup>3</sup> is seen as a real Hilbert subspace in the sense that all inner products of vectors in R<sup>3</sup> are real). By assumption, (2.123) holds on R<sup>3</sup> and hence also on R<sup>2</sup> ⊂ R<sup>3</sup>, so that, in particular,

$$\begin{split} |p(\upsilon\_{\infty}) - p(\upsilon\_{n})| &= |\langle \upsilon\_{\infty}, \rho\upsilon\_{\infty} \rangle - \langle \upsilon\_{n}, \rho\upsilon\_{n} \rangle| = |\langle (\upsilon\_{\infty} - \upsilon\_{n}), \rho(\upsilon\_{\infty} + \upsilon\_{n}) \rangle| \\ &\leq ||\rho|| \, ||\upsilon\_{\infty} + \upsilon\_{n}|| \, ||\upsilon\_{\infty} - \upsilon\_{n}|| \leq 2||\rho|| \, ||\upsilon\_{\infty} - \upsilon\_{n}||, \end{split}$$

since ‖υ<sub>∞</sub> + υ<sub>*n*</sub>‖ ≤ ‖υ<sub>∞</sub>‖ + ‖υ<sub>*n*</sub>‖ and ‖υ<sub>∞</sub>‖ = ‖υ<sub>*n*</sub>‖ = 1. Consequently,

$$|p(\upsilon\_{\infty}) - M| \le |p(\upsilon\_{\infty}) - p(\upsilon\_{n})| + |p(\upsilon\_{n}) - M| \le 2||\rho|| \, ||\upsilon\_{\infty} - \upsilon\_{n}|| + |p(\upsilon\_{n}) - M|,$$

so letting *n* → ∞ makes both terms on the right-hand side vanish. Hence *p*(υ∞) = *M*.

For reasons to become clear soon, we relabel υ<sub>∞</sub> ≡ υ<sub>1</sub>. Take any υ<sub>0</sub> ∈ C<sup>3</sup><sub>1</sub> with ⟨υ<sub>0</sub>, υ<sub>1</sub>⟩ = 0 and consider the *real* Hilbert space R<sup>2</sup> ⊂ C<sup>3</sup> spanned by υ<sub>1</sub> and υ<sub>0</sub>. By assumption, (2.127) holds, and by the minimax principle, ρυ<sub>1</sub> = λ<sub>1</sub>υ<sub>1</sub> = *p*(υ<sub>1</sub>)υ<sub>1</sub>, with *p*(υ<sub>1</sub>) = *M*. Hence for any υ = *t*<sub>0</sub>υ<sub>0</sub> + *t*<sub>1</sub>υ<sub>1</sub>, with *t*<sub>0</sub>, *t*<sub>1</sub> ∈ R, we have

$$\mathcal{Q}(\upsilon) = \langle t\_0 \upsilon\_0 + t\_1 \upsilon\_1, \rho(t\_0 \upsilon\_0 + t\_1 \upsilon\_1) \rangle = |t\_0|^2 p(\upsilon\_0) + |t\_1|^2 p(\upsilon\_1). \tag{2.128}$$

We claim that this also holds for *complex* coefficients *t*<sub>0</sub>, *t*<sub>1</sub> ∈ C. Indeed, by (2.126),

$$\mathcal{Q}(t\_0\upsilon\_0 + t\_1\upsilon\_1) = |t\_1|^2 \mathcal{Q}\left(\frac{|t\_0|}{|t\_1|} \frac{|t\_1|}{|t\_0|} \frac{t\_0}{t\_1} \upsilon\_0 + \upsilon\_1\right) = |t\_0|^2 p(\upsilon\_0) + |t\_1|^2 p(\upsilon\_1), \tag{2.129}$$

where we used (2.128) with υ′<sub>0</sub> = ((*t*<sub>0</sub>/*t*<sub>1</sub>)/|*t*<sub>0</sub>/*t*<sub>1</sub>|)υ<sub>0</sub> instead of υ<sub>0</sub>; this is still a vector orthogonal to υ<sub>1</sub>, and we also used *Q*(υ′<sub>0</sub>) = *p*(υ′<sub>0</sub>) = *p*(υ<sub>0</sub>).

We now repeat this analysis on the part (C<sup>3</sup><sub>1</sub>)<sub>⊥υ<sub>1</sub></sub> of C<sup>3</sup><sub>1</sub> that consists of all unit vectors orthogonal to υ<sub>1</sub>, which remains compact. Thus *p* assumes a maximum at some unit vector υ<sub>2</sub> ∈ (C<sup>3</sup><sub>1</sub>)<sub>⊥υ<sub>1</sub></sub>, and we may complete the pair (υ<sub>1</sub>, υ<sub>2</sub>) to a basis (υ<sub>1</sub>, υ<sub>2</sub>, υ<sub>3</sub>) of C<sup>3</sup>. With υ<sub>0</sub> = *t*<sub>2</sub>υ<sub>2</sub> + *t*<sub>3</sub>υ<sub>3</sub>, the above argument (on (C<sup>3</sup><sub>1</sub>)<sub>⊥υ<sub>1</sub></sub>) gives

$$p(\upsilon\_0) = \mathcal{Q}(\upsilon\_0) = |t\_2|^2 p(\upsilon\_2) + |t\_3|^2 p(\upsilon\_3). \tag{2.130}$$

Combined with (2.129) at *t*<sub>0</sub> = 1, this gives, for any coefficients *t*<sub>1</sub>, *t*<sub>2</sub>, *t*<sub>3</sub> ∈ C,

$$\mathcal{Q}(t\_1\upsilon\_1 + t\_2\upsilon\_2 + t\_3\upsilon\_3) = |t\_1|^2 p(\upsilon\_1) + |t\_2|^2 p(\upsilon\_2) + |t\_3|^2 p(\upsilon\_3). \tag{2.131}$$

Hence (2.127) holds on all of C<sup>3</sup>, with

$$\rho = p(\upsilon\_1)|\upsilon\_1\rangle\langle\upsilon\_1| + p(\upsilon\_2)|\upsilon\_2\rangle\langle\upsilon\_2| + p(\upsilon\_3)|\upsilon\_3\rangle\langle\upsilon\_3|.$$

*Proof of Lemma 2.33.* Let *H* be a complex finite-dimensional Hilbert space of dimension ≥ 3, equipped with a probability distribution *p*, and define *Q* : *H* → R by (2.125) - (2.126). We need to prove (2.127) for some self-adjoint operator ρ. By Propositions A.4 and A.23, this is equivalent to *Q* being a quadratic form. Since (A.8) evidently holds, we just need to prove (A.9). Take any three-dimensional Hilbert space *L*<sub>3</sub> ⊂ *H* containing *v* and *w*. By assumption, there exists a self-adjoint operator ρ<sub>*L*<sub>3</sub></sub> on *L*<sub>3</sub> for which (2.127) is valid for all ψ ∈ *L*<sub>3</sub>. Taking ψ = *v*, ψ = *w*, ψ = *v* + *w*, and ψ = *v* − *w* then validates (A.9). This completes the first proof.

This lemma may also be proved without invoking Proposition A.4, as follows.

If *v* and *w* are linearly independent, they are contained in a unique two-dimensional subspace *L*<sub>2</sub> ⊂ *H*, which in turn is contained in a (non-unique) three-dimensional subspace *L*<sub>3</sub> ⊂ *H*. Take ρ<sub>*L*<sub>3</sub></sub> as above and define a bilinear form *B* on *L*<sub>2</sub> by *B*(*v*, *w*) = ⟨*v*, ρ<sub>*L*<sub>3</sub></sub>*w*⟩. Defining the associated quadratic form *Q* by (A.7), we see that (2.125) - (2.126) hold, from which we also conclude that *B* is independent of the choice of *L*<sub>3</sub> ⊃ *L*<sub>2</sub>. If *v* and *w* are linearly dependent, a similar argument shows that *B* is independent of the choice of the subspace *L*<sub>2</sub> containing *v* and *w*. Hence *B* : *H* × *H* → C is well defined, and to conclude that it is a self-adjoint form we need to check that *B*(*v*, λ*w* + *x*) = λ*B*(*v*, *w*) + *B*(*v*, *x*) for all *v*, *w*, *x* ∈ *V*, λ ∈ C, cf. Definition A.1. If *v*, *w*, and *x* are linearly independent, this can be done by passing to the unique three-dimensional subspace *L*′<sub>3</sub> ⊂ *H* containing these vectors. If they are not, we are already done by the previous step. Finally, given that *B* is a bilinear form, a self-adjoint operator ρ may be reconstructed from Proposition A.23, upon which (2.127) holds by construction. □

Proposition 2.31 again follows from two lemmas by *modus ponens*.

Lemma 2.34. *Any probability distribution on* R<sup>3</sup> *(cf. Definition 2.23) is continuous.*

Lemma 2.35. *Any continuous probability distribution in* R<sup>3</sup> *satisfies* (2.127)*, for some self-adjoint operator* ρ*.*

The operator ρ obtained by Lemma 2.35 is necessarily positive and automatically has unit trace. Another way to phrase this is to take the complex linear span of all probability distributions on the unit sphere R<sup>3</sup><sub>1</sub> = *S*<sup>2</sup> in R<sup>3</sup>; this yields a vector space F(*S*<sup>2</sup>), whose elements are called *frame functions*. These are *bounded* functions

$$f: S^2 \to \mathbb{C},$$

with the property that for any basis (u<sub>1</sub>, u<sub>2</sub>, u<sub>3</sub>) of R<sup>3</sup> one has

$$f(\mathbf{u}\_1) + f(\mathbf{u}\_2) + f(\mathbf{u}\_3) = w(f),\tag{2.132}$$

where *w*(*f*) ∈ C does not depend on the basis and is called the *weight* of the frame function *f*. For a probability distribution *p* we obviously have *w*(*p*) = 1. The natural norm on F(*S*<sup>2</sup>) is the supremum-norm inherited from *C*(*S*<sup>2</sup>), and like the latter, F(*S*<sup>2</sup>) is closed in this norm (and hence is a Banach space in its own right, a fact that will play an important technical role in Lemma 2.40 below).

As for probability distributions, (2.132) implies a lemma that will often be used:

Lemma 2.36. *If* (u<sub>1</sub>, u<sub>2</sub>) *is a basis of some two-dimensional linear subspace of* R<sup>3</sup>*, then f*(u<sub>1</sub>) + *f*(u<sub>2</sub>) *is independent of the choice of this pair. Hence if C is some great circle in S*<sup>2</sup> *and* u<sub>1</sub> ⊥ u<sub>2</sub> *for* u<sub>1</sub>, u<sub>2</sub> ∈ *C, then f*(u<sub>1</sub>) + *f*(u<sub>2</sub>) *only depends on C.*

Furthermore, by similar arguments any frame function is even, i.e., *f*(−u) = *f*(u).
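As an illustration of (2.132), one may check numerically that a function of the form *f*(u) = ⟨u, ρu⟩ has the same sum over every orthonormal basis, namely the trace of ρ, and that it is even. This is only a sketch of the easy direction (quadratic forms are frame functions), not of the lemmas that follow; the matrix `RHO` and all helper names are hypothetical, and "basis" is read as orthonormal basis.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def normalize(u):
    n = math.sqrt(dot(u, u))
    return [a / n for a in u]

def random_orthonormal_basis(rng):
    """Gram-Schmidt on two random vectors, completed by a cross product."""
    u1 = normalize([rng.gauss(0, 1) for _ in range(3)])
    v = [rng.gauss(0, 1) for _ in range(3)]
    c = dot(u1, v)
    u2 = normalize([b - c * a for a, b in zip(u1, v)])
    return u1, u2, cross(u1, u2)

# Hypothetical real symmetric matrix (trace 1.5, so not a density operator).
RHO = [[0.9, 0.2, -0.1],
       [0.2, 0.4, 0.3],
       [-0.1, 0.3, 0.2]]

def f(u):
    """Candidate frame function f(u) = <u, rho u>."""
    return sum(u[i] * RHO[i][j] * u[j] for i in range(3) for j in range(3))

rng = random.Random(0)
weights = []
for _ in range(5):
    u1, u2, u3 = random_orthonormal_basis(rng)
    weights.append(f(u1) + f(u2) + f(u3))   # left-hand side of (2.132)
trace = RHO[0][0] + RHO[1][1] + RHO[2][2]   # the weight w(f)
```

Every entry of `weights` agrees with `trace` up to rounding, so the weight of this *f* is Tr(ρ).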

The proof of Lemma 2.34 will actually show that every frame function on *S*<sup>2</sup> is continuous, whilst the proof of Lemma 2.35 will establish the property that any continuous frame function on *S*<sup>2</sup> satisfies (2.127), for some self-adjoint operator ρ.

*Proof of Lemma 2.34.* Let *f* : *S*<sup>2</sup> → R be a frame function (the complex-valued case follows by decomposing *f* into a real and an imaginary part). Since constants are frame functions, adding a constant to *f* if necessary we may assume

$$\inf \{ f(\mathbf{x}), \mathbf{x} \in S^2 \} = 0. \tag{2.133}$$

Hence for given ε > 0 there exists p ∈ *S*<sup>2</sup> with

$$f(\mathbf{p}) < \varepsilon/2. \tag{2.134}$$

Performing a rotation if necessary, we may assume that p = (0,0,1) is the north pole. It is useful to introduce another frame function *g* : *S*<sup>2</sup> → R<sup>+</sup> by

$$g(\mathbf{x}) = f(\mathbf{x}) + f(R\_{z}(\pi/2)\mathbf{x}),\tag{2.135}$$

where *R<sub>z</sub>*(π/2) is the (counter-clockwise) rotation around the *z*-axis by an angle π/2. It is easy to see that *g* is constant on the equator *E*: for x ∈ *E*, consider the basis (x, *R<sub>z</sub>*(π/2)x, p) of R<sup>3</sup>, so that *g*(x) = *w*(*f*) − *f*(p) is independent of x.
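The constancy of *g* on the equator can be seen concretely for a frame function of quadratic type. The sketch below assumes the hypothetical choice *f*(u) = ⟨u, ρu⟩ with an arbitrary real symmetric `RHO`, whose weight is Tr(ρ); it does not normalize *f* as in (2.133), since that plays no role in this particular identity.

```python
import math

# Hypothetical real symmetric matrix; f(u) = <u, rho u> has weight Tr(rho).
RHO = [[0.9, 0.2, -0.1],
       [0.2, 0.4, 0.3],
       [-0.1, 0.3, 0.2]]

def f(u):
    return sum(u[i] * RHO[i][j] * u[j] for i in range(3) for j in range(3))

def rot_z_quarter(u):
    """Counter-clockwise rotation by pi/2 about the z-axis: (x, y, z) -> (-y, x, z)."""
    x, y, z = u
    return (-y, x, z)

def g(u):
    """The auxiliary function (2.135)."""
    return f(u) + f(rot_z_quarter(u))

NORTH_POLE = (0.0, 0.0, 1.0)
weight = RHO[0][0] + RHO[1][1] + RHO[2][2]

# Sample points on the equator E (z = 0) at a few arbitrary longitudes.
equator_values = [g((math.cos(phi), math.sin(phi), 0.0))
                  for phi in (0.0, 0.3, 1.1, 2.5, 4.0)]
expected = weight - f(NORTH_POLE)   # the constant w(f) - f(p)
```

All entries of `equator_values` coincide with `expected`, because (x, *R<sub>z</sub>*(π/2)x, p) is an orthonormal basis for each equatorial x.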

Furthermore, for any *U* ⊂ *S*<sup>2</sup> consider the *oscillation* of *f* at *U*, defined by

$$\text{Osc}\_U(f) = \sup\_U(f) - \inf\_U(f) \equiv \sup\{f(\mathbf{u}), \mathbf{u} \in U\} - \inf\{f(\mathbf{u}), \mathbf{u} \in U\}. \tag{2.136}$$

If, for given x ∈ *S*<sup>2</sup>, for any ε > 0 there is a neighbourhood *U* ⊂ *S*<sup>2</sup> of x on which Osc<sub>*U*</sub>(*f*) < ε, then |*f*(x) − *f*(u)| < ε for all u ∈ *U*, so that *f* is continuous at x.

The lengthier steps in the proof of Lemma 2.34 are now as follows:

Lemma 2.37. *Given that g*(p) < ε*, there is an open set U* ⊂ *S*<sup>2</sup> *on which*

$$\text{Osc}\_U(g) < 3\varepsilon.$$

Lemma 2.38. *For any non-negative frame function h, if* Osc<sub>*U*</sub>(*h*) ≤ ε′ *for some open U, then each point* x ∈ *S*<sup>2</sup> *has a neighborhood V where*

$$\text{Osc}\_V(h) \le 4\varepsilon'.$$

Assuming these lemmas (to be proved below), continuity of *f* easily follows:

1. Lemmas 2.37 and 2.38 applied to *h* = *g* and x = p yield Osc<sub>*V*</sub>(*g*) < 12ε for some neighbourhood *V* of p. Now *g*(p) < ε, hence inf{*g*(v), v ∈ *V*} < ε, hence

$$
\sup\_V(f) \le \sup\_V(g) \le \text{Osc}\_V(g) + \inf\_V(g) < 13\varepsilon.
$$


For p ≠ u ∈ *N*, i.e., the open northern hemisphere, let *C*<sub>u</sub> be the unique great circle through u with one (and hence both) of the following equivalent properties:


We write *D*<sub>u</sub> = *C*<sub>u</sub> ∩ *N*, and for each z ∈ *N*, we introduce the set

$$DD\_{\mathbf{z}} = \{ \mathbf{x} \in N \mid \exists \mathbf{y} \in D\_{\mathbf{x}}, \mathbf{z} \in D\_{\mathbf{y}} \}. \tag{2.137}$$

Geometrically, *DD*<sub>z</sub> consists of the points x on the northern hemisphere from which z can be reached by "double descent", where we say that y ∈ *N* may be reached from some point x at higher latitude by (single) descent if y ∈ *C*<sub>x</sub>. The proof of our lemmas relies on the following two facts from spherical geometry (stated without proof, as they have nothing to do with frame functions, though the second is easy).

Lemma 2.39. *1. The set DD*<sub>z</sub> *in* (2.137) *has open interior.*

*2. For any* x ∈ *S*<sup>2</sup> *there exists* y ∈ *E such that* x *lies on the equator E*<sub>y</sub> *relative to* y *regarded as the north pole (so in this terminology, E* = *E*<sub>p</sub>*).*

*Proof of Lemma 2.37.* By definition of the infimum, for each ε > 0 there exists z ∈ *N* such that

$$\inf\_{N} g \le g(\mathbf{z}) \le \inf\_{N} g + \varepsilon.\tag{2.138}$$

The open *U* in question will be the interior of *DD*<sub>z</sub>. The crucial inequality is

$$g(\mathbf{x}) < g(\mathbf{z}) + 2\varepsilon \ (\mathbf{x} \in DD\_{\mathbf{z}}),\tag{2.139}$$

which together with (2.138) yields inf<sub>*N*</sub> *g* ≤ *g*(x) ≤ inf<sub>*N*</sub> *g* + 3ε for each x ∈ *DD*<sub>z</sub>, whence Osc<sub>*U*</sub>(*g*) ≤ 3ε. So we need to prove (2.139), given the assumption *g*(p) < ε, which is immediate from (2.134) and (2.135).

To prove (2.139), take r ∈ *N* and s ∈ *C*<sub>r</sub> ∩ *E*, so r ⊥ s and hence

$$g(\mathbf{r}) + g(\mathbf{s}) \le w(g).\tag{2.140}$$

Furthermore, take t,u ∈ *E*, t ⊥ u, so that (t,u,p) is a basis and, *g* being a frame function, we have

$$
g(\mathbf{t}) + g(\mathbf{u}) + g(\mathbf{p}) = w(g).\tag{2.141}
$$

But by construction *g* is constant on the equator *E*, so *g*(t) = *g*(u) = *k*, hence 2*k* + *g*(p) = *w*(*g*), and (2.140) yields

$$g(\mathbf{r}) \le w(g) - g(\mathbf{s}) = 2k + g(\mathbf{p}) - g(\mathbf{s}) = k + g(\mathbf{p}),$$

from which

$$k - g(\mathbf{r}) \ge -g(\mathbf{p}).\tag{2.142}$$

Furthermore, for q ∈ *N*, x, r ∈ *D*<sub>q</sub>, x ⊥ r, there exists q′ ∈ *D*<sub>q</sub> ∩ *E* such that

$$
g(\mathbf{x}) + g(\mathbf{r}) = g(\mathbf{q}) + g(\mathbf{q}') = g(\mathbf{q}) + k,
$$

from which, using (2.142), we obtain

$$g(\mathbf{x}) = g(\mathbf{q}) + k - g(\mathbf{r}) \ge g(\mathbf{q}) - g(\mathbf{p}),$$

and hence

$$g(\mathbf{q}) \le g(\mathbf{x}) + g(\mathbf{p}), \quad \mathbf{q} \in N, \ \mathbf{x} \in D\_{\mathbf{q}}.\tag{2.143}$$

Applying this twice to the double descent in definition (2.137), we find

$$g(\mathbf{x}) \le g(\mathbf{y}) + g(\mathbf{p}) \le g(\mathbf{z}) + 2g(\mathbf{p}), \mathbf{y} \in D\_{\mathbf{x}}, \mathbf{z} \in D\_{\mathbf{y}}.\tag{2.144}$$

Since (2.134) and (2.135) imply *g*(p) < ε, this yields (2.139). □

*Proof of Lemma 2.38.* We may assume p ∈ *U* ≡ *U*<sub>p</sub>. Using Lemma 2.39.2, by the argument to come we then move *U*<sub>p</sub> to a neighborhood of y called *U*<sub>y</sub>, and subsequently repeat the argument so as to move *U*<sub>y</sub> to *U*<sub>x</sub> ≡ *V* as specified in the lemma.

We use spherical coordinates (φ,θ) for x = (*x*, *y*, *z*) ∈ *S*<sup>2</sup>, given by

$$x = \cos \phi \sin \theta, \quad y = \sin \phi \sin \theta, \quad z = \cos \theta, \qquad \phi \in [0, 2\pi), \ \theta \in [0, \pi]. \tag{2.145}$$

Hence the north pole p = (0,0,1) has θ = 0 and φ undefined (note that (φ,θ) are essentially (longitude, latitude), except that the latter usually starts counting downwards from ½π to −½π, with the north pole having latitude ½π). Since *U* is open, there exists δ > 0 such that all points with 0 ≤ θ < δ belong to *U*. Pick y ∈ *E* as above, and define r as the point with the same φ as y but θ<sub>r</sub> = θ<sub>y</sub> + ½δ (so that r lies a little south of y). Then inspection of *S*<sup>2</sup> shows that one can find a neighborhood *U*<sub>y</sub> of y with the following property: for any u ∈ *U*<sub>y</sub> there exists a great circle *C* through r and u that contains two further points r′ ∈ *U*<sub>p</sub> and u′ ∈ *U*<sub>p</sub> such that r ⊥ r′ and u ⊥ u′. Hence *h*(r) + *h*(r′) = *h*(u) + *h*(u′). Doing this for two different points u = u<sub>1</sub> and u = u<sub>2</sub> gives

$$\begin{aligned} h(\mathbf{r}) + h(\mathbf{r}\_1') &= h(\mathbf{u}\_1) + h(\mathbf{u}\_1'); \\ h(\mathbf{r}) + h(\mathbf{r}\_2') &= h(\mathbf{u}\_2) + h(\mathbf{u}\_2'). \end{aligned}$$

Hence *h*(u<sub>1</sub>) − *h*(u<sub>2</sub>) = *h*(r′<sub>1</sub>) − *h*(r′<sub>2</sub>) − (*h*(u′<sub>1</sub>) − *h*(u′<sub>2</sub>)), from which we obtain

$$|h(\mathbf{u}\_1) - h(\mathbf{u}\_2)| \le |h(\mathbf{r}\_1') - h(\mathbf{r}\_2')| + |(h(\mathbf{u}\_1') - h(\mathbf{u}\_2'))| \le \text{Osc}\_U(h) + \text{Osc}\_U(h) \le 2\varepsilon',$$

for by assumption, Osc<sub>*U*</sub>(*h*) ≤ ε′. Since u<sub>1</sub> and u<sub>2</sub> in *U*<sub>y</sub> were arbitrary, this gives

$$\text{Osc}\_{U\_{\mathbf{y}}}(h) \le 2\varepsilon'.\tag{2.146}$$

Repeating this with y as the north pole gives Osc<sub>*U*<sub>x</sub></sub>(*h*) ≤ 4ε′, i.e., the lemma. □

To prove Lemma 2.35, following Gleason himself we consider the natural action of the rotation group *SO*(3) (with positive determinant) on R<sup>3</sup>, written *R* : x ↦ *R*x. This action maps *S*<sup>2</sup> onto itself and hence induces an action *U* on *C*(*S*<sup>2</sup>) by pullback:

$$U(R)f(\mathbf{u}) = f(R^{-1}\mathbf{u}).\tag{2.147}$$

By Lemma 2.34 we have inclusions

$$\mathcal{F}(S^2) \subset C\_e(S^2) \subset C(S^2),\tag{2.148}$$

where F(*S*<sup>2</sup>) are the frame functions and *C<sub>e</sub>*(*S*<sup>2</sup>) consists of the even functions in *C*(*S*<sup>2</sup>); both spaces are obviously stable under the action (2.147). The following facts, due to Weyl, which we state without proof, follow from elementary representation theory, but they are also quite easily verified by explicit computation. Let

$$
\psi\_\ell(x, y, z) = (x + iy)^\ell, \ \ell \in \mathbb{N}, \tag{2.149}
$$

and restrict this function to *S*<sup>2</sup>, still calling it ψ<sub>ℓ</sub>. Let *H*<sub>ℓ</sub> ⊂ *C*(*S*<sup>2</sup>) be the vector space spanned by all transforms *U*(*R*)ψ<sub>ℓ</sub>, *R* ∈ *SO*(3). This vector space:


Indeed, all (necessarily finite-dimensional) irreducible representations of *SO*(3) arise in this way. Now F(*S*<sup>2</sup>) is closed under the *SO*(3)-action (2.147), hence so must be F(*S*<sup>2</sup>) ∩ *H*<sub>ℓ</sub>. Since *H*<sub>ℓ</sub> is irreducible, there are merely two possibilities:

$$H\_{\ell} \subset \mathcal{F}(\mathbb{S}^{2});\tag{2.150}$$

$$H\_{\ell} \cap \mathcal{F}(\mathbb{S}^2) = \{0\}. \tag{2.151}$$

Since for even/odd values of ℓ the space *H*<sub>ℓ</sub> consists of even/odd functions, and F(*S*<sup>2</sup>) only has even elements, we immediately see that (2.151) applies if ℓ is *odd*. For *even* values of ℓ, we see at once that (2.150) holds for:


The latter functions are induced by operators ρ with zero trace. To see this, diagonalize ρ in C<sup>3</sup> as in (2.6), without the constraints on *p<sub>i</sub>*. This yields

$$f(\mathbf{x}) = \langle \mathbf{x}, \rho \mathbf{x} \rangle = \sum\_{i=1}^{3} p\_i |\langle \mathbf{x}, \upsilon\_i \rangle|^2. \tag{2.152}$$

For *f* ∈ *H*<sub>2</sub>, since *H*<sub>2</sub> ⊥ *H*<sub>0</sub> in *L*<sup>2</sup>(*S*<sup>2</sup>) we must have

$$\langle 1\_{S^2}, f \rangle\_{L^2(S^2)} = \int\_{S^2} d^2 \mathbf{x} \, f(\mathbf{x}) = 0. \tag{2.153}$$

For any υ ∈ C<sup>3</sup>, we have

$$\int\_{S^2} d^2 \mathbf{x} \, |\langle \mathbf{x}, \upsilon \rangle|^2 = \frac{4\pi}{3} ||\upsilon||^2;\tag{2.154}$$

to see this, write |⟨x, υ⟩|<sup>2</sup> = |υ<sub>*x*</sub>|<sup>2</sup>*x*<sup>2</sup> + |υ<sub>*y*</sub>|<sup>2</sup>*y*<sup>2</sup> + |υ<sub>*z*</sub>|<sup>2</sup>*z*<sup>2</sup> plus cross terms proportional to *xy*, *xz*, and *yz*, which integrate to zero over *S*<sup>2</sup> by symmetry, and use the surface element *d*<sup>2</sup>x = *d*φ *d*θ sin θ associated to the spherical coordinates (2.145) to compute

$$\int\_{S^2} d^2 \mathbf{x} \, x^2 = \int\_{S^2} d^2 \mathbf{x} \, y^2 = \int\_{S^2} d^2 \mathbf{x} \, z^2 = \frac{4\pi}{3}. \tag{2.155}$$

Therefore, from (2.152), noting that ‖υ<sub>*i*</sub>‖<sup>2</sup> = 1 for each *i* = 1, 2, 3, we obtain

$$\int\_{S^2} d^2 \mathbf{x} \, f(\mathbf{x}) = \frac{4\pi}{3} \sum\_{i=1}^3 p\_i = \frac{4\pi}{3} \text{Tr}\,(\rho).\tag{2.156}$$
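The integrals (2.155) and (2.156) are easy to confirm numerically. The following sketch uses a plain midpoint rule in the spherical coordinates (2.145); the grid sizes, the helper `sphere_integral`, and the symmetric matrix `RHO` are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def sphere_integral(func, n_theta=200, n_phi=200):
    """Midpoint-rule approximation of the integral of func over S^2,
    using the surface element sin(theta) dtheta dphi of (2.145)."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        s, c = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            x = (math.cos(phi) * s, math.sin(phi) * s, c)
            total += func(x) * s
    return total * d_theta * d_phi

# Hypothetical real symmetric matrix with trace 1.5.
RHO = [[0.9, 0.2, -0.1],
       [0.2, 0.4, 0.3],
       [-0.1, 0.3, 0.2]]
TRACE = sum(RHO[i][i] for i in range(3))

def f(x):
    """f(x) = <x, rho x>, as in (2.152)."""
    return sum(x[i] * RHO[i][j] * x[j] for i in range(3) for j in range(3))

integral_x2 = sphere_integral(lambda x: x[0] ** 2)   # compare with (2.155)
integral_f = sphere_integral(f)                      # compare with (2.156)
expected = 4 * math.pi / 3
```

Up to discretization error, `integral_x2` matches `expected` and `integral_f` matches `expected * TRACE`; in particular the integral vanishes exactly when Tr(ρ) = 0, in agreement with (2.153).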

To settle the case ℓ ≥ 4, all we need to know about the spherical harmonics is that if ℓ is even, then, once again using spherical coordinates, one has

$$Y\_{\ell}^{m}(x, y, z=0) \sim e^{im\phi} \ (m \text{ even});\tag{2.157}$$

$$Y\_{\ell}^{m}(x, y, z=0) = 0 \ (m \text{ odd}).\tag{2.158}$$

If (2.150) holds, then *Y*<sub>ℓ</sub><sup>*m*</sup> ∈ F(*S*<sup>2</sup>) for each *m* = −ℓ, −ℓ+1, ..., ℓ−1, ℓ. But for any (even) ℓ ≥ 4, there are values of *m* for which *Y*<sub>ℓ</sub><sup>*m*</sup> cannot be a frame function. To see this, take the following family of bases of R<sup>3</sup>, indexed by φ:

$$u\_1 = (\cos \phi, \sin \phi, 0);\tag{2.159}$$

$$u\_2 = (-\sin\phi, \cos\phi, 0);\tag{2.160}$$

$$
u\_3 = (0,0,1). \tag{2.161}
$$

For any frame function *f*, the value of *f*(*u*<sub>1</sub>) + *f*(*u*<sub>2</sub>) = *w*(*f*) − *f*(*u*<sub>3</sub>) must therefore be independent of φ. However, from (2.157) - (2.158), we find

$$Y\_{\ell}^{m}(u\_1) + Y\_{\ell}^{m}(u\_2) \sim e^{im\phi} + e^{im(\phi + \pi/2)} = e^{im\phi}(1 + i^m),$$

which is independent of φ iff *m* = 0 or *m* = 2 (mod 4). For ℓ = 0, 2 these are indeed the only values that occur, but as soon as ℓ ≥ 4, the value *m* = 4 (among others) will ruin it. So (2.150) holds only for ℓ = 0 and ℓ = 2, whereas (2.151) is the case for all other ℓ ∈ N. Since *H*<sub>0</sub> and *H*<sub>2</sub> occur in *C*(*S*<sup>2</sup>) with multiplicity one, they cannot have greater multiplicity in F(*S*<sup>2</sup>) ⊂ *C*(*S*<sup>2</sup>), so the above argument suggests that

$$\mathcal{F}(S^2) = H\_0 \oplus H\_2,\tag{2.162}$$

which would prove the lemma. Fortunately, this is indeed the case, but to complete the argument we need the following technical results (left out by Gleason himself):

Lemma 2.40. *1. Frame functions are uniformly continuous.*


$$||U(R\_n)(f\_m - f)||\_{\infty} = ||f\_m - f||\_{\infty},$$

we obtain the estimate

$$||U(R\_n)f\_m - U(R)f||\_{\infty} \le ||f\_m - f||\_{\infty} + ||U(R\_n)f - U(R)f||\_{\infty},$$

cf. (2.147). As *m* → ∞ the first term on the right-hand side vanishes by assumption, whilst the second vanishes as *n* → ∞ by uniform continuity of *f* .

3. This is a Banach space version of the Peter–Weyl theorem, applied to the Banach space of frame functions equipped with the supremum-norm (see Notes). □

Something like this is necessary: by the Stone–Weierstrass Theorem the polynomial functions on R<sup>3</sup>, restricted to *S*<sup>2</sup>, are uniformly dense in *C*(*S*<sup>2</sup>), so that the linear span of all spherical harmonics, and hence of all *H*<sub>ℓ</sub>, is uniformly dense in *C*(*S*<sup>2</sup>). One therefore needs to rule out the possibility that some frame functions lie in the closure of this direct sum without lying in the sum itself (in other words, that they are given by uniformly convergent infinite sums of certain *Y*<sub>ℓ</sub><sup>*m*</sup>). Lemma 2.40 clinches the proof of (2.162), since the third part implies that F(*S*<sup>2</sup>) would contain all irreducible representations that contribute to the potential infinite sums; but we have already proved that it only contains *H*<sub>0</sub> and *H*<sub>2</sub>. Thus Lemma 2.35 now also follows. □
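The criterion used in this proof to exclude ℓ ≥ 4, namely that *e*<sup>*im*φ</sup>(1 + *i*<sup>*m*</sup>) must not depend on φ, can be tabulated directly. The sketch below only tests this φ-dependence for even *m*; the sample angles, the tolerance, and the function names are arbitrary illustrative choices, and no spherical harmonics are actually evaluated.

```python
import cmath
import math

def equator_sum(m, phi):
    """e^{i m phi} + e^{i m (phi + pi/2)}: the phi-dependence of
    Y_l^m(u_1) + Y_l^m(u_2) for the bases (2.159)-(2.161), for even m."""
    return cmath.exp(1j * m * phi) + cmath.exp(1j * m * (phi + math.pi / 2))

def is_phi_independent(m, samples=(0.0, 0.4, 1.3, 2.9), tol=1e-9):
    """True if the sum takes the same value at all sample angles."""
    values = [equator_sum(m, phi) for phi in samples]
    return all(abs(v - values[0]) < tol for v in values)

# Even values of m only, since Y_l^m vanishes on the equator for odd m (l even).
allowed = [m for m in range(-8, 9, 2) if is_phi_independent(m)]
```

The list `allowed` contains exactly the even *m* with *m* = 0 or *m* = 2 (mod 4). For ℓ = 2 all even values *m* ∈ {−2, 0, 2} pass, whereas for ℓ ≥ 4 the value *m* = 4 is available and fails the test.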

#### 2.9 Effects and Busch's Theorem

Gleason's Theorem is easy to state but difficult to prove; *Busch's Theorem* is a variation of it, which is more difficult to state but much easier to prove. Logically, Busch's Theorem is weaker than Gleason's, as the assumptions of the latter are contained in those of the former, but physically it appears to be more useful, as it covers more situations. To wit, Busch's Theorem revolves around certain generalizations of projections (which took the centre stage in Gleason's Theorem) called *effects*: these are (necessarily self-adjoint) operators *a* ∈ *B*(*H*) that satisfy 0 ≤ *a* ≤ 1<sub>*H*</sub>, in the sense defined after Proposition A.22. Thus *a* ∈ *B*(*H*) is an effect iff

$$0 \le \langle \Psi, a\Psi \rangle \le 1 \ (\Psi \in H). \tag{2.163}$$

The set of effects on a Hilbert space *H* is denoted by E(*H*) or by [0,1]<sub>*B*(*H*)</sub>. By Theorem A.10, we have (2.163) iff *a*<sup>∗</sup> = *a* and the eigenvalues λ of *a* lie in the interval [0,1] (i.e., σ(*a*) ⊂ [0,1]). This implies that ‖*a*‖ ≤ 1, and conversely, if *a* ≥ 0, using the bound *a* ≤ ‖*a*‖ · 1<sub>*H*</sub> for any self-adjoint operator *a*, which easily follows from (A.47), we see that for *a* ≥ 0, the condition ‖*a*‖ ≤ 1 is equivalent to *a* ∈ E(*H*). In particular, it follows that both projections and density operators are effects.
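For a concrete feel of the spectral criterion σ(*a*) ⊂ [0,1], here is a minimal sketch for real symmetric 2 × 2 matrices, where the eigenvalues are available in closed form. The function names and the four sample matrices are hypothetical illustrations.

```python
import math

def eigenvalues_sym2(a):
    """Eigenvalues of a real symmetric 2x2 matrix [[p, q], [q, r]], in increasing order."""
    (p, q), (_, r) = a
    mean = (p + r) / 2
    radius = math.hypot((p - r) / 2, q)
    return mean - radius, mean + radius

def is_effect(a, tol=1e-12):
    """Check 0 <= a <= 1 via the spectrum, i.e. all eigenvalues lie in [0, 1]."""
    lo, hi = eigenvalues_sym2(a)
    return lo >= -tol and hi <= 1 + tol

projection = [[0.5, 0.5], [0.5, 0.5]]   # rank-one projection onto (1, 1)/sqrt(2)
density = [[0.7, 0.1], [0.1, 0.3]]      # positive with unit trace
too_large = [[1.2, 0.0], [0.0, 0.1]]    # has eigenvalue 1.2 > 1
indefinite = [[0.5, 0.6], [0.6, 0.5]]   # has eigenvalue -0.1 < 0
```

With these inputs, `is_effect` accepts `projection` and `density` and rejects `too_large` and `indefinite`, in line with the remark that projections and density operators are effects.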

Proposition 2.41. *1. The set* E (*H*) *of effects on H is a compact convex subset of B*(*H*) *in its* σ*-weak topology, with extreme boundary*

$$
\partial\_{e} \mathcal{E}(H) = \mathcal{P}(H), \tag{2.164}
$$

*i.e., the set of all projections on H (including 0).*

*2. Each a* ∈ E(*H*) *has a (typically non-unique) extremal decomposition*

$$a = \sum\_{i=0}^{m} t\_i f\_i,\tag{2.165}$$

*in which t<sub>i</sub>* ≥ 0 *and* ∑<sub>*i*</sub> *t<sub>i</sub>* = 1*, and the f<sub>i</sub> are projections.*

The σ-weak topology on *B*(*H*), defined after Corollary A.31, is the right one in this context, but if *H* is finite-dimensional, as we assume here, this technicality may be ignored, as the claim is even true with respect to the norm topology.

*Proof.* In Part 1, compactness and convexity are easily checked.

The inclusion ∂<sub>*e*</sub>E(*H*) ⊆ P(*H*) is equivalent to the claim that any *a* ∈ E(*H*), *a* ∉ P(*H*), does not lie in ∂<sub>*e*</sub>E(*H*) and hence admits a convex decomposition

$$a = ta\_1 + (1 - t)a\_2, \ t \in (0, 1), a\_1, a\_2 \in \mathcal{E}(H), a\_1 \neq a \neq a\_2,\tag{2.166}$$

or, equivalently, *a* has a nontrivial decomposition *a* = ∑<sub>*i*</sub> *t<sub>i</sub>a<sub>i</sub>*, for certain *t<sub>i</sub>* > 0 with ∑<sub>*i*</sub> *t<sub>i</sub>* = 1. Indeed, the latter follows from the spectral resolution (A.37), in which the spectral projections *e*<sub>λ</sub> should be rescaled if necessary so as to make the coefficients sum to unity (note that *te* ∈ E(*H*) for any projection *e* and any *t* ∈ [0,1]).

To show the opposite inclusion P(*H*) ⊆ ∂<sub>*e*</sub>E(*H*), again assume (2.166), where this time *a* = *e* ∈ P(*H*) is a projection. "Sandwiching" between ψ ∈ *H*<sub>1</sub>, this yields

$$
\langle \Psi, a\_1 \Psi \rangle = \langle \Psi, a\_2 \Psi \rangle = 0, \ \Psi \in (eH)^\perp; \tag{2.167}
$$

$$
\langle \Psi, a\_1 \Psi \rangle = \langle \Psi, a\_2 \Psi \rangle = 1,\ \Psi \in eH.\tag{2.168}
$$

Using 0 ≤ *a<sub>i</sub>* ≤ 1, *i* = 1, 2, and (A.37), these equations imply that *a*<sub>1</sub> = *a*<sub>2</sub> = *e*.

The claim of part 2 is satisfied by picking the *t<sub>i</sub>* and *f<sub>i</sub>* in terms of the spectral data associated to *a* (cf. Theorem A.10), as follows: with *m* = |σ(*a*)|, order the eigenvalues λ ∈ σ(*a*) according to λ<sub>1</sub> < ··· < λ<sub>*m*</sub>, and take:

$$t\_0 = 1 - \lambda\_{m};\tag{2.169}$$

$$t\_1 = \lambda\_1;\tag{2.170}$$

$$t\_{i} = \lambda\_{i} - \lambda\_{i-1} \ (i \ge 2);\tag{2.171}$$

$$f\_0 = 0;\tag{2.172}$$

$$f\_1 = 1\_H;\tag{2.173}$$

$$f\_i = \sum\_{j=i}^{m} e\_{\lambda\_j} \ (i \ge 2). \tag{2.174}$$

The validity of (2.165) is then a trivial verification. □
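The "trivial verification" can be spelled out on an example. The sketch below builds the coefficients and projections of (2.169) - (2.174) for an effect that is already diagonal, representing each projection by its 0/1 diagonal; the function name, the diagonal representation, and the sample effect are assumptions made for illustration, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def extremal_decomposition(diagonal):
    """Return (t, f) as in (2.169)-(2.174) for the effect a = diag(diagonal).

    Each f[i] is a projection, stored as its 0/1 diagonal."""
    n = len(diagonal)
    eigenvalues = sorted(set(diagonal))            # lambda_1 < ... < lambda_m
    m = len(eigenvalues)
    # The spectral projection e_lambda has a 1 wherever the diagonal entry equals lambda.
    spectral = [[1 if d == lam else 0 for d in diagonal] for lam in eigenvalues]
    t = [1 - eigenvalues[-1], eigenvalues[0]]      # t_0 and t_1
    f = [[0] * n, [1] * n]                         # f_0 = 0 and f_1 = 1_H
    for i in range(2, m + 1):
        t.append(eigenvalues[i - 1] - eigenvalues[i - 2])
        # f_i = e_{lambda_i} + ... + e_{lambda_m}
        f.append([sum(col) for col in zip(*spectral[i - 1:])])
    return t, f

# Hypothetical effect with spectrum {1/4, 1/2, 3/4}.
a = [Fraction(1, 4), Fraction(3, 4), Fraction(1, 4), Fraction(1, 2)]
t, f = extremal_decomposition(a)
reconstructed = [sum(ti * fi[k] for ti, fi in zip(t, f)) for k in range(len(a))]
```

For this example all four coefficients equal 1/4, they sum to 1, and `reconstructed` reproduces `a` exactly, which is (2.165).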

Note that, in general, the extremal decomposition of *a as an effect* differs from its spectral resolutions (A.37) or (A.38) *as a self-adjoint operator*. If *a* = ρ is a density operator, then the latter, i.e., (2.6), does provide an extremal decomposition of *a* construed as an effect also, which differs from the one in (2.165). This example shows that extremal decompositions in E(*H*) are not necessarily unique. Also, observe that *te*, for *e* ∈ P(*H*) and *t* ∈ (0,1), does not lie in ∂<sub>*e*</sub>E(*H*), since it admits a nontrivial decomposition *te* = *t* · *e* + (1 − *t*) · 0, recalling that 0 ∈ P(*H*) ⊂ E(*H*).

Busch's Theorem classifies the following objects.

Definition 2.42. *A* probability distribution *on* E(*H*) *is a function p* : E(*H*) → [0,1] *that satisfies the following two conditions:*

*1. p*(1<sub>*H*</sub>) = 1*;*

*2. If a (finite) family* (*a<sub>i</sub>*) *of effects satisfies* ∑<sub>*i*</sub> *a<sub>i</sub>* ≤ 1<sub>*H*</sub>*, then*

$$p\left(\sum\_{i} a\_{i}\right) = \sum\_{i} p(a\_{i}).\tag{2.175}$$

Lemma 2.43. *If a (finite) family* (*a<sub>i</sub>*) *of effects satisfies* ∑<sub>*i*</sub> *a<sub>i</sub>* = 1<sub>*H*</sub>*, then* ∑<sub>*i*</sub> *p*(*a<sub>i</sub>*) = 1*.*

This trivial observation implies that a probability distribution on E(*H*) induces a probability distribution on P(*H*) ⊂ E(*H*) by restriction, cf. Definition 2.23. Another way to see this from the perspective of probability measures is to note that any family (*e<sub>i</sub>*) of projections that satisfies ∑<sub>*i*</sub> *e<sub>i</sub>* ≤ 1<sub>*H*</sub> is automatically orthogonal.

Therefore, restricted to P(*H*), Definition 2.42 reduces to Definition 2.23.2. To see this, fix *j* and pick a unit vector ψ ∈ *e<sub>j</sub>H*. The condition ∑<sub>*i*</sub> *e<sub>i</sub>* ≤ 1<sub>*H*</sub> gives

$$\sum\_{i \neq j} \langle \Psi, e\_i \Psi \rangle = \sum\_{i \neq j} ||e\_i \Psi||^2 \le 0,$$

but since each term is non-negative, this implies *e<sub>i</sub>*ψ = 0 for each *i* ≠ *j*. Putting ψ = *e<sub>j</sub>*ϕ, where ϕ ∈ *H* is arbitrary, this gives *e<sub>i</sub>e<sub>j</sub>*ϕ = 0 for all ϕ and hence *e<sub>i</sub>e<sub>j</sub>* = 0.

Clearly, any state ω on *B*(*H*) induces a probability distribution *p*<sub>ω</sub> on E(*H*) by

$$p\_{\omega}(a) = \omega(a). \tag{2.176}$$

Busch's Theorem shows the converse.

Theorem 2.44. *Any probability distribution p on* E(*H*) *takes the form p* = *p*<sub>ω</sub> *for some state* ω *on B*(*H*)*, establishing a bijective correspondence between probability distributions on* E(*H*) *and states on B*(*H*)*.*

*Proof.* If *p* : E(*H*) → [0,1] can be extended to a linear map ω : *B*(*H*) → C, then ω is automatically a state, for normalization is assumed and positivity follows from the fact that any *b* ≥ 0 with *b* ≠ 0 has the form *b* = *ra* for some *r* ∈ R<sup>+</sup> and 0 ≤ *a* ≤ 1<sub>*H*</sub>, namely with *r* = ‖*b*‖ and *a* = *b*/‖*b*‖; then *a* ≥ 0 and ‖*a*‖ = 1, so that, as explained earlier, *a* is an effect. Hence ω(*b*) = ω(*ra*) = *rp*(*a*) ≥ 0. To achieve this extension:

1. We show that *p*(*ra*) = *rp*(*a*) for all *r* ∈ Q ∩ [0,1] and 0 ≤ *a* ≤ 1<sub>*H*</sub>. Indeed, for any such *a* and *n* ∈ N we write *a* = (*a* + ··· + *a*)/*n* (*n* terms), so that by (2.175), *p*(*a*) = *np*(*a*/*n*). Similarly, for any *m* ∈ N and 0 ≤ *b* ≤ 1<sub>*H*</sub>/*m*, we have *p*(*mb*) = *mp*(*b*). Take integers *m*, *n* such that (*m*/*n*) ∈ [0,1] and put *b* = *a*/*n*, so that

$$p\left(\frac{m}{n}a\right) = mp\left(\frac{a}{n}\right) = \frac{m}{n}p(a). \tag{2.177}$$

2. We next prove that *p*(*ta*) = *tp*(*a*) for all *t* ∈ [0,1] and 0 ≤ *a* ≤ 1<sub>*H*</sub>. Positivity of *p* yields *p*(*a*) ≤ *p*(*a*′) whenever 0 ≤ *a* ≤ *a*′ ≤ 1<sub>*H*</sub>, since *p*(*a*′) = *p*(*a*) + *p*(*a*′ − *a*) by (2.175). Given *t* ∈ [0,1], take an increasing sequence of rationals (*r<sub>n</sub>*) with *r<sub>n</sub>* ≤ *t*, as well as a decreasing sequence of rationals (*s<sub>n</sub>*) with *t* ≤ *s<sub>n</sub>*, such that *r<sub>n</sub>* ↑ *t* and *s<sub>n</sub>* ↓ *t* in R. With step 1, this gives

$$r\_n p(a) = p(r\_n a) \le p(ta) \le p(s\_n a) \le s\_n p(a).$$

Letting *n* → ∞, this gives *t p*(*a*) ≤ *p*(*ta*) ≤ *t p*(*a*), and hence equality.


This argument also shows that ω remains linear on general self-adjoint *a* and *b*, since *a* + *b* = (*a*<sub>+</sub> + *b*<sub>+</sub>) − (*a*<sub>−</sub> + *b*<sub>−</sub>) is a decomposition with (*a*<sub>±</sub> + *b*<sub>±</sub>) ≥ 0.

6. Finally, for general *c* ∈ *B*(*H*) we (uniquely) decompose *c* = *a* + *ib*, *a*<sup>∗</sup> = *a*, *b*<sup>∗</sup> = *b*, cf. the proof of Corollary A.20, and put ω(*c*) = ω(*a*) + *i*ω(*b*). □

To close, we give a very brief and superficial introduction to effects as they arise from modern ("operational") quantum measurement theory. This theory associates quantum data to classical data through the concept of a *Positive Operator Valued Measure* or *POVM*. Relative to some given "classical" space *X* (taken finite here) and Hilbert space *H* (assumed finite-dimensional), a POVM is defined as a map

$$\mathsf{A}: \mathcal{P}(X) \to \mathcal{E}(H) \tag{2.178}$$

that satisfies A(*X*) = 1<sub>*H*</sub> as well as A(*U* ∪ *V*) = A(*U*) + A(*V*) whenever *U* ∩ *V* = ∅, cf. Definition 1.1. Equivalently, a POVM is a map

$$\mathsf{a}: X \to \mathcal{E}(H) \tag{2.179}$$

that satisfies

$$\sum\_{x\in X} \mathsf{a}(x) = 1\_H. \tag{2.180}$$

As in the classical case, these notions are trivially equivalent through

$$\mathsf{a}(x) = \mathsf{A}(\{x\});\tag{2.181}$$

$$\mathsf{A}(U) = \sum\_{x \in U} \mathsf{a}(x).\tag{2.182}$$

The motivating special case of a POVM is given by some self-adjoint operator *a* ∈ *B*(*H*), which yields *X* = σ(*a*) and a(λ) = *e*<sub>λ</sub>. In that case, each density operator ρ induces a probability distribution on σ(*a*) through the Born rule (2.8). More generally, a probability distribution *p* on E (*H*) and a POVM (2.179) jointly determine a probability distribution *p*<sub>a</sub> on *X*, given by

$$p\_{\mathsf{a}}(x) = p(\mathsf{a}(x)).\tag{2.183}$$

Indeed, *p*<sub>a</sub>(*x*) ≥ 0 because a(*x*) ≥ 0, and ∑<sub>*x*∈*X*</sub> *p*<sub>a</sub>(*x*) = 1 by (2.180) and Lemma 2.43. The idea, then, is that a measurement of some POVM a has (classical) outcome *x* with probability *p*<sub>a</sub>(*x*); this generalizes the traditional dogma that a measurement of an observable *a* has outcome λ ∈ σ(*a*) with (Born) probability (2.8). Indeed, combined with (2.33), Busch's Theorem shows that we necessarily have

$$p\_{\mathsf{a}}(x) = \text{Tr}(\rho\, \mathsf{a}(x)),\tag{2.184}$$

for some density operator ρ. So nothing has been gained by introducing Definition 2.42, except perhaps for the insight that, as in Gleason's Theorem, it is the noncontextuality of a probability distribution on E (*H*) (in that *p*(a(*x*)) is independent of the POVM a which a(*x*) forms part of) that eventually enforces (2.184).
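As a small numerical illustration of (2.180) and (2.184), the following sketch builds a three-outcome POVM on *H* = C<sup>2</sup> and evaluates the resulting probability distribution on *X* = {0, 1, 2}. The specific choice of effects (the so-called trine POVM, whose elements are effects but not projections), the density operator ρ, and all function names are assumptions made for this example only; they do not appear in the text above.

```python
import math

def trine_effect(k):
    """Effect (2/3)|v_k><v_k| with v_k = (cos(pi k/3), sin(pi k/3)); an illustrative choice."""
    phi = math.pi * k / 3
    v = (math.cos(phi), math.sin(phi))
    return [[(2 / 3) * v[i] * v[j] for j in range(2)] for i in range(2)]

def born_probability(rho, effect):
    """Tr(rho * effect) for 2x2 matrices stored as nested lists, cf. (2.184)."""
    return sum(rho[i][k] * effect[k][i] for i in range(2) for k in range(2))

X = [0, 1, 2]                                   # finite "classical" outcome space
povm = {x: trine_effect(x) for x in X}          # the map a : X -> E(H)

# Normalization (2.180): the effects should sum to the unit operator.
total = [[sum(povm[x][i][j] for x in X) for j in range(2)] for i in range(2)]

rho = [[1.0, 0.0], [0.0, 0.0]]                  # assumed state: projection onto (1, 0)
probs = {x: born_probability(rho, povm[x]) for x in X}

print(total)   # approximately the 2x2 identity matrix
print(probs)   # approximately {0: 2/3, 1: 1/6, 2: 1/6}
```

None of the three effects is a projection, yet the numbers *p*<sub>a</sub>(*x*) are nonnegative and sum to one, as the argument following (2.183) predicts.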

#### 2.10 The quantum logic of Birkhoff and von Neumann

In §1.4 we showed that *classical* mechanics has a *classical* logical structure, in which (equivalence classes of) propositions correspond to subsets of phase space. These subsets form a Boolean lattice in which the logical connectives ¬, ∧, and ∨ for negation, conjunction, and disjunction, respectively, are interpreted as their natural set-theoretic counterparts (i.e., complementation, intersection, and union).

In 1936, Birkhoff and von Neumann proposed a strikingly similar *quantum* logic for *quantum* mechanics, in which (closed) linear subspaces of Hilbert space play the role of (measurable) subsets of phase space, and the basic logical connectives (except implication, which is queerly lacking in this setting) are interpreted as:

$$
\neg L = L^{\bot};\tag{2.185}
$$

$$L \wedge M = L \cap M;\tag{2.186}$$

$$L \vee M = L + M,\tag{2.187}$$

where *L*<sup>⊥</sup> is the orthogonal complement of *L*, see (A.29), *L* ∩ *M* is the (set-theoretic) intersection of *L* and *M*, and *L* + *M* is the (closed) linear span of *L* and *M*. If dim(*H*) < ∞, as we continue to assume, any linear subspace of *H* is automatically closed, but in the infinite-dimensional case an attractive operator-algebraic and lattice-theoretic structure arises only if the events are taken to be *closed* linear subspaces.

Although the Brouwer–Hilbert debate on the foundations of mathematics had somewhat subsided in 1936, with hindsight it may be argued that the quantum logic of Birkhoff and von Neumann (who had been a "postdoc" *avant la lettre* with Hilbert) was predicated on their desire to preserve not only the *law of contradiction*

$$
\alpha \wedge \neg \alpha = \bot,\tag{2.188}
$$

where α is any proposition and ⊥ is the proposition that is identically false, but also, against Brouwer, the *law of excluded middle* (or *tertium non datur*)

$$
\alpha \lor \neg \alpha = \top,\tag{2.189}
$$

where ⊤ is the proposition that is identically true. Indeed, in the Birkhoff–von Neumann model (2.185) - (2.187), where ⊥ = {0} and ⊤ = *H*, these are identities. Similarly, their model satisfies the *law of double negation*

$$
\neg \neg \alpha = \alpha,\tag{2.190}
$$

which both in classical logic (where it is a tautology) and in intuitionistic logic (where it is rejected in general) is equivalent to (2.189). Also, *De Morgan's Laws*:

$$\neg(\alpha \lor \beta) = \neg \alpha \land \neg \beta;\tag{2.191}$$

$$
\neg(\alpha \land \beta) = \neg \alpha \lor \neg \beta,\tag{2.192}
$$

hold in their quantum logic (despite their origin in *classical* propositional logic).

We will now derive the Birkhoff–von Neumann structure along similar lines as its classical counterpart (cf. §1.4), except that in the absence of the necessary structure for a classical propositional calculus we now rely on semantic entailment alone.

In quantum theory, the role of functions *f* : *X* → R as observables in classical physics is played by self-adjoint operators *a* : *H* → *H* on some Hilbert space *H*, and hence the quantum analogue of an elementary proposition *f* ∈ Δ of classical physics is *a* ∈ Δ (where Δ ⊂ R), with special case *a* = λ for *a* ∈ {λ} (with λ ∈ R).

In analogy to the points *x* ∈ *X* of phase space, pure states ωψ as in (2.42), or the corresponding density operators *e*<sup>ψ</sup> (where ψ ∈ *H* is a unit vector), yield truth assignments to elementary propositions. To start with the simplest case, *a* = λ is:


The underlying idea here is arguably that, according to some naive operational interpretation of quantum mechanics, a measurement of *a* in a state ωψ would give outcome λ with probability one (zero) iff *a* = λ is true (false) with respect to ωψ. If 0 < *p*<sup>ψ</sup><sub>*a*</sub>(λ) < 1, the "truthmaker" ωψ actually *fails to assign a truth value* to *a* = λ; the *partial* nature of truthmakers marks a significant difference with the classical case, as does the closely related distinction between *false* and *not true*. Similarly, we say that an elementary proposition *a* ∈ Δ is *true* in some state ωψ iff

$$P\_a^{\Psi}(\Delta) \equiv \left\| e\_{\Delta} \Psi \right\|^2 = 1,\tag{2.193}$$

cf. (2.9) and (A.42), and *false* if *P*<sup>ψ</sup><sub>*a*</sub>(Δ) = 0. In other words, *a* ∈ Δ is true in ωψ iff ψ ∈ *H*<sub>Δ</sub>, and false if ψ ⊥ *H*<sub>Δ</sub>, see (A.43). Such propositions may formally be combined using the connectives ¬, ∧, and ∨ (whose meaning is unfortunately far from clear in this new setting) according to the same (inductive) formation rules as in classical propositional logic. However, the classical truth tables for ∧ and ∨ are unsound with regard to the above rules, at least if one eventually wants to arrive at (2.185) - (2.187). For example, ωψ may validate neither α nor β, yet it might make α ∨ β true (assuming that α and β correspond to *L* and *M*, respectively, this is the case if ψ ∉ *L* and ψ ∉ *M*, yet ψ ∈ *L* + *M*). Similarly, ωψ may render neither α nor β false, yet it may falsify α ∧ β. Due to this complication, the approach of §1.4 has to be modified, as follows. Our goal remains to define a *semantic equivalence relation* ∼<sub>*H*</sub>, which is predicated on an inductive definition of truth we first give.

Definition 2.45. *1. a* ∈ Δ *is* true *in* ωψ *iff P*<sup>ψ</sup><sub>*a*</sub>(Δ) = 1*, and* false *if P*<sup>ψ</sup><sub>*a*</sub>(Δ) = 0*.*


Lemma 2.46. *Definition 2.45 implies the following rules:*


Hence conjunctions behave classically, as part 3 states that (*a* ∈ Δ) ∧ (*b* ∈ Γ) is true iff *a* ∈ Δ and *b* ∈ Γ are true. The proof of this lemma uses the following notation.

Definition 2.47. *If e and f are projections on a Hilbert space H, then:*


Note that if *e* and *f* commute, these reduce to the algebraic expressions

$$e \wedge f = ef;\tag{2.194}$$

$$e \lor f = e + f - ef. \tag{2.195}$$

Furthermore, in case of potential ambiguity we will write *e*<sup>(*a*)</sup><sub>Δ</sub> for the spectral projection *e*<sub>Δ</sub> as defined by *a*, and analogously *e*<sup>(*b*)</sup><sub>Γ</sub>, etc. Similarly for *H*<sup>(*a*)</sup><sub>Δ</sub> etc.

*Proof.* The first and third claims are immediate. The second one follows from the relation *e*<sub>Δ<sup>*c*</sup></sub> = *e*<sub>Δ</sub><sup>⊥</sup> = 1 − *e*<sub>Δ</sub>, or, equivalently, *H*<sub>Δ<sup>*c*</sup></sub> = *H*<sub>Δ</sub><sup>⊥</sup>. For the fourth, use Definition 2.45.6, 3, and 2 to infer that (*a* ∈ Δ) ∨ (*b* ∈ Γ) is true iff (*a* ∈ Δ<sup>*c*</sup>) ∧ (*b* ∈ Γ<sup>*c*</sup>) is false. From the third claim, we note that

$$(a \in \Delta) \land (b \in \Gamma) \sim\_H \left( e^{(a)}\_{\Delta} \land e^{(b)}\_{\Gamma} = 1 \right),\tag{2.196}$$

so by Definition 2.45.5, (*a* ∈ Δ<sup>*c*</sup>) ∧ (*b* ∈ Γ<sup>*c*</sup>) is false iff *e*<sup>(*a*)</sup><sub>Δ<sup>*c*</sup></sub> ∧ *e*<sup>(*b*)</sup><sub>Γ<sup>*c*</sup></sub> = 1 is false. Since *e*<sup>(*a*)</sup><sub>Δ<sup>*c*</sup></sub> ∧ *e*<sup>(*b*)</sup><sub>Γ<sup>*c*</sup></sub> = 1 is true iff ψ ∈ *H*<sup>(*a*)</sup><sub>Δ<sup>*c*</sup></sub> ∩ *H*<sup>(*b*)</sup><sub>Γ<sup>*c*</sup></sub>, claim 2 implies that *e*<sup>(*a*)</sup><sub>Δ<sup>*c*</sup></sub> ∧ *e*<sup>(*b*)</sup><sub>Γ<sup>*c*</sup></sub> = 1 is false iff

$$\Psi \in (H^{(a)}\_{\Delta^c} \cap H^{(b)}\_{\Gamma^c})^\perp = ((H^{(a)}\_{\Delta})^\perp \cap (H^{(b)}\_{\Gamma})^\perp)^\perp = (H^{(a)}\_{\Delta})^{\perp \perp} + (H^{(b)}\_{\Gamma})^{\perp \perp} = H^{(a)}\_{\Delta} + H^{(b)}\_{\Gamma},$$
 which finishes the proof.
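The non-classical behaviour of disjunction can be made concrete in *H* = R<sup>2</sup>. In the sketch below, *L* and *M* are the two coordinate axes and ψ is the normalized diagonal vector; these choices, as well as the helper names, are assumptions made purely for illustration. Since the two projections commute, their join is computed from (2.195).

```python
import math

def apply(e, psi):
    """Apply the matrix e (nested lists) to the vector psi."""
    return [sum(e[i][j] * psi[j] for j in range(len(psi))) for i in range(len(e))]

def probability(e, psi):
    """||e psi||^2, i.e. the quantity in (2.193) for the projection e."""
    return sum(c * c for c in apply(e, psi))

def truth_value(e, psi, tol=1e-12):
    """'true' if probability is 1, 'false' if it is 0, None if no truth value is assigned."""
    p = probability(e, psi)
    if abs(p - 1.0) < tol:
        return "true"
    if abs(p) < tol:
        return "false"
    return None

e_L = [[1.0, 0.0], [0.0, 0.0]]      # projection onto L = span{(1, 0)}
e_M = [[0.0, 0.0], [0.0, 1.0]]      # projection onto M = span{(0, 1)}

# e_L and e_M commute, so (2.195) gives the join as e + f - ef.
e_join = [[e_L[i][j] + e_M[i][j] - sum(e_L[i][k] * e_M[k][j] for k in range(2))
           for j in range(2)] for i in range(2)]

psi = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # unit vector in L + M, but in neither L nor M

print(truth_value(e_L, psi), truth_value(e_M, psi), truth_value(e_join, psi))
```

Here neither disjunct receives a truth value (both probabilities equal 1/2), whereas the disjunction is true because ψ ∈ *L* + *M* = *H*.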

Quite analogously to the classical case, Definition 2.45 implies

$$(a \in \Delta) \models\_H (b \in \Gamma) \text{ iff } e^{(a)}\_{\Delta} \subseteq e^{(b)}\_{\Gamma},\tag{2.197}$$

which, once again, immediately yields (*a* ∈ Δ) ∼<sub>*H*</sub> (*b* ∈ Γ) iff *e*<sup>(*a*)</sup><sub>Δ</sub> = *e*<sup>(*b*)</sup><sub>Γ</sub>. Taking *b* = *e*<sup>(*a*)</sup><sub>Δ</sub> and Γ = {1}, analogously to (1.53), as in the above proof we have

$$a \in \Delta \sim\_H e^{(a)}\_{\Delta} = 1. \tag{2.198}$$

Furthermore, as in the proof of Lemma 2.46 we find

$$(a \in \Delta) \land (b \in \Gamma) \sim\_H \left( e^{(a)}\_{\Delta} \land e^{(b)}\_{\Gamma} = 1 \right);\tag{2.199}$$

$$(a \in \Delta) \vee (b \in \Gamma) \sim\_H \left( e^{(a)}\_{\Delta} \vee e^{(b)}\_{\Gamma} = 1 \right). \tag{2.200}$$

Consequently, we have the following counterpart of Lemma 1.19:

Lemma 2.48. *Any elementary or composite proposition is semantically equivalent (relative to H) to one of the form e* = 1*, for some projection e. Furthermore,*

$$\neg(e=1) \sim\_H \left(e^\perp = 1\right);\tag{2.201}$$

$$(e=1) \land (f=1) \sim\_H (e \land f=1);\tag{2.202}$$

$$(e=1) \vee (f=1) \sim\_H (e \vee f=1). \tag{2.203}$$

At last, the quantum version of Theorem 1.20 reads as follows:

Theorem 2.49. *The set* Q(*H*) *of equivalence classes* [·]*<sup>H</sup> of propositions generated by the elementary propositions a* ∈ Δ *and the logical connectives* ¬*,* ∨*, and* ∧*, is isomorphic to the set* L (*H*) *of linear subspaces of H, under the map*

$$\varphi \colon \mathcal{Q}(H) \xrightarrow{\cong} \mathcal{L}(H); \tag{2.204}$$

$$\varphi([a \in \Delta]\_{H}) = e^{(a)}\_{\Delta} H. \tag{2.205}$$

*Under this isomorphism, the logical connectives* ¬*,* ∧ *and* ∨ *turn into orthogonal complementation* (−)⊥*, intersection* ∩*, and linear span* +*, respectively, in that*

$$\varphi([\neg \alpha]\_H) = \varphi([\alpha]\_H)^{\perp};\tag{2.206}$$

$$\varphi([\alpha \wedge \beta]\_H) = \varphi([\alpha]\_H) \cap \varphi([\beta]\_H);\tag{2.207}$$

$$\varphi([\alpha \vee \beta]\_H) = \varphi([\alpha]\_H) + \varphi([\beta]\_H).\tag{2.208}$$

*Furthermore, if we define a partial order* ≤ *on* Q(*H*) *by saying that* [α]<sub>*H*</sub> ≤ [β]<sub>*H*</sub> *iff* α ⊨<sub>*H*</sub> β *(which is well defined), then* ϕ *maps* ≤ *into set-theoretic inclusion* ⊆*, i.e.,*

$$[\alpha]\_{H} \le [\beta]\_{H} \text{ iff } \varphi([\alpha]\_{H}) \subseteq \varphi([\beta]\_{H}).\tag{2.209}$$

*With respect to these operations,* L (*H*) *is a* modular lattice *(granted that* dim(*H*) < ∞*; otherwise, the lattice is merely* orthomodular*, cf.* §*D.1 for terminology).*

*Proof.* Most of this is immediate from Lemma 2.48, except for the last claim, which follows from simple computations (and from the Amemiya–Araki Theorem).

As in the classical case, there is an algebraic reformulation of this result, obtained from the bijective correspondence between (closed) linear subspaces *L* of *H* and projections *e* on *H*, given by *L* = *eH* (see Proposition A.8).

Theorem 2.50. *The set* Q(*H*) *of equivalence classes* [·]*<sup>H</sup> of propositions generated by the elementary propositions a* ∈ Δ *and the logical connectives* ¬*,* ∨*, and* ∧*, is isomorphic to the set* P(*H*) *of projections on H, under the map*

$$\varphi' \colon \mathcal{Q}(H) \xrightarrow{\cong} \mathcal{P}(H);\tag{2.210}$$

$$\varphi'([a \in \Delta]\_H) = e\_{\Delta}^{(a)},\tag{2.211}$$

*where (once again)* P(*H*) *is the set of all projections on H.*

*Under this map, the logical connectives* ¬*,* ∧ *and* ∨ *turn into (cf. Definition 2.47):*

$$\varphi'([\neg \alpha]\_H) = 1 - \varphi'([\alpha]\_H);\tag{2.212}$$

$$\varphi'([\alpha\wedge\beta]\_{H}) = \varphi'([\alpha]\_{H})\wedge\varphi'([\beta]\_{H});\tag{2.213}$$

$$\varphi'([\alpha\vee\beta]\_{H}) = \varphi'([\alpha]\_{H})\vee\varphi'([\beta]\_{H}).\tag{2.214}$$

*Furthermore,* ϕ′ *maps the partial order* ≤ *on* Q(*H*) *into the partial order on* P(*H*) *defined by e* ≤ *f iff eH* ⊆ *f H, or equivalently, iff e f* = *e.*

*Finally, with respect to these operations,* P(*H*) *is an (ortho)modular lattice.*

However, unlike (1.65) - (1.68), this result is somewhat unsatisfactory in not being purely algebraic. This may partly be remedied through expressions like

$$e \wedge f = \lim\_{n \to \infty} (e \circ f)^n;\tag{2.215}$$

$$e \lor f = 1 - ((1 - e) \land (1 - f)),\tag{2.216}$$

where *e* ◦ *f* = ½(*e f* + *f e*), and the (strong) limit in (2.215) should be taken on fixed vectors ψ ∈ *H* (upon which it exists in the norm-topology of *H*). Even so, this specific limit still relies on the underlying Hilbert space, and in any case the expressions fail to be purely algebraic and look pretty artificial. Indeed, the same may be said about Definition 2.45, which, of course, has been fine-tuned with hindsight in order to obtain the "desired" answer in the form of Theorem 2.49, which in turn vindicates the mathematically sweet Birkhoff–von Neumann *Ansatz* (2.185) - (2.187).
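The limit formula (2.215) is easy to probe numerically. The sketch below assumes the Jordan product is normalized as *e* ◦ *f* = ½(*e f* + *f e*) (without the factor ½ the powers of *e* ◦ *e* = 2*e* would not converge), and uses two non-commuting projections on R<sup>3</sup> onto planes that intersect in the first coordinate axis. The matrices, the number of iterations, and the function names are illustrative assumptions.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def jordan(e, f):
    """Jordan product (ef + fe)/2."""
    ef, fe = matmul(e, f), matmul(f, e)
    n = len(e)
    return [[0.5 * (ef[i][j] + fe[i][j]) for j in range(n)] for i in range(n)]

def power(a, n):
    """n-th matrix power by repeated multiplication."""
    size = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, a)
    return result

def complement(e):
    """1 - e."""
    n = len(e)
    return [[(1.0 if i == j else 0.0) - e[i][j] for j in range(n)] for i in range(n)]

c = s = 1 / math.sqrt(2)
e = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]          # plane spanned by axes 1, 2
f = [[1.0, 0.0, 0.0], [0.0, c * c, c * s], [0.0, c * s, s * s]]   # plane spanned by axis 1 and (0, c, s)

meet = power(jordan(e, f), 60)                                    # approximates (2.215)
join = complement(power(jordan(complement(e), complement(f)), 60))  # (2.216)

print([[round(x, 9) for x in row] for row in meet])   # close to the projection onto axis 1
print([[round(x, 9) for x in row] for row in join])   # close to the identity on R^3
```

In this example the meet is the projection onto the common line and the join is the unit operator, in agreement with (2.186) and (2.187) for the two planes.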

In addition, there are serious *conceptual* objections to this kind of quantum logic:


In Chapter 12, we will therefore replace the doomed quantum logic of Birkhoff and von Neumann by the intuitionistic logic of Brouwer and Heyting.

#### Notes

All operator theory for this chapter may be found in Kadison & Ringrose (1983).

#### §2.1. Quantum probability theory and the Born rule

The Born rule was first stated by Born (1926b) in the context of scattering theory, following the earlier paper (Born, 1926a) in which Born omitted the absolute value squared signs (corrected in a footnote added in proof). The application to the position operator is due to Pauli (1927), who merely spent a footnote on it. The general formulation is due to von Neumann (1932, §III), following earlier contributions by Dirac (1926b) and Jordan (1927). Both Born and Heisenberg acknowledge the profound influence of Einstein on the probabilistic formulation of quantum mechanics. However, Born and Heisenberg as well as Bohr, Dirac, Jordan, Pauli and von Neumann differed with Einstein about the fundamental nature of the Born probabilities and hence on the issue of determinism. Indeed, whereas Born and the others just listed after him believed the outcome of any individual quantum measurement to be unpredictable in principle, Einstein felt this unpredictability was just caused by the incompleteness of quantum mechanics (as he saw it). See, for example, the invaluable correspondence between Einstein and Born (2005).

Mehra & Rechenberg (2000) provide a very detailed reconstruction of the historical origin of the Born rule within the context of quantum mechanics, whereas von Plato (1994) embeds a briefer historical treatment of it into the more general setting of the emergence of modern probability theory and probabilistic thinking. For the earlier history of probability see Hacking (1975, 1990). See also Landsman (2009).

#### §2.2. Quantum observables and states

Proposition 2.10 is due to von Neumann; see also Chapter 6.

#### §2.3. Pure states in quantum mechanics

This kind of thinking goes back to von Neumann (1932) and Segal (1947ab).

#### §2.4. The GNS-construction for matrices

Again, see §C.12 for the GNS-construction in general.

#### §2.5. The Born rule from Bohrification

See notes to §4.1.

#### §2.6. The Kadison–Singer Problem

The Kadison–Singer Problem was first discussed in Kadison & Singer (1959). See the Notes to §4.3 for more information.

#### §2.7. Gleason's Theorem

#### §2.8. Proof of Gleason's Theorem

Gleason's Theorem is due to Gleason (1957), whose proof we largely follow, with some simplifications due to Varadarajan (1985) and Hamhalter (2004). Lemma 2.40.3 or some analogous result is lacking from these references; it may be found in Lyubich (1988), Chapter 4, §2, Theorem. It is often claimed that Gleason's proof has been superseded by the more elementary one due to Cooke, Keane, & Moran (1985), which avoids all use of harmonic analysis. A similar proof, following up on Cooke et al but using constructive analysis only, was given by Richman & Bridges (1999). However, both because Gleason's use of rotation invariance is very natural, and also since the proof of Cooke et al has already been presented and simplified in two monographs entirely devoted to Gleason's Theorem, viz. Dvurečenskij (1993) and Hamhalter (2004), as well as in the highly efficient book by Kalmbach (1998), we prefer to return to the original source (and add some technical details).

#### §2.9. Effects and Busch's Theorem

Busch's Theorem is from Busch (2003), whose proof we follow almost *verbatim*. See also Caves et al (2004). For the use of POVM's in quantum physics see, e.g., Busch, Grabowski, & Lahti (1998), Davies (1976), Holevo (1982), Kraus (1983), Landsman (1998a, 1999), de Muynck (2002), and Schroeck (1996).

#### §2.10. The quantum logic of Birkhoff and von Neumann

Our discussion is based on Rédei (1998), with some modifications. The original source is Birkhoff & von Neumann (1936).

## Chapter 3 Classical physics on a general phase space

Passing from finite phase spaces *X* to infinite ones yields many fascinating new phenomena, some of which even seem genuinely "emergent" in not having any finite-dimensional shadow, approximate or otherwise. Nonetheless, practically all results in the previous chapter remain valid, typically after the inclusion of some technical condition(s) that restrict the almost unlimited freedom allowed by infinite sets.

One of these restrictions is that in classical physics we assume that our phase space *X* is *locally compact Hausdorff*, where we recall that a space is:


This combination of topological properties turns out to be very convenient; it incorporates spaces like R<sup>*k*</sup> (and more generally all non-pathological manifolds), or lattices like Z<sup>*n*</sup> (the price is that we exclude systems with an infinite number of degrees of freedom, such as classical field theories). A locally compact Hausdorff space *X* is *regular* in that each *x* ∈ *X* and each closed set *F* ⊂ *X* not containing *x* can be separated by open sets (i.e., there are disjoint open sets *U<sub>x</sub>* ∋ *x* and *U<sub>F</sub>* ⊃ *F*).

From the perspective of C\*-algebras, the main advantage of using this particular class of spaces is that they are naturally singled out by *Gelfand's Theorem*:

Theorem 3.1. *Every commutative C\*-algebra A is isomorphic to C*0(*X*) *for some locally compact Hausdorff space X, which is unique up to homeomorphism.*

A proof may be found in Appendix C; here we just explain the notation and the main idea behind the proof (cf. Definition C.1, which we do not repeat).

First, *C*0(*X*) is the set of all continuous functions *f* : *X* → C that *vanish at infinity*, i.e., for any ε > 0 the set {*x* ∈ *X* | | *f*(*x*)| ≥ ε} is compact, or, equivalently, for any ε > 0 there is a compact set *K* ⊂ *X* such that | *f*(*x*)| < ε for all *x* ∉ *K*. For example, if *X* = R, then *f*(*x*) = exp(−*x*<sup>2</sup>) lies in *C*0(R). If *X* is compact, then *C*0(*X*) = *C*(*X*).

Second, *C*0(*X*) is a vector space under pointwise operations (including pointwise complex conjugation as the involution), and is a Banach space in the *sup-norm*

$$\|f\|\_{\infty} = \sup\_{x \in X} \{|f(x)|\}. \tag{3.1}$$

The space *X* making *A* isomorphic to *C*0(*X*), then, is the *Gelfand spectrum* Σ(*A*) of *A*, which we already encountered (cf. Definition 1.4) as the set of nonzero algebra homomorphisms from *A* to C. This set turns out to be a locally compact Hausdorff space in the topology of pointwise convergence, and the isomorphism *A* → *C*0(*X*) is the Gelfand transform *a* ↦ *â*, where *â*(ω) = ω(*a*). Conversely, if *X* is given, then we associate the commutative C\*-algebra *C*0(*X*) to it, as in Chapter 1.

Generalizing Definition 1.14, as a special case of the notion of a state we have:

Definition 3.2. *A* state *on C*0(*X*) *is a positive (and hence bounded) linear functional* ω : *C*0(*X*) → C *with* ‖ω‖ = 1*.*

If *X* is compact, given positivity one has ‖ω‖ = 1 iff ω(1<sub>*X*</sub>) = 1, cf. Lemma C.4. The appropriate generalization of Theorem 1.15 then reads (cf. Corollary B.21):

Theorem 3.3. *Let X be a locally compact Hausdorff space. There is a bijective correspondence between states on C*0(*X*) *and probability measures on X, namely*

$$\omega(f) = \int\_X d\mu \, f, \,\, f \in C\_0(X). \tag{3.2}$$

*Moreover, pure states correspond to Dirac measures and hence to points of X.*

In particular, a nonzero linear functional ω : *C*0(*X*) → C is multiplicative iff it is a pure state. This recovery of probability measures on phase space as states of the associated algebra of observables *C*0(*X*), and of points in phase space as the associated pure states, already familiar from the finite case, remains of great importance.
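For a finite (hence compact) space *X*, where integrals reduce to finite sums, the link between pure states and multiplicativity can be checked directly. The following sketch takes *X* = {0, 1, 2} and two probability measures, a Dirac measure and an equal mixture of two points; the test functions *f*, *g* and the helper name `state` are hypothetical choices for illustration only.

```python
def state(mu):
    """State on C(X) for finite X, cf. (3.2): omega(f) = sum over x of mu(x) f(x)."""
    def omega(f):
        return sum(weight * f(x) for x, weight in mu.items())
    return omega

f = lambda x: x + 1.0            # sample observables on X = {0, 1, 2}
g = lambda x: float(x * x)
fg = lambda x: f(x) * g(x)       # pointwise product

dirac = state({1: 1.0})          # Dirac measure at the point x = 1 (a pure state)
mixed = state({0: 0.5, 2: 0.5})  # equal mixture of the points 0 and 2

print(dirac(fg), dirac(f) * dirac(g))   # equal: the Dirac state is multiplicative
print(mixed(fg), mixed(f) * mixed(g))   # differ: the mixed state is not multiplicative
```

The Dirac state simply evaluates functions at its point, which is why it respects products; the mixture fails to do so as soon as the two functions are correlated on its support.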

As in quantum mechanics, many interesting observables in classical mechanics fail to be bounded, let alone *C*0; coordinate functions (on non-compact phase spaces) and the usual kinetic energy are a case in point. This is not a serious problem, especially not if, as we shall assume from now on, *X* is a (smooth) manifold (those unfamiliar with this notion may always have *X* = R<sup>*k*</sup> in mind). In that case, there is a very natural class of (typically unbounded) functions on *X*, viz. *C*<sup>∞</sup>(*X*) ≡ *C*<sup>∞</sup>(*X*,R), which form a commutative algebra just like *C*0(*X*) ≡ *C*0(*X*,C), and provide the (algebraic) basis for the theory of symmetry and dynamics in classical physics, as we shall now show (the fact that functions in *C*<sup>∞</sup>(*X*) may be freely added and multiplied provides a major simplification compared to unbounded operators in quantum mechanics, even self-adjoint ones, which are most easily treated by transforming them into bounded ones, as discussed in §B.21). In fact, the most natural mathematical setting of classical physics is not operator theory, or even symplectic geometry (as even mathematically minded people used to think until the 1980s), but rather the more general and flexible framework of *Poisson geometry*, to which we now turn.

#### 3.1 Vector fields and their flows

We do not assume familiarity with differential geometry and analysis on manifolds, so in what follows one may assume that *X* = R<sup>*k*</sup> for some *k*. However, whenever possible we will phrase definitions and results in such a way that their more general meaning should be clear to those who *are* familiar with differential geometry etc.

An *old-fashioned vector field* on *X* = R*<sup>k</sup>* is a map

$$
\xi: \mathbb{R}^k \to \mathbb{R}^k; \tag{3.3}
$$

$$\xi(x) = (\xi^1(x), \dots, \xi^k(x)), \tag{3.4}$$

which describes something like a hyper-arrow at *x*. However, this is a coordinate-dependent object, which is hard to generalize to arbitrary manifolds. Therefore, in a modern approach a vector field is seen as the corresponding first-order differential operator ξ : *C*<sup>∞</sup>(*X*) → *C*<sup>∞</sup>(*X*) defined by

$$\xi f(\mathbf{x}) = \sum\_{j=1}^{k} \xi^{j}(\mathbf{x}) \frac{\partial f(\mathbf{x})}{\partial \mathbf{x}^{j}}.\tag{3.5}$$

To make the idea precise that a vector field on *X* is essentially the same as a first-order differential operator on *C*<sup>∞</sup>(*X*), we note that it easily follows from (3.5) that

$$
\xi(fg) = \xi(f)g + f\xi(g),
\tag{3.6}
$$

for any *f*, *g* ∈ *C*<sup>∞</sup>(*X*), where the product *f g* is defined pointwise, i.e.,

$$(fg)(\mathbf{x}) = f(\mathbf{x})g(\mathbf{x}).\tag{3.7}$$

Similarly, we have pointwise addition and scalar multiplication, i.e., for *s*,*t* ∈ R,

$$(sf+tg)(x) = sf(x) + tg(x). \tag{3.8}$$

This turns *C*<sup>∞</sup>(*X*) into a commutative algebra (over R, as *C*<sup>∞</sup>(*X*) ≡ *C*<sup>∞</sup>(*X*,R)).
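The Leibniz rule (3.6) can be checked numerically for a concrete vector field on R<sup>2</sup>. In the sketch below the partial derivatives in (3.5) are approximated by central finite differences, so equalities hold only up to a small numerical error; the rotation field ξ, the functions *f* and *g*, the base point, and the helper names are all assumptions chosen for this example.

```python
import math

def partial(f, x, j, step=1e-5):
    """Central-difference approximation of the j-th partial derivative of f at x."""
    xp, xm = list(x), list(x)
    xp[j] += step
    xm[j] -= step
    return (f(xp) - f(xm)) / (2 * step)

def apply_field(xi, f):
    """Return the function xi f of (3.5): the sum over j of xi^j(x) * d_j f(x)."""
    def xi_f(x):
        components = xi(x)
        return sum(components[j] * partial(f, x, j) for j in range(len(x)))
    return xi_f

xi = lambda x: (-x[1], x[0])                  # components (xi^1, xi^2) of a rotation field
f = lambda x: x[0] ** 2 + 3 * x[1]
g = lambda x: math.sin(x[0]) * x[1]
fg = lambda x: f(x) * g(x)                    # pointwise product (3.7)

x0 = [0.7, -1.2]
lhs = apply_field(xi, fg)(x0)                                         # xi(fg)
rhs = apply_field(xi, f)(x0) * g(x0) + f(x0) * apply_field(xi, g)(x0)  # xi(f) g + f xi(g)

print(lhs, rhs)   # the two numbers agree up to discretization error
```

For this particular *f* one has ξ*f* = −2*x*<sup>1</sup>*x*<sup>2</sup> + 3*x*<sup>1</sup> exactly, which at the chosen point gives 3.78 and provides an independent check of `apply_field`.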

A *derivation* of an algebra *A* (over R) is a linear map δ : *A* → *A* satisfying

$$
\delta(ab) = \delta(a)b + a\delta(b). \tag{3.9}
$$

Thus any vector field on *X* defines a derivation of the algebra *C*∞(*X*) by (3.5). Conversely, a deep theorem of differential geometry states that for any manifold *X*, each derivation of *C*∞(*X*) takes the form (3.5), at least locally (and for *X* = R*<sup>k</sup>* also globally). Therefore, either as a definition or as a theorem, we often simply identify vector fields on *X* with derivations of *C*∞(*X*). Derivations have a rich structure:

Definition 3.4. *A (real)* Lie algebra *is a (real) vector space equipped with a bilinear map* [·,·] : *A*×*A* → *A that satisfies* [*a*,*b*] = −[*b*,*a*] *(and hence* [*a*,*a*] = 0*) as well as*

$$[a, [b, c]] + [c, [a, b]] + [b, [c, a]] = 0 \text{ (Jacobi identity)}.\tag{3.10}$$

It is easy to see that the set Vec(*X*) of all old-fashioned vector fields ξ on *X* (i.e. in the sense (3.5)) forms a real Lie algebra under pointwise vector space operations (i.e., (*s*ξ +*t*η)(*f*) = *s*ξ *f* +*t*η *f*) and the natural bracket

$$[\xi, \eta] = \xi\eta - \eta\xi. \tag{3.11}$$

Similarly, the set Der(*A*) of all derivations on some algebra is a Lie algebra under pointwise vector space operations and Lie bracket

$$[\delta\_1, \delta\_2] = \delta\_1 \circ \delta\_2 - \delta\_2 \circ \delta\_1. \tag{3.12}$$

Of course, the identification of Vec(*X*) with Der(*C*<sup>∞</sup>(*X*)) identifies (3.11) and (3.12).
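To see the bracket (3.11) at work, one may restrict attention to polynomial functions on *X* = R, on which vector fields with polynomial coefficients act exactly. The sketch below represents a polynomial by its list of coefficients and computes [ξ, η] for ξ = *d*/*dx* and η = *x d*/*dx*; the representation and all function names are assumptions made for illustration.

```python
def trim(p):
    """Drop trailing zero coefficients (keeping at least one entry)."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

def derivative(p):
    """Derivative of the polynomial c0 + c1 x + c2 x^2 + ... given as [c0, c1, c2, ...]."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def multiply(p, q):
    """Product of two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def subtract(p, q):
    """Difference p - q of two polynomials."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return trim([a - b for a, b in zip(p, q)])

def vector_field(coeff):
    """The derivation f -> coeff * f' on polynomials, cf. (3.5) with k = 1."""
    return lambda f: trim(multiply(coeff, derivative(f)))

def bracket(xi, eta):
    """The commutator (3.11) of two derivations."""
    return lambda f: subtract(xi(eta(f)), eta(xi(f)))

xi = vector_field([1])        # d/dx
eta = vector_field([0, 1])    # x d/dx
f = [0, 0, 0, 1]              # the polynomial x^3

print(bracket(xi, eta)(f))    # [0, 0, 3], i.e. 3x^2
print(xi(f))                  # [0, 0, 3] as well
```

On this example [ξ, η] acts as *d*/*dx* itself: the second-order terms in ξη and ηξ cancel, so the commutator is again a first-order operator, i.e., a vector field.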

Vector fields (or, equivalently, derivations) may be "integrated", at least *locally*, in the following sense. First, a *curve* through *x*<sub>0</sub> ∈ *X* is a smooth map *c* : *I* → *X*, where *I* ⊂ R is open and *c*(*t*<sub>0</sub>) = *x*<sub>0</sub> for some *t*<sub>0</sub> ∈ *I*. We usually assume that 0 ∈ *I* with *t*<sub>0</sub> = 0 and hence *c*(0) = *x*<sub>0</sub>. We then say that *c integrates* ξ near *x*<sub>0</sub> if

$$
\dot{c}(t) = \xi(c(t)), \tag{3.13}
$$

a somewhat symbolic equality that can be interpreted in two equivalent ways:

• Describing *c* : *I* → R<sup>*k*</sup> by *k* functions *c*<sup>*j*</sup> : *I* → R (*j* = 1,..., *k*), eq. (3.13) denotes

$$\frac{dc^j(t)}{dt} = \xi^j(c^1(t), \dots, c^k(t)), \ j = 1, \dots, k. \tag{3.14}$$

• More abstractly, eq. (3.13) means that for any *f* ∈ *C*<sup>∞</sup>(*X*) we have

$$
\xi f(c(t)) = \frac{d}{dt} f(c(t)).\tag{3.15}
$$

To pass from (3.15) to (3.14), we just have to recall (3.5), and note that

$$\frac{d}{dt}f(c(t)) = \frac{d}{dt}f(c^1(t), \dots, c^k(t)) = \sum\_{j=1}^k \frac{dc^j(t)}{dt} \frac{\partial f(c(t))}{\partial x^j}.\tag{3.16}$$

The theory of ordinary differential equations shows that such local integral curves exist near any point *x*<sub>0</sub> ∈ *X*, and that they are unique in the following sense: if two curves *c*<sub>1</sub> : *I*<sub>1</sub> → *X* and *c*<sub>2</sub> : *I*<sub>2</sub> → *X* both satisfy (3.13) with *c*<sub>1</sub>(0) = *c*<sub>2</sub>(0) = *x*<sub>0</sub>, then *c*<sub>1</sub> = *c*<sub>2</sub> on *I*<sub>1</sub> ∩ *I*<sub>2</sub>. However, curves that integrate ξ near some point may not be defined for all *t*, i.e., for *I* = R. This makes the concept of a *flow* of a vector field ξ, which is meant to encapsulate all integral curves of ξ, a bit complicated. We start with the simplest case. We say that a vector field ξ is *complete* if for any *x*<sub>0</sub> ∈ *X* there is a curve *c* : R → *X* satisfying (3.13) with *c*(0) = *x*<sub>0</sub>. The simplest example of a complete vector field is *X* = R and ξ = *d*/*dx*, so that ϕ<sub>*t*</sub>(*x*) = *x* + *t*. For an incomplete example, take *X* = R and ξ(*x*) = *x*<sup>2</sup>*d*/*dx*. It can be shown that a vector field ξ with compact support (in the sense that the set {*x* ∈ *X* | ξ(*x*) ≠ 0} is bounded) is complete. In particular, any vector field on a compact manifold is complete.
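The incompleteness of ξ = *x*<sup>2</sup>*d*/*dx* can be made explicit: solving (3.14) by separation of variables gives *c*(*t*) = *x*<sub>0</sub>/(1 − *x*<sub>0</sub>*t*), which for *x*<sub>0</sub> > 0 exists only for *t* < 1/*x*<sub>0</sub>. The sketch below compares this closed form with a numerical solution obtained from a classical fourth-order Runge–Kutta scheme; the integrator, the step count, and the function names are assumptions made for illustration and are not part of the text.

```python
def rk4(xi, x0, t_end, steps):
    """Integrate dc/dt = xi(c) from c(0) = x0 up to time t_end with classical RK4."""
    h = t_end / steps
    x = x0
    for _ in range(steps):
        k1 = xi(x)
        k2 = xi(x + 0.5 * h * k1)
        k3 = xi(x + 0.5 * h * k2)
        k4 = xi(x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return x

def exact_curve(x0, t):
    """Integral curve of x^2 d/dx through x0; valid only while x0 * t < 1."""
    return x0 / (1 - x0 * t)

incomplete = lambda x: x * x     # coefficient of the field x^2 d/dx
complete = lambda x: 1.0         # coefficient of the field d/dx

numeric = rk4(incomplete, 1.0, 0.5, 1000)
print(numeric, exact_curve(1.0, 0.5))        # both close to 2
print(exact_curve(1.0, 0.999))               # about 1000: blow-up as t approaches 1
print(rk4(complete, 3.0, 2.0, 10))           # the flow x + t of d/dx, here 5
```

Starting from *x*<sub>0</sub> = 1, the curve escapes to infinity as *t* ↑ 1, so no integral curve through this point is defined on all of R, whereas for *d*/*dx* every curve *t* ↦ *x*<sub>0</sub> + *t* is.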

Definition 3.5. *Let X be a manifold and let* ξ ∈ Vec(*X*) *be a complete vector field. A* flow *of* ξ *is a smooth map* ϕ : R×*X* → *X, written*

$$\varphi\_t(x) \equiv \varphi(t, x), \tag{3.17}$$

*that satisfies*

$$
\varphi\_0(x) = x;\tag{3.18}
$$

$$
\varphi\_s \circ \varphi\_t = \varphi\_{s+t}, \tag{3.19}
$$

*and that integrates* ξ *in the sense that for each t* ∈ R *and x* ∈ *X,*

$$
\xi\left(\varphi\_t(x)\right) = \frac{d}{dt}\varphi\_t(x).\tag{3.20}
$$

As before, eq. (3.20) by definition means that for each *f* ∈ *C*<sup>∞</sup>(*X*) we have

$$
\xi f(\varphi\_t(x)) = \frac{d}{dt} f(\varphi\_t(x)),
\tag{3.21}
$$

or, equivalently, that in local coordinates, where

$$\varphi\_t(x) = (\varphi\_t^1(x), \dots, \varphi\_t^k(x)),\tag{3.22}$$

we have

$$\frac{d\varphi\_t^j(x)}{dt} = \xi^j(\varphi\_t(x)), \ j = 1, \ldots, k. \tag{3.23}$$

Indeed, the flow ϕ of ξ gives the integral curve *c* of ξ through *x*<sub>0</sub> by

$$c(t) = \varphi\_t(x\_0). \tag{3.24}$$

According to the Picard–Lindelöf Theorem in the theory of ordinary differential equations, any complete vector field has a unique flow. In fact, the uniqueness part of this theorem implies that (3.19) is a consequence of (3.20) with (3.18), but it is convenient to state (3.19) separately, so as to make the point that the flow of a complete vector field ξ on *X* is a smooth R-action on *X*, as defined by conditions (3.18) - (3.19), whose orbits integrate ξ. In particular, each ϕ<sub>*t*</sub> : *X* → *X* is invertible, with inverse ϕ<sub>*t*</sub><sup>−1</sup> = ϕ<sub>−*t*</sub>. Moreover, *X* is a disjoint union of the integral curves of ξ, which can never cross each other because of the uniqueness of the solution of the initial-value problem (3.13) with *c*(0) = *x*<sub>0</sub>.
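A standard example in which the flow is known in closed form is the rotation field ξ = −*x*<sup>2</sup>∂/∂*x*<sup>1</sup> + *x*<sup>1</sup>∂/∂*x*<sup>2</sup> on R<sup>2</sup>, whose flow ϕ<sub>*t*</sub> is rotation by the angle *t*. The sketch below checks (3.18), (3.19), and the coordinate form (3.23) numerically for this flow; the example, the finite-difference step, and the function names are assumptions introduced here for illustration.

```python
import math

def xi(x):
    """Components (xi^1, xi^2) of the rotation field at the point x."""
    return (-x[1], x[0])

def flow(t, x):
    """Candidate flow phi_t(x): rotation of the plane by the angle t."""
    c, s = math.cos(t), math.sin(t)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])

def velocity(t, x, step=1e-5):
    """Central-difference approximation of d/dt phi_t(x)."""
    plus, minus = flow(t + step, x), flow(t - step, x)
    return tuple((a - b) / (2 * step) for a, b in zip(plus, minus))

def distance(u, v):
    """Largest componentwise difference between two points."""
    return max(abs(a - b) for a, b in zip(u, v))

x = (1.0, 2.0)
print(distance(flow(0.0, x), x))                             # (3.18): zero
print(distance(flow(0.4, flow(1.1, x)), flow(1.5, x)))       # (3.19): zero up to rounding
print(distance(velocity(0.8, x), xi(flow(0.8, x))))          # (3.23): small discretization error
print(distance(flow(-0.8, flow(0.8, x)), x))                 # inverse of phi_t is phi_{-t}
```

The orbits of this R-action are the circles centred at the origin together with the fixed point at the origin, which indeed partition the plane.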

If ξ is not complete, we do the best we can by defining the set

$$D\_{\xi} = \{(t, \mathbf{x}) \in \mathbb{R} \times X \mid \exists c : I \to X, c(\mathbf{0}) = \mathbf{x}, t \in I\} \subset \mathbb{R} \times X,\tag{3.25}$$

where it is understood that *c* satisfies (3.13). Obviously {0} × *X* ⊂ *D*<sub>ξ</sub>, and (less trivially) it turns out that *D*<sub>ξ</sub> is open. Then a flow of ξ is a map ϕ : *D*<sub>ξ</sub> → *X* that satisfies (3.18) for all *x*, eq. (3.21) for (*t*, *x*) ∈ *D*<sub>ξ</sub>, as well as (3.19) whenever defined.

#### 3.2 Poisson brackets and Hamiltonian vector fields

To obtain flows, classical mechanics requires more than a manifold structure:

Definition 3.6. *A* Poisson bracket *on a manifold X is a Lie bracket* {−,−} *on (the real vector space) C*<sup>∞</sup>(*X*)*, such that for each h* ∈ *C*<sup>∞</sup>(*X*) *the map*

$$
\delta\_h: f \mapsto \{h, f\} \tag{3.26}
$$

*is a vector field on X (or, equivalently, a derivation of C*<sup>∞</sup>(*X*,R) *with respect to its structure of a commutative algebra under pointwise multiplication). A manifold X equipped with a Poisson bracket is called a* Poisson manifold*,* (*C*<sup>∞</sup>(*X*),{, }) *is called a* Poisson algebra*, and the vector field* ξ<sub>*h*</sub> *corresponding to* δ<sub>*h*</sub> *is called the* Hamiltonian vector field *of h.*

Unfolding, we have a bilinear map {−,−} : *C*<sup>∞</sup>(*X*) × *C*<sup>∞</sup>(*X*) → *C*<sup>∞</sup>(*X*) that satisfies

$$\{g,f\} = -\{f,g\};\tag{3.27}$$

$$\{f, \{g, h\}\} + \{h, \{f, g\}\} + \{g, \{h, f\}\} = 0;\tag{3.28}$$

$$\{f, gh\} = \{f, g\}h + g\{f, h\}.\tag{3.29}$$

Bilinearity and the abstract properties (3.27) - (3.29) imply:

Proposition 3.7. *Each Poisson bracket on X defines a Lie algebra homomorphism*

$$C^{\infty}(X) \to \text{Der}(C^{\infty}(X));\tag{3.30}$$

$$h \mapsto \delta\_h,\tag{3.31}$$

*or, equivalently, a Lie algebra homomorphism*

$$C^{\infty}(X) \to \text{Vec}(X);\tag{3.32}$$

$$h \mapsto \xi\_h. \tag{3.33}$$

The time-honored example is *X* = R<sup>2*n*</sup>, with coordinates *x* = (*p*,*q*) and bracket

$$\{f, g\} = \sum\_{j=1}^{n} \left( \frac{\partial f}{\partial p\_j} \frac{\partial g}{\partial q^j} - \frac{\partial f}{\partial q^j} \frac{\partial g}{\partial p\_j} \right). \tag{3.34}$$

In that case, the Hamiltonian vector field of *h* is obviously given by

$$\xi\_h = \sum\_{j=1}^n \left( \frac{\partial h}{\partial p\_j} \frac{\partial}{\partial q^j} - \frac{\partial h}{\partial q^j} \frac{\partial}{\partial p\_j} \right). \tag{3.35}$$

The flow of ξ<sub>*h*</sub> gives the motion of a system with Hamiltonian *h*. Writing

$$
\varphi\_t(p, q) = (p(t), q(t)),
$$

we see from (3.23) that this flow is given by *Hamilton's equations*

$$\frac{dp\_j(t)}{dt} = -\frac{\partial h(p(t), q(t))}{\partial q^j};\tag{3.36}$$

$$\frac{dq^j(t)}{dt} = \frac{\partial h(p(t), q(t))}{\partial p\_j}.\tag{3.37}$$

Hamiltonians of the special form

$$h(p,q) = \frac{p^2}{2m} + V(q),\tag{3.38}$$

where *p*<sup>2</sup> = ∑<sub>*j*</sub> *p*<sub>*j*</sub><sup>2</sup>, give *Newton's equation* "*F* = *ma*", where *F*<sub>*j*</sub> = −∂*V*/∂*q*<sup>*j*</sup>, viz.

$$F\_j(q(t)) = m \frac{d^2 q^j(t)}{dt^2}.\tag{3.39}$$
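Hamilton's equations (3.36) - (3.37) for a Hamiltonian of the form (3.38) can be integrated numerically. The following minimal sketch assumes one degree of freedom, unit mass, and the harmonic potential *V*(*q*) = *q*<sup>2</sup>/2, all of which are illustrative choices rather than part of the text. It uses the leapfrog scheme, a standard integrator swapped in here for the exact flow; the names `leapfrog` and `energy` are hypothetical.

```python
import math

def leapfrog(p, q, dV, m, dt, steps):
    """Integrate dp/dt = -dV/dq, dq/dt = p/m with the leapfrog (kick-drift-kick) scheme."""
    for _ in range(steps):
        p -= 0.5 * dt * dV(q)   # half step in momentum, eq. (3.36)
        q += dt * p / m         # full step in position, eq. (3.37)
        p -= 0.5 * dt * dV(q)   # second half step in momentum
    return p, q

def energy(p, q, V, m):
    """Hamiltonian h(p, q) = p^2 / 2m + V(q) of the form (3.38), in one dimension."""
    return p * p / (2.0 * m) + V(q)

# Illustrative choices (assumptions): unit mass and the harmonic potential V(q) = q^2 / 2.
m = 1.0
V = lambda q: 0.5 * q * q
dV = lambda q: q

p0, q0 = 0.0, 1.0
p1, q1 = leapfrog(p0, q0, dV, m, dt=1e-3, steps=1000)   # evolve up to t = 1

# For this potential the exact solution is q(t) = cos t, p(t) = -sin t.
position_error = abs(q1 - math.cos(1.0))
momentum_error = abs(p1 + math.sin(1.0))
energy_drift = abs(energy(p1, q1, V, m) - energy(p0, q0, V, m))
print(position_error, momentum_error, energy_drift)
```

The energy drift illustrates the fact that *h* itself is constant along the exact flow; the discrete scheme only reproduces this approximately.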

Proposition 3.8. *For any vector field* ξ *on a manifold X, we say that a function f* ∈ *C*<sup>∞</sup>(*X*) *is* conserved *if f is constant along the flow of* ξ*. If X is a Poisson manifold and* ξ = ξ<sub>*h*</sub> *is Hamiltonian, then f is conserved iff* {*h*, *f*} = 0*.*

The proof is immediate: by (3.21) and (3.26), *d*/*dt* *f*(ϕ<sub>*t*</sub>(*x*)) = {*h*, *f*}(ϕ<sub>*t*</sub>(*x*)). A Poisson bracket on *X* may also be defined in terms of a *Poisson tensor*. In coordinates, this is just an anti-symmetric matrix *B*<sup>*ij*</sup>(*x*) that satisfies

$$\sum\_{l} \left( B^{li} \frac{\partial B^{jk}}{\partial x^{l}} + B^{lj} \frac{\partial B^{ki}}{\partial x^{l}} + B^{lk} \frac{\partial B^{ij}}{\partial x^{l}} \right) = 0,\tag{3.40}$$

for each (*i*, *j*, *k*). In terms of *B*, the Poisson bracket is then defined abstractly by

$$\{f, g\} = B(df, dg),\tag{3.41}$$

using standard notation of differential geometry, or, in coordinates, by

$$\{f, g\}(x) = \sum\_{i, j} B^{ij}(x) \frac{\partial f(x)}{\partial x^{i}} \frac{\partial g(x)}{\partial x^{j}}.\tag{3.42}$$

Conversely, a Poisson bracket must come from a Poisson tensor: for any derivation δ on *C*<sup>∞</sup>(*X*), the function δ(*g*) depends linearly on *dg*, so if δ<sub>*f*</sub>(*g*) = {*f*,*g*}, then δ<sub>*f*</sub>(*g*) = −δ<sub>*g*</sub>(*f*), so that {*f*,*g*} depends linearly on both *df* and *dg*. This enforces (3.42), upon which (3.41) implies (3.40). A nice example is *X* = R<sup>3</sup>, with **x** = (*x*, *y*, *z*) and

$$\begin{split} \{f,g\}(\mathbf{x}) &= x \left( \frac{\partial f}{\partial y} \frac{\partial g}{\partial z} - \frac{\partial f}{\partial z} \frac{\partial g}{\partial y} \right) + y \left( \frac{\partial f}{\partial z} \frac{\partial g}{\partial x} - \frac{\partial f}{\partial x} \frac{\partial g}{\partial z} \right) + z \left( \frac{\partial f}{\partial x} \frac{\partial g}{\partial y} - \frac{\partial f}{\partial y} \frac{\partial g}{\partial x} \right); \\ B^{ij}(\mathbf{x}) &= \sum\_{k} \epsilon\_{kij} x^{k}. \end{split} \tag{3.43}$$

Finally, we say that a Poisson manifold is *symplectic* if the corresponding Poisson tensor *B*(*x*) is given by an *invertible* matrix, for each *x* ∈ *X*. This requires *X* to be *even-dimensional*. For example, R<sup>2*n*</sup> with Poisson bracket (3.34) is symplectic.
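As a sanity check on the R<sup>3</sup> example, the following sketch evaluates the Poisson tensor of (3.43) at an arbitrary integer point and verifies antisymmetry, the condition (3.40), and the vanishing of the determinant (so this Poisson structure is not symplectic, as expected in odd dimension). It assumes that the symbol in the second line of (3.43) is the Levi-Civita symbol; the helper names `eps`, `poisson_tensor`, `jacobi_defect`, and `det3` are hypothetical.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def poisson_tensor(x):
    """B^{ij}(x) = sum_k eps_{kij} x^k, the tensor of the bracket (3.43)."""
    return [[sum(eps(k, i, j) * x[k] for k in range(3)) for j in range(3)]
            for i in range(3)]

def jacobi_defect(x, i, j, k):
    """Left-hand side of (3.40); dB^{jk}/dx^l = eps_{ljk} because B is linear in x."""
    b = poisson_tensor(x)
    return sum(b[l][i] * eps(l, j, k) + b[l][j] * eps(l, k, i) + b[l][k] * eps(l, i, j)
               for l in range(3))

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

x = (2, -3, 5)   # arbitrary integer test point, so all arithmetic is exact
b = poisson_tensor(x)
antisymmetric = all(b[i][j] == -b[j][i] for i in range(3) for j in range(3))
jacobi_holds = all(jacobi_defect(x, i, j, k) == 0
                   for i in range(3) for j in range(3) for k in range(3))
print(antisymmetric, jacobi_holds, det3(b))
```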

#### 3.3 Symmetries of Poisson manifolds

Two equivalent notions of symmetries of classical physics suggest themselves: one is based on the idea of a Poisson *manifold* (*X*,*B*), the other comes from the equivalent notion of a Poisson *algebra* (*C*∞(*X*),{, }).

Definition 3.9. *1. A symmetry of a Poisson manifold* (*X*,*B*) *is a diffeomorphism* ϕ : *X* → *X (that is, an invertible smooth map with smooth inverse) satisfying*

$$
\varphi\_\* B = B.\tag{3.44}
$$

*2. A symmetry of a Poisson algebra* (*C*<sup>∞</sup>(*X*),{, }) *is an invertible linear map* α : *C*<sup>∞</sup>(*X*) → *C*<sup>∞</sup>(*X*) *that satisfies (for each f*, *g* ∈ *C*<sup>∞</sup>(*X*)*):*

$$
\alpha(fg) = \alpha(f)\alpha(g);\tag{3.45}
$$

$$\alpha(\{f,g\}) = \{\alpha(f), \alpha(g)\}.\tag{3.46}$$

Let us define the push-forward ϕ<sub>∗</sub> in (3.44). We do this in terms of the *pullback* ϕ<sup>∗</sup> of a smooth (i.e., infinitely often differentiable) map ϕ : *X* → *X*, defined as

$$\varphi^\* : C^\infty(X) \to C^\infty(X);\tag{3.47}$$

$$
\varphi^\* f = f \circ \varphi.\tag{3.48}
$$

If ϕ is a diffeomorphism, the *push-forward* ϕ<sub>∗</sub> of ϕ, which acts on derivations, is

$$\varphi\_\* : \text{Der}(C^\infty(X)) \to \text{Der}(C^\infty(X));\tag{3.49}$$

$$(\varphi\_\*\delta)(f) = \delta(\varphi^\*f) \circ \varphi^{-1};\tag{3.50}$$

this may be checked to define a derivation, as follows:

$$\begin{aligned} (\varphi\_\*\delta)(f\cdot g) &= (\varphi^{-1})^\*\delta(\varphi^\*(f\cdot g)) \\ &= (\varphi^{-1})^\*\delta(\varphi^\*(f)\varphi^\*(g)) \\ &= (\varphi^{-1})^\*(\delta(\varphi^\*(f))\varphi^\*(g) + \varphi^\*(f)\delta(\varphi^\*(g))) \\ &= (\varphi\_\*\delta)(f)\cdot g + f\cdot(\varphi\_\*\delta)(g). \end{aligned}$$

If, given coordinates *x* = (*x*<sup>1</sup>,..., *x*<sup>*k*</sup>) on *X*, we now (without loss of generality) take our derivation δ to be a vector field ξ = ∑<sub>*j*</sub> ξ<sup>*j*</sup> ∂/∂*x*<sup>*j*</sup>, and write ϕ(*x*) = (ϕ<sup>1</sup>(*x*),...,ϕ<sup>*k*</sup>(*x*)), for the image ϕ<sub>∗</sub>(ξ) we obtain

$$\begin{aligned} (\varphi\_\*\xi)(f)(x) &= (\xi(\varphi^\*f))(\varphi^{-1}(x)) \\ &= \sum\_j \xi^j (\varphi^{-1}(x)) \left( \frac{\partial}{\partial x^j} f \circ \varphi \right) (\varphi^{-1}(x)) \\ &= \sum\_{j,k} \xi^k (\varphi^{-1}(x)) \frac{\partial f(x)}{\partial x^j} \frac{\partial \varphi^j}{\partial x^k} (\varphi^{-1}(x)), \end{aligned}$$

so that

$$(\varphi\_\*\xi)^j(x) = \sum\_k \frac{\partial \varphi^j}{\partial x^k}(\varphi^{-1}(x))\, \xi^k(\varphi^{-1}(x)), \tag{3.51}$$

or, equivalently,

$$(\varphi\_\*\xi)^j(\varphi(x)) = \sum\_k \frac{\partial \varphi^j}{\partial x^k}(x)\, \xi^k(x), \tag{3.52}$$

which only depends on ξ(*x*), so that for each *x* ∈ *X*, ϕ<sub>∗</sub> may be localized to a linear map ϕ<sub>∗</sub>(*x*) : *T*<sub>*x*</sub>*X* → *T*<sub>ϕ(*x*)</sub>*X*. This may be done even if ϕ is not invertible. Physicists often write this as ϕ(*x*) ≡ *y* = *y*(*x*<sup>1</sup>,..., *x*<sup>*k*</sup>), ξ = *v*, ϕ<sub>∗</sub>ξ = *v*′, so that we have a "covariant" transformation rule

$$(v')^i(y) = \sum\_{j=1}^k \frac{\partial y^i(x)}{\partial x^j} v^j(x).$$

Taking tensor products, one obtains similar rules for higher-order tensors. For example, if *N* = *X*, the transformation rule for the Poisson tensor *B* reads

$$\varphi\_\* B^{ij}(\varphi(x)) = \sum\_{m,n=1}^{k} \frac{\partial \varphi^i(x)}{\partial x^m} \frac{\partial \varphi^j(x)}{\partial x^n} B^{mn}(x), \tag{3.53}$$

so that, in coordinates, the invariance requirement (3.44) reads

$$\sum\_{m,n=1}^{k} \frac{\partial \varphi^i(x)}{\partial x^m} \frac{\partial \varphi^j(x)}{\partial x^n} B^{mn}(x) = B^{ij}(\varphi(x)).\tag{3.54}$$
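For a linear map ϕ(*x*) = *Mx* and a constant Poisson tensor, the Jacobian in (3.54) is just *M*, so the invariance requirement becomes the matrix identity *MBM*<sup>*T*</sup> = *B*. The sketch below tests this for the canonical bracket (3.34) with *n* = 1, in the coordinate order (*p*, *q*). The three test matrices are illustrative choices, and `matmul`, `transpose`, and `is_poisson_symmetry` are hypothetical helper names.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

# Canonical Poisson tensor of (3.34) for n = 1 in the coordinates (p, q): {p, q} = 1.
B = [[0, 1], [-1, 0]]

def is_poisson_symmetry(M):
    """For a linear map phi(x) = M x and constant B, condition (3.54) reads M B M^T = B."""
    return matmul(matmul(M, B), transpose(M)) == B

shear = [[1, 0], [3, 1]]       # (p, q) -> (p, q + 3p), determinant 1
rotation = [[0, -1], [1, 0]]   # quarter turn in the (p, q) plane, determinant 1
dilation = [[2, 0], [0, 2]]    # (p, q) -> (2p, 2q), determinant 4

print(is_poisson_symmetry(shear), is_poisson_symmetry(rotation), is_poisson_symmetry(dilation))
```

In two dimensions *MBM*<sup>*T*</sup> = det(*M*) *B*, so exactly the area-preserving linear maps pass the test; integer entries keep the comparison exact.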

Theorem 3.10. *The two parts of Definition 3.9 are equivalent, in that:*

*1. Given a diffeomorphism* ϕ : *X* → *X satisfying* (3.44)*, the map*

$$
\alpha = \varphi^\*,
\tag{3.55}
$$


Here an *anti-isomorphism* of groups is just an isomorphism that inverts the order of multiplication. This complication may be removed by writing ϕ<sup>−1</sup> instead of ϕ in (3.55), but that change would make the next proposition a bit less natural.

*Proof.* The first claim is true by construction. The hard part is the second claim, which follows from a more general result about manifolds (note that in our terminology, manifolds are by definition assumed to be Hausdorff):

Proposition 3.11. *Let X and Y be a smooth manifolds. Then* (3.55) *establishes a bijective correspondence between linear maps* <sup>α</sup> :*C*∞(*X*) <sup>→</sup>*C*∞(*Y*) *satisfying* (3.45) *and smooth maps* ϕ : *Y* → *X.*

The proof is quite similar to a central part of the proof of Gelfand duality for commutative C\*-algebras, in which (3.55) establishes a bijective correspondence between C\*-homomorphisms α : *C*(*X*) → *C*(*Y*) and continuous maps ϕ : *Y* → *X*, where *X* and *Y* are compact Hausdorff spaces; see §C.3 and especially Proposition C.22.

For any commutative real algebra *A*, let Σ(*A*) be the space of non-zero algebra homomorphisms ω : *A* → R (these are just the non-zero multiplicative linear maps), equipped with the weakest topology that makes each function *â* : Σ(*A*) → R continuous, where *â*(ω) = ω(*a*). Furthermore, if *B* is another commutative real algebra, then any homomorphism α : *A* → *B* induces a continuous map α<sup>∗</sup> : Σ(*B*) → Σ(*A*) in the obvious way, that is, by α<sup>∗</sup>ω = ω ◦ α. In the special case *A* = *C*<sup>∞</sup>(*X*) (and similarly if *A* = *C*(*X*)), one has a canonical map ev<sup>*X*</sup> : *X* → Σ(*C*(*X*)), given by ev<sup>*X*</sup><sub>*x*</sub>(*f*) = *f*(*x*). The whole point (in which the entire difficulty of the proof lies) is that this map is a bijection (see Proposition C.21), which simultaneously equips *X* with a smooth structure that makes ev<sup>*X*</sup> a diffeomorphism (by definition of the smooth structure on Σ(*C*(*X*))). In view of all this, given a multiplicative linear map α : *C*<sup>∞</sup>(*X*) → *C*<sup>∞</sup>(*Y*), we obtain a continuous map ϕ : *Y* → *X* by

$$\varphi = (\mathrm{ev}^X)^{-1} \circ \alpha^\* \circ \mathrm{ev}^Y. \tag{3.56}$$

Eq. (3.55) then holds by construction. Smoothness of ϕ, then, is a consequence of the fact that α(*f*) = *f* ◦ ϕ must be a smooth function on *Y* for any *f* ∈ *C*<sup>∞</sup>(*X*).

Applying this to the setting of Theorem 3.10 easily yields all claims. □

In what follows, we look at smooth actions of Lie groups on (Poisson) manifolds *X*, in other words, at homomorphisms ϕ : *G* → Diff(*X*) or ϕ : *G* → Diff(*X*,*B*), where *G* is a Lie group, Diff(*X*) is the group of all diffeomorphisms of a manifold, and Diff(*X*,*B*) is the group of all diffeomorphisms of a Poisson manifold preserving the Poisson structure. Foregoing the underlying differential geometry, we take a pragmatic attitude and only study *linear Lie groups*, defined as closed subgroups *G* of *GL*<sub>*n*</sub>(R) or *GL*<sub>*n*</sub>(C), with group multiplication given by matrix multiplication and hence group inverse being matrix inverse. Here one may think of *SU*(2) ⊂ *GL*<sub>2</sub>(C) or *SO*(3) ⊂ *GL*<sub>3</sub>(R), but abelian Lie groups like the additive groups R<sup>*n*</sup> also fall under this scope, since one may identify *a* ∈ R<sup>*n*</sup> with the 2*n*×2*n*-matrix

$$a \equiv \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix},\tag{3.57}$$

in which case matrix multiplication indeed reproduces addition. Similarly, the (2*n*+1)-dimensional *Heisenberg group* *H*<sub>*n*</sub> is the group of real (*n*+2)×(*n*+2)-matrices

$$(a,b,c) = \begin{pmatrix} 1 & a^T & c + \frac{1}{2} a^T b \\ 0 & 1\_n & b \\ 0 & 0 & 1 \end{pmatrix},\tag{3.58}$$

where *a*, *b* ∈ R<sup>*n*</sup>, *c* ∈ R, and *a*<sup>*T*</sup>*b* = ⟨*a*, *b*⟩; this gives the multiplication rule

$$(a,b,c)\cdot(a',b',c')=(a+a',b+b',c+c'+\frac{1}{2}(\langle a,b'\rangle - \langle a',b\rangle)).\tag{3.59}$$
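The composition law can be checked by multiplying two matrices of the form (3.58) and reading the parameters of the product back off. The following sketch does this for *n* = 1 with exact rational arithmetic; the sample parameters are arbitrary, and `heisenberg`, `matmul`, and `parameters` are hypothetical helper names.

```python
from fractions import Fraction

def heisenberg(a, b, c):
    """The matrix (3.58) for n = 1, with exact rational entries."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    return [[Fraction(1), a, c + a * b / 2],
            [Fraction(0), Fraction(1), b],
            [Fraction(0), Fraction(0), Fraction(1)]]

def matmul(x, y):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def parameters(m):
    """Read (a, b, c) back off a matrix of the form (3.58)."""
    a, b = m[0][1], m[1][2]
    return a, b, m[0][2] - a * b / 2

g, h = heisenberg(1, 2, 3), heisenberg(5, 7, 11)
a, b, c = parameters(matmul(g, h))

# Deviation of the third parameter from c + c' = 14,
# to be compared with the bilinear term in (3.59).
central_shift = c - (3 + 11)
print(a, b, c, central_shift)
```

The first two parameters simply add, while the third picks up a term that is bilinear and antisymmetric in the pairs (*a*, *b*) and (*a*′, *b*′).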

If *G* is a linear Lie group, its *Lie algebra* 𝔤 may be defined as the vector space

$$\mathfrak{g} = \{ A \in M\_n(\mathbb{K}) \mid e^{tA} \in G \forall t \in \mathbb{R} \},\tag{3.60}$$

where K = R or C, as determined by the embedding *G* ⊂ *GL*<sub>*n*</sub>(R) or *G* ⊂ *GL*<sub>*n*</sub>(C). Either way, 𝔤 is seen as a *real* vector space, equipped with the *Lie bracket*

$$[A,B] = AB - BA.\tag{3.61}$$

This is trivially a bilinear antisymmetric map 𝔤 × 𝔤 → 𝔤 satisfying the Jacobi identity

$$[A, [B, C]] + [C, [A, B]] + [B, [C, A]] = 0,\tag{3.62}$$

which in turn expresses the fact that for fixed *A* ∈ 𝔤 the map δ<sub>*A*</sub> : 𝔤 → 𝔤 defined by

$$
\delta\_A(B) = [A, B] \tag{3.63}
$$

is a derivation of 𝔤 with respect to its Lie bracket, i.e.,

$$\delta\_A([B,C]) = [\delta\_A(B),C] + [B,\delta\_A(C)].\tag{3.64}$$

The *exponential map* exp : 𝔤 → *G* is then just given by its usual power series, which for matrices is norm-convergent. Conversely, one may pass from *G* to 𝔤 through

$$A = \frac{d}{dt}(e^{tA})\_{|t=0}.\tag{3.65}$$

If *G* = R<sup>*n*</sup>, we also have 𝔤 = R<sup>*n*</sup>, and eq. (3.57) implies that exp is the identity map.

For example, since *SO*(3) is the subgroup of *GL*<sub>3</sub>(R) consisting of matrices *R* that satisfy *R*<sup>*T*</sup>*R* = 1<sub>3</sub> and det(*R*) = 1, its Lie algebra so(3) consists of all matrices *a* that satisfy *a*<sup>*T*</sup> = −*a*. As a vector space we have so(3) ≅ R<sup>3</sup>, which follows by choosing a basis

$$J\_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, J\_2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, J\_3 = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \tag{3.66}$$

of the 3×3 real antisymmetric matrices. The commutators of these elements are

$$\left[J\_1, J\_2\right] = J\_3; \ \left[J\_3, J\_1\right] = J\_2; \ \left[J\_2, J\_3\right] = J\_1. \tag{3.67}$$
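Both the commutation relations (3.67) and the exponential map are easy to verify by direct matrix arithmetic. The sketch below uses plain nested lists; `matmul`, `commutator`, and `expm` are hypothetical helper names, and the truncation of the power series after 30 terms is an assumption that is more than sufficient for the small angle used.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def commutator(a, b):
    """Lie bracket (3.61): [A, B] = AB - BA."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def expm(a, terms=30):
    """Matrix exponential by its power series, truncated after a fixed number of terms."""
    result = [[float(i == j) for j in range(3)] for i in range(3)]
    term = [row[:] for row in result]
    for n in range(1, terms):
        term = [[v / n for v in row] for row in matmul(term, a)]
        result = [[r + s for r, s in zip(rr, ss)] for rr, ss in zip(result, term)]
    return result

# The basis (3.66) of so(3).
J1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
J2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
J3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

relations_ok = (commutator(J1, J2) == J3 and commutator(J3, J1) == J2
                and commutator(J2, J3) == J1)

# exp(t J3) should be the rotation by the angle t about the third axis.
t = 0.7
R = expm([[t * v for v in row] for row in J3])
rotation = [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t), math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]
max_error = max(abs(R[i][j] - rotation[i][j]) for i in range(3) for j in range(3))
print(relations_ok, max_error)
```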

For the Lie algebra of the Heisenberg group we obtain 𝔥<sub>*n*</sub> = R<sup>2*n*+1</sup>, with basis

$$P\_i = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -\mathbf{e}\_i \\ 0 & 0 & 0 \end{pmatrix}, Q\_j = \begin{pmatrix} 0 & \mathbf{e}\_j^T & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, Z = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},\tag{3.68}$$

where (**e**<sub>1</sub>,..., **e**<sub>*n*</sub>) is the usual basis of R<sup>*n*</sup>, satisfying commutation relations

$$[P\_i, Q\_j] = \delta\_{ij} Z; \ [P\_i, P\_j] = [Q\_i, Q\_j] = [P\_i, Z] = [Q\_j, Z] = 0. \tag{3.69}$$

#### 3.4 The momentum map

Leaving out the Poisson structure for the moment, let *X* be a manifold, let *G* be a Lie group, and let ϕ : *G* → Diff(*X*) be a homomorphism; as already mentioned, this corresponds to a smooth action ϕ̃ : *G* × *X* → *X*, which we simply write as

$$
\gamma \cdot x \equiv \varphi\_{\gamma}(x) \equiv \tilde{\varphi}(\gamma, x) \,.
$$

In terms of the pullback ϕ<sup>∗</sup><sub>γ</sub>(*f*) = *f* ◦ ϕ<sub>γ</sub>, we then automatically have

$$
\varphi\_{\gamma}^\*(fg) = \varphi\_{\gamma}^\*(f)\varphi\_{\gamma}^\*(g). \tag{3.70}
$$

For each *A* ∈ 𝔤 we then define a map δ<sub>*A*</sub> : *C*<sup>∞</sup>(*X*) → *C*<sup>∞</sup>(*X*) by

$$
\delta\_A f(x) = \frac{d}{dt} f(e^{-tA} \cdot x)\_{|t=0}. \tag{3.71}
$$

This map is obviously linear. Moreover, it can be shown that δ is well behaved:

Proposition 3.12. *The map* δ : 𝔤 → Der(*C*<sup>∞</sup>(*X*))*, A* ↦ δ<sub>*A*</sub>*, is a homomorphism of Lie algebras, i.e., each* δ<sub>*A*</sub> *is a derivation,* δ *is linear in A, and, for each A*, *B* ∈ 𝔤*,*

$$[\delta\_A, \delta\_B] = \delta\_{[A,B]}.\tag{3.72}$$

The proof relies on *Hadamard's Lemma*, which we only need for complete vector fields, or, equivalently, for derivations with complete flow (i.e., defined for all *t*).

Lemma 3.13. *If* δ *is a derivation of C*<sup>∞</sup>(*X*) *with complete flow* ϕ*, and f* ∈ *C*<sup>∞</sup>(*X*)*, then there is a function g*(*t*, *x*) ≡ *g*<sub>*t*</sub>(*x*) *such that for all x and t,*

$$g\_0(x) = \delta f(x);\tag{3.73}$$

$$f(\varphi\_t(x)) = f(x) + t g\_t(x).\tag{3.74}$$

Indeed, if the flow is complete one may take

$$g\_t(x) = \int\_0^1 ds\, \dot{F}(st, x),\tag{3.75}$$

where *F*(*t*, *x*) = *f*(ϕ<sub>*t*</sub>(*x*)) and (in Newton's notation) *Ḟ* is the time derivative of *F*.

*Proof.* To prove that δ<sub>*A*</sub> is linear in *A*, let ϕ be the flow of δ<sub>*A*</sub>, i.e., ϕ<sub>*t*</sub>(*x*) = *e*<sup>−*tA*</sup>*x*. For *B* ∈ 𝔤, Hadamard's Lemma with δ ⇝ δ<sub>*A*</sub> and *x* ⇝ *e*<sup>−*tB*</sup>*x* then gives us

$$f(e^{-tA}e^{-tB}x) = f(\varphi\_t(e^{-tB}x)) = f(e^{-tB}x) + t g\_t(e^{-tB}x);$$

$$\Rightarrow \frac{d}{dt}f(e^{-tA}e^{-tB}x)\_{|t=0} = \delta\_B f(x) + g\_0(x) = \delta\_B f(x) + \delta\_A f(x).\tag{3.76}$$

On the other hand, since *A* and *B* are matrices, we may use the CBH-formula

$$e^{-tA}e^{-tB} = e^{-t(A+B) + \frac{1}{2}t^2[A,B] + O(t^3)},\tag{3.77}$$

which gives *e*<sup>−*tA*</sup>*e*<sup>−*tB*</sup> = *e*<sup>−*t*(*A*+*B*)</sup>(1 + *O*(*t*<sup>2</sup>)), and hence

$$\frac{d}{dt}f(e^{-tA}e^{-tB}x)\_{|t=0} = \frac{d}{dt}f(e^{-t(A+B)}x)\_{|t=0} = \delta\_{A+B}f(x).\tag{3.78}$$

Comparing (3.76) with (3.78) gives δ<sub>*A*+*B*</sub> = δ<sub>*A*</sub> + δ<sub>*B*</sub>. The property δ<sub>*sA*</sub> = *s*δ<sub>*A*</sub> is trivial. We now prove (3.72). Within the (matrix) Lie algebra 𝔤 we have

$$[A, B] = -\frac{d}{dt}(e^{-tA}Be^{tA})\_{|t=0} = -\lim\_{t \to 0} \frac{e^{-tA}Be^{tA} - B}{t}. \tag{3.79}$$

Furthermore, for any *g* ∈ *G* one has *e*<sup>*gBg*<sup>−1</sup></sup> = *ge*<sup>*B*</sup>*g*<sup>−1</sup>, so linearity of δ gives

$$\begin{split} \delta\_{[A,B]}f(x) &= -\lim\_{t\to 0} \frac{1}{t} \left( \delta\_{e^{-tA}Be^{tA}}f(x) - \delta\_{B}f(x) \right) \\ &= \lim\_{t\to 0} \frac{1}{t} \left( \frac{d}{ds} f(e^{-tA}e^{sB}e^{tA}x) - \frac{d}{ds} f(e^{sB}x) \right) \\ &= \lim\_{s,t\to 0} \frac{1}{st} \left( f(e^{-tA}e^{sB}e^{tA}x) - f(e^{-tA}e^{tA}e^{sB}x) \right) \\ &= \lim\_{s,t\to 0} \frac{1}{st} \left( f \circ \varphi\_{t} (e^{sB}e^{tA}x) - f \circ \varphi\_{t} (e^{tA}e^{sB}x) \right) \\ &= \lim\_{s,t\to 0} \left( \frac{1}{st} \left( f(e^{sB}e^{tA}x) - f(e^{tA}e^{sB}x) \right) + \frac{1}{s} \left( g\_{t}(e^{sB}e^{tA}x) - g\_{t}(e^{tA}e^{sB}x) \right) \right) \\ &= [\delta\_{A}, \delta\_{B}]f(x), \end{split}$$

since in the limit *t* → 0 the third term in the penultimate line cancels the fourth. □

Now suppose that, in addition, *X* is a Poisson manifold, and that each ϕ<sub>γ</sub> acts on *X* as a Poisson symmetry, in that

$$
(\varphi\_{\gamma})\_\* B = B, \tag{3.80}
$$

cf. (3.44), or, equivalently, cf. (3.46),

$$\varphi\_{\gamma}^\*(\{f,g\}) = \{\varphi\_{\gamma}^\*(f), \varphi\_{\gamma}^\*(g)\}.\tag{3.81}$$

This implies, for each *A* ∈ 𝔤, and each *f*, *g* ∈ *C*<sup>∞</sup>(*X*),

$$\delta\_A(\{f,g\}) = \{\delta\_A(f), g\} + \{f, \delta\_A(g)\}.\tag{3.82}$$

Compare this with the following property δ<sub>*A*</sub> already has since it is a derivation:

$$
\delta\_A(fg) = \delta\_A(f)g + f\delta\_A(g).\tag{3.83}
$$

We may call a derivation δ : *C*<sup>∞</sup>(*X*) → *C*<sup>∞</sup>(*X*) satisfying the like of (3.82), i.e.,

$$\delta(\{f,g\}) = \{\delta(f),g\} + \{f,\delta(g)\},\tag{3.84}$$

a *Poisson derivation*. We are already familiar with a large class of Poisson derivations: for each *h* ∈ *C*<sup>∞</sup>(*X*), the corresponding map δ<sub>*h*</sub> defined by (3.26) is a Poisson derivation (this follows from the Jacobi identity). Let us call a Poisson derivation of the kind δ<sub>*h*</sub> *inner*. This raises the question whether our derivations δ<sub>*A*</sub> are inner.

Definition 3.14. *A* momentum map *for a Lie group G acting on a Poisson manifold X is a map*

$$J: X \to \mathfrak{g}^\* \tag{3.85}$$

*such that for each A* ∈ 𝔤*,*

$$
\delta\_A = \delta\_{J\_A},\tag{3.86}
$$

*where the function J*<sub>*A*</sub> ∈ *C*<sup>∞</sup>(*X*) *is defined by*

$$J\_A(x) = \langle J(x), A \rangle \equiv J(x)(A). \tag{3.87}$$

*In other words, for each A* ∈ 𝔤 *and f* ∈ *C*<sup>∞</sup>(*X*) *we must have*

$$
\delta\_A(f) = \{J\_A, f\}. \tag{3.88}
$$

*A Lie group action admitting a momentum map is called* Hamiltonian*.*

Equivalently, a momentum map is a linear map

$$J^\*: \mathfrak{g} \to C^\infty(X) \tag{3.89}$$

such that δ<sub>*A*</sub> = δ<sub>*J*<sup>∗</sup>(*A*)</sub>; the connection between the two definitions is given by

$$J\_A = J^\*(A). \tag{3.90}$$

The pullback notation *J*<sup>∗</sup> would suggest that it is a map *C*<sup>∞</sup>(𝔤<sup>∗</sup>) → *C*<sup>∞</sup>(*X*), which is not quite the case, but it is a near miss: we embed 𝔤 ↪ *C*<sup>∞</sup>(𝔤<sup>∗</sup>) by *A* ↦ *Â*, where *Â*(θ) = θ(*A*), so *J*<sup>∗</sup> : 𝔤 → *C*<sup>∞</sup>(*X*) is the restriction of the pullback *J*<sup>∗</sup> to 𝔤. Another near miss would be to read *J*<sup>∗</sup> as the adjoint to *J*, which maps 𝔤<sup>∗∗</sup> ≅ 𝔤 to the 'dual' *X*<sup>∗</sup>, but since *X* may not be a vector space, this dual cannot be defined as in linear algebra, so instead of all linear maps from *X* to R we might as well say that it consists of all smooth functions on *X*. Either way, the symbol *J*<sup>∗</sup> seems justified.

Proposition 3.15. *Let G be a connected Lie group that acts on a Poisson manifold X. If this action is Hamiltonian (i.e., if it has a momentum map), then G acts on* (*X*,*B*) *by Poisson symmetries (in the sense that* (3.81) *holds).*

*Proof.* An easy computation shows that (3.82) holds. We omit the proof of the fact that for *connected* Lie groups this "infinitesimal" property is equivalent to (3.81); this relies on the fact that *G* is generated by the image of the exponential map. □

The converse is not true: if *G* acts by Poisson symmetries, the action is not necessarily Hamiltonian. For example, take *X* = R<sup>2</sup>, with the unusual Poisson bracket

$$\{f, g\}(p, q) = p \left( \frac{\partial f}{\partial p} \frac{\partial g}{\partial q} - \frac{\partial f}{\partial q} \frac{\partial g}{\partial p} \right), \tag{3.91}$$

and let *G* = R act on R<sup>2</sup> by *b* · (*p*, *q*) = (*p*, *q* + *b*). This action satisfies (3.81), and has a single generator δ = −∂/∂*q*. But there clearly is no function *J* ∈ *C*<sup>∞</sup>(R<sup>2</sup>) such that {*J*, *f*} = −∂*f*/∂*q* (it should be *J*(*p*, *q*) = −log(*p*), which is singular at *p* = 0).
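The counterexample can be probed numerically. The sketch below evaluates the bracket (3.91) by central differences and confirms that, at a point with *p* > 0, the candidate *J*(*p*, *q*) = −log(*p*) indeed generates −∂/∂*q*, while it is undefined at *p* = 0. The test function `f`, the step size, and the helper name `bracket` are illustrative assumptions.

```python
import math

def bracket(f, g, p, q, h=1e-5):
    """The bracket (3.91), p (df/dp dg/dq - df/dq dg/dp), with central differences."""
    dfp = (f(p + h, q) - f(p - h, q)) / (2 * h)
    dfq = (f(p, q + h) - f(p, q - h)) / (2 * h)
    dgp = (g(p + h, q) - g(p - h, q)) / (2 * h)
    dgq = (g(p, q + h) - g(p, q - h)) / (2 * h)
    return p * (dfp * dgq - dfq * dgp)

def J(p, q):
    """Candidate generator J(p, q) = -log(p); only defined for p > 0."""
    return -math.log(p)

def f(p, q):
    """Arbitrary smooth test function (a hypothetical choice)."""
    return p * p * math.sin(q)

p, q = 0.5, 1.2
lhs = bracket(J, f, p, q)     # {J, f} at (p, q)
rhs = -p * p * math.cos(q)    # -df/dq at (p, q), computed by hand
print(abs(lhs - rhs))

# At p = 0 the candidate J is not defined, so it is not a smooth function on all of R^2.
try:
    J(0.0, q)
    defined_at_zero = True
except ValueError:
    defined_at_zero = False
print(defined_at_zero)
```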

However, in most "everyday situations" momentum maps exist:

a. Let *G* = R<sup>6</sup> act on *X* by

$$(\mathbf{a}, \mathbf{b}) \cdot (\mathbf{p}, \mathbf{q}) = (\mathbf{p} + \mathbf{a}, \mathbf{q} + \mathbf{b}). \tag{3.92}$$

This action is Hamiltonian, with momentum map

$$J(\mathbf{p}, \mathbf{q}) = (\mathbf{q}, -\mathbf{p}). \tag{3.93}$$

b. Let *G* = *SO*(3) act on the same space *X* by

$$R \cdot (\mathbf{p}, \mathbf{q}) = (R\mathbf{p}, R\mathbf{q}).\tag{3.94}$$

Also this action is Hamiltonian, with momentum map

$$J(\mathbf{p}, \mathbf{q}) = \mathbf{p} \times \mathbf{q}. \tag{3.95}$$

2. Let *G* = *SO*(3) act on *X* = R<sup>3</sup>, equipped with the Poisson bracket (3.43), through its defining representation. This action has a momentum map

$$J(\mathbf{x}) = \mathbf{x},\tag{3.96}$$

where we have identified 𝔤 with R<sup>3</sup> by choosing the basis (3.66) of 𝔤, and have identified 𝔤<sup>∗</sup> with 𝔤 (and hence with R<sup>3</sup> also) by the usual inner product on R<sup>3</sup>.

3. The previous example is a special case of the *Lie–Poisson structure*. Let *G* be a Lie group with Lie algebra 𝔤. Choose a basis (*T*<sub>*a*</sub>) of 𝔤, with associated *structure constants* *C*<sup>*c*</sup><sub>*ab*</sub> defined by the Lie bracket on 𝔤 as

$$[T\_a, T\_b] = \sum\_c C\_{ab}^c T\_c.\tag{3.97}$$

We write θ in the dual vector space 𝔤<sup>∗</sup> as θ = ∑<sub>*a*</sub> θ<sub>*a*</sub>ω<sup>*a*</sup>, where (ω<sup>*a*</sup>) is the dual basis to the chosen basis (*T*<sub>*a*</sub>) of 𝔤, i.e., ω<sup>*a*</sup>(*T*<sub>*b*</sub>) = δ<sub>*ab*</sub>. In terms of these coordinates, the *Lie–Poisson bracket* on *C*<sup>∞</sup>(𝔤<sup>∗</sup>) is defined by

$$\{f, g\}(\theta) = \sum\_{a,b,c} C\_{ab}^{c} \theta\_c \frac{\partial f(\theta)}{\partial \theta\_a} \frac{\partial g(\theta)}{\partial \theta\_b}. \tag{3.98}$$

Equivalently, the Poisson bracket (3.98) may be defined by the condition

$$\{\hat{A},\hat{B}\} = \widehat{[A,B]},\tag{3.99}$$

where *A*, *B* ∈ 𝔤 and *Â* ∈ *C*<sup>∞</sup>(𝔤<sup>∗</sup>) is the evaluation map *Â*(θ) = θ(*A*). Now *G* canonically acts on 𝔤<sup>∗</sup> through the *coadjoint representation*, defined by

$$(x \cdot \theta)(A) = \theta(x^{-1}Ax).\tag{3.100}$$

This action is Hamiltonian with respect to the Lie–Poisson bracket (3.98), the associated momentum map simply being the identity map 𝔤<sup>∗</sup> → 𝔤<sup>∗</sup>, as in (3.96). In other words, we have

$$J\_A = \hat{A},\tag{3.101}$$

whose correctness may be verified from the computation

$$\begin{aligned} \delta\_A \hat{B}(\theta) &= \frac{d}{dt} \hat{B}(e^{-tA} \cdot \theta)\_{|t=0} = \frac{d}{dt} \theta(e^{tA} B e^{-tA})\_{|t=0} \\ &= \theta([A, B]) = \widehat{[A, B]}(\theta) = \{\hat{A}, \hat{B}\}(\theta) \\ &= \{J\_A, \hat{B}\}(\theta). \end{aligned}$$

4. Let *X* = *T*<sup>∗</sup>*Q* for some manifold *Q*, e.g., *Q* = R<sup>*n*</sup> and hence *X* = R<sup>2*n*</sup>. We take

$$G = \text{Diff}(Q),\tag{3.102}$$

i.e., the diffeomorphism group of *Q*. This is an infinite-dimensional Lie group (if described in the right way). The defining action of ϕ ∈ *G* on *Q* induces an action called ϕ<sup>∗</sup> on *T*<sup>∗</sup>*Q*, given (in coordinates) by

$$\varphi^\*(p, q) = (p', q');\tag{3.103}$$

$$(q^i)' = \varphi^i(q);\tag{3.104}$$

$$p'\_i = \sum\_{j=1}^n \frac{\partial (\varphi^{-1})^j(q)}{\partial q^i} p\_j. \tag{3.105}$$

This may be taken as a definition, but in the language of differential geometry this comes down to the neater prescription that if θ = ∑<sub>*j*</sub> *p*<sub>*j*</sub>*dq*<sup>*j*</sup> ∈ *T*<sup>∗</sup><sub>*q*</sub>*Q*, then ϕ<sup>∗</sup>θ ∈ *T*<sup>∗</sup><sub>ϕ(*q*)</sub>*Q* is the one-form that maps a vector *X* ∈ *T*<sub>ϕ(*q*)</sub>*Q* to θ(ϕ<sup>−1</sup><sub>∗</sub>(*X*)), i.e.,

$$(\varphi^\*\theta)(X) = \theta(\varphi\_\*^{-1}(X)),\tag{3.106}$$

where ϕ<sup>−1</sup><sub>∗</sub>(*X*) = ∑<sub>*j*</sub> ϕ<sup>−1</sup><sub>∗</sub>(*X*)<sup>*j*</sup> ∂/∂*q*<sup>*j*</sup> is given componentwise by, cf. (3.52),

$$\varphi\_\*^{-1}(X)^j = \sum\_k \frac{\partial (\varphi^{-1})^j(q)}{\partial q^k} X^k. \tag{3.107}$$

If *Q* = R<sup>3</sup> and ϕ = *R* ∈ *SO*(3), then, using *R*<sup>−1</sup> = *R*<sup>*T*</sup>, we find that (3.104) - (3.105) simply become *R*<sup>∗</sup>(**p**, **q**) = (*R***p**, *R***q**), as in (3.94).

Furthermore, if ϕ(**q**) = **q** + **b**, then the partial derivatives in (3.105) form the identity matrix, so that ϕ<sup>∗</sup>(**p**, **q**) = (**p**, **q** + **b**). To show that the action of Diff(*Q*) on *T*<sup>∗</sup>*Q* is Hamiltonian and compute its momentum map, we need to know that the Lie algebra of Diff(*Q*) is the space Vec(*Q*) of all vector fields on *Q*, with its canonical Lie bracket (3.61)! We will not prove this, but the exponential map exp : 𝔤 → *G* is given through the flow ϕ of the vector field ξ on *Q* by (cf. (3.20))

$$e^{t\xi} = \varphi\_t.\tag{3.108}$$

Theorem 3.16. *The action of* Diff(*Q*) *on T*<sup>∗</sup>*Q has momentum map*

$$J\_X(p,q) = -\sum\_j p\_j X^j(q),\tag{3.109}$$

*and hence is Hamiltonian. Moreover, this momentum map satisfies*

$$\{J\_{\xi}, J\_{\eta}\} = -J\_{[\xi, \eta]}.\tag{3.110}$$

*Proof.* First note that ϕ<sub>*t*</sub><sup>−1</sup> = ϕ<sub>−*t*</sub>, so from (3.71), (3.108), and (3.104) - (3.105),

$$\begin{split} \delta\_{\xi} f(p,q) &= \frac{d}{dt} f(\varphi^{\*}\_{-t}(p,q))\_{|t=0} \\ &= \sum\_{i,j} \frac{\partial f}{\partial p\_{i}}(p,q) \frac{d}{dt} \left(\frac{\partial \varphi^{j}\_{t}(q)}{\partial q^{i}}\right)\_{|t=0} p\_{j} + \sum\_{i} \frac{\partial f}{\partial q^{i}}(p,q) \frac{d}{dt} \varphi^{i}\_{-t}(q)\_{|t=0} \\ &= \sum\_{i,j} p\_{j} \frac{\partial X^{j}(q)}{\partial q^{i}} \frac{\partial f}{\partial p\_{i}}(p,q) - \sum\_{j} X^{j}(q) \frac{\partial f}{\partial q^{j}}(p,q). \end{split}$$

From this and (3.109), using the canonical Poisson bracket (3.34) we find

$$\{J\_{\xi},f\} = \delta\_{\xi}f.$$

Finally, verifying (3.110) is a simple exercise. □
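The identity (3.110) can be sanity-checked in the simplest case. The following Python sketch takes *Q* = R and two illustrative vector fields ξ = *X*(*q*) ∂/∂*q* and η = *Y*(*q*) ∂/∂*q*; the choices *X*(*q*) = *q*<sup>2</sup> and *Y*(*q*) = sin *q* are assumptions made here and do not come from the text. It also assumes the sign convention {*f*,*g*} = ∂*f*/∂*p* ∂*g*/∂*q* − ∂*f*/∂*q* ∂*g*/∂*p* for the canonical bracket (3.34), which is the convention under which {*J*<sub>ξ</sub>, *f*} = δ<sub>ξ</sub> *f* comes out of the computation above.

```python
import math

# Illustrative vector fields on Q = R (assumptions, not taken from the text):
# xi = X(q) d/dq and eta = Y(q) d/dq, with their exact derivatives.
def X(q):
    return q * q

def dX(q):
    return 2.0 * q

def Y(q):
    return math.sin(q)

def dY(q):
    return math.cos(q)

def J(component):
    """Momentum map J(p, q) = -p * component(q), cf. (3.109)."""
    return lambda p, q: -p * component(q)

def commutator(q):
    """Component of the commutator [xi, eta] = (X Y' - Y X') d/dq."""
    return X(q) * dY(q) - Y(q) * dX(q)

def bracket_of_momentum_maps(p, q):
    """{J_xi, J_eta} with the assumed convention
    {f, g} = df/dp * dg/dq - df/dq * dg/dp."""
    dJxi_dp, dJxi_dq = -X(q), -p * dX(q)
    dJeta_dp, dJeta_dq = -Y(q), -p * dY(q)
    return dJxi_dp * dJeta_dq - dJxi_dq * dJeta_dp

def check(p, q):
    """Return both sides of (3.110) at the point (p, q)."""
    lhs = bracket_of_momentum_maps(p, q)
    rhs = -J(commutator)(p, q)
    return lhs, rhs

if __name__ == "__main__":
    for point in [(0.7, -1.3), (2.0, 0.4), (-1.5, 3.1)]:
        print(point, check(*point))
```

Both sides reduce to *p*(*XY*′ − *X*′*Y*), so the two printed numbers agree at every sample point up to rounding.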

Thus the momentum map is a generalization of (minus) the momentum, whence its name; the quantity in (3.95) is (minus) the angular momentum. These annoying minus signs could be removed by putting a minus sign in (3.86), but that would have other negative (*sic*) consequences. For example, with our sign choice one often has

$$\{J\_A, J\_B\} = J\_{[A,B]},\tag{3.111}$$

in which case the accompanying map (3.89) is a homomorphism of Lie algebras, or, equivalently, *J* is a morphism with respect to the given Poisson bracket on *X* and the Lie–Poisson bracket on g<sup>∗</sup>. Such a momentum map is called *infinitesimally equivariant*, for if *G* is connected, (3.111) is equivalent to the equivariance property

$$J(\mathbf{g} \cdot \mathbf{x}) = \mathbf{g} \cdot J(\mathbf{x}). \tag{3.112}$$

Here the *G*-action on g<sup>∗</sup> on the right-hand side is the coadjoint representation.

All of this is true for our examples (3.95), (3.96), (3.101), and (3.109); in the latter case we note that the Lie bracket in the Lie algebra of Diff(*Q*) is *minus* the commutator of vector fields. However, (3.111) does not always hold (in which case *a fortiori* also (3.112) fails). For example, it fails for (3.93): if we take the usual basis (e,f) ≡ (*e*<sub>1</sub>, *e*<sub>2</sub>, *e*<sub>3</sub>, *f*<sub>1</sub>, *f*<sub>2</sub>, *f*<sub>3</sub>) of g = R<sup>6</sup> and relabel *e<sub>j</sub>* ≡ *Q<sub>j</sub>* and *f<sub>i</sub>* ≡ −*P<sub>i</sub>*, then

$$J\_{P\_i}(\mathbf{p}, \mathbf{q}) = p\_i;\tag{3.113}$$

$$J\_{Q\_j}(\mathbf{p}, \mathbf{q}) = q\_j,\tag{3.114}$$

cf. (3.93), and hence, although [*P<sub>i</sub>*,*P<sub>j</sub>*] = [*Q<sub>i</sub>*,*Q<sub>j</sub>*] = [*P<sub>i</sub>*,*Q<sub>j</sub>*] = 0, we obtain

$$\{J\_{P\_i}, J\_{P\_j}\} = \{J\_{Q\_i}, J\_{Q\_j}\} = 0;\tag{3.115}$$

$$\{J\_{P\_i}, J\_{Q\_j}\} = \delta\_{ij}1\_{\mathbb{R}^6}.\tag{3.116}$$

Fortunately, in cases like that one can often find a central extension *G*<sup>ϕ</sup> of *G* (see §5.10 below for notation) that acts on *X* through its quotient group *G* and does have an infinitesimally equivariant momentum map. In the case at hand, the Heisenberg group *H*<sub>3</sub> does the job, whose central elements (0,0, *c*) then act trivially on R<sup>6</sup>. In terms of the generators (3.68) we take *J*<sub>*P<sub>i</sub>*</sub> and *J*<sub>*Q<sub>j</sub>*</sub> as in (3.113) - (3.114), and add *J<sub>Z</sub>* = 1<sub>R<sup>6</sup></sub>; according to (3.69) and (3.115) - (3.116) we then have (3.111), as desired.

Finally, the above formalism leads to a clean formulation of *Noether's Theorem*, providing the well-known link between symmetries and conserved quantities:

Theorem 3.17. *Let X be a Poisson manifold equipped with a Hamiltonian action of some Lie group G (so that there is a momentum map J* : *X* → g<sup>∗</sup>*). Suppose h* ∈ *C*<sup>∞</sup>(*X*) *is G-invariant, in that h*(γ · *x*) = *h*(*x*) *for each* γ ∈ *G and x* ∈ *X. Then for each A* ∈ g*, the function J<sub>A</sub> is constant along the flow of the vector field X<sub>h</sub>. In other words,*

$$J\_A(\varphi\_t(x)) = J\_A(x)\tag{3.117}$$

*for any x* ∈ *X and any t* ∈ R *for which the flow* ϕ<sub>*t*</sub>(*x*) *of X<sub>h</sub> is defined.*

*Proof.* Using all assumptions as well as the definition of a flow, we compute:

$$\begin{split} \frac{d}{dt}J\_A(\varphi\_t(x)) &= X\_h(J\_A)(\varphi\_t(x)) = \delta\_h(J\_A)(\varphi\_t(x)) \\ &= \{h, J\_A\}(\varphi\_t(x)) = -\{J\_A, h\}(\varphi\_t(x)) \\ &= -\delta\_A(h)(\varphi\_t(x)) = \frac{d}{ds}h(e^{sA}\varphi\_t(x))\_{|s=0} \\ &= \frac{d}{ds}h(\varphi\_t(x))\_{|s=0} = 0. \qquad \Box \end{split}$$

For example, a Hamiltonian (3.38) has conserved (angular) momentum if the potential *V* is translation (rotation) invariant, reflecting (3.93) and (3.95), respectively.
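As a numerical illustration of Theorem 3.17, the following Python sketch integrates a Hamiltonian of the assumed form *h* = |p|<sup>2</sup>/2 + *V*(q) on *T*<sup>∗</sup>R<sup>2</sup> with a rotation-invariant potential and monitors the quantity *q*<sup>1</sup>*p*<sub>2</sub> − *q*<sup>2</sup>*p*<sub>1</sub>. The quartic potential, the unit mass, the initial data, and the leapfrog integrator are all illustrative assumptions; none of them is fixed by the text. Leapfrog is used because each of its sub-steps preserves this quantity exactly for a central force, so only rounding errors remain.

```python
def grad_V(q1, q2):
    """Gradient of the rotation-invariant potential V(q) = |q|^4 / 4
    (an illustrative choice, not taken from the text)."""
    r2 = q1 * q1 + q2 * q2
    return r2 * q1, r2 * q2

def angular_momentum(q, p):
    """The conserved quantity q^1 p_2 - q^2 p_1."""
    return q[0] * p[1] - q[1] * p[0]

def leapfrog(q, p, dt, steps):
    """Approximate the flow of h = |p|^2 / 2 + V(q) (unit mass assumed)."""
    q1, q2 = q
    p1, p2 = p
    for _ in range(steps):
        g1, g2 = grad_V(q1, q2)
        p1 -= 0.5 * dt * g1          # half kick: p changes along q
        p2 -= 0.5 * dt * g2
        q1 += dt * p1                # drift: q changes along p
        q2 += dt * p2
        g1, g2 = grad_V(q1, q2)
        p1 -= 0.5 * dt * g1          # second half kick
        p2 -= 0.5 * dt * g2
    return (q1, q2), (p1, p2)

if __name__ == "__main__":
    q0, p0 = (1.0, 0.0), (0.3, 0.8)
    qT, pT = leapfrog(q0, p0, dt=0.01, steps=1000)
    print(angular_momentum(q0, p0), angular_momentum(qT, pT))
```

Replacing `grad_V` by the gradient of a potential that is not rotation invariant, for instance one depending on *q*<sup>1</sup> alone, makes the monitored quantity drift, in line with the hypothesis of the theorem.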

#### Notes

The traditional symplectic approach to classical mechanics, culminating in the momentum map, is exhaustively covered in Guillemin & Sternberg (1984) and Abraham & Marsden (1985). A founding paper for Poisson geometry is Weinstein (1983). The modern Poisson approach to mechanics may be found in Marsden & Ratiu (1994), from which most of the material in this chapter originates.

Our proof of Proposition 3.11 is based on Navarro González & Sancho de Salas (2003), §2.1. Burtscher (2009) is a nice survey of many similar results.

## Chapter 4 Quantum physics on a general Hilbert space

In this chapter we generalize the results of Chapter 2 to infinite-dimensional Hilbert spaces. So let *H* be a Hilbert space and let *B*(*H*) be the set of all *bounded* operators on *H*. Here a notable point is that linear operators on *finite-dimensional* Hilbert spaces are automatically bounded, whereas in general they are not. Thus we impose boundedness as an extra requirement, beyond linearity. This is very convenient, because as in the finite-dimensional case, *B*(*H*) is a C\*-algebra, cf. §C.1. At the same time, assuming boundedness involves no loss of generality whatsoever, since we can always replace closed unbounded operators by bounded ones through the *bounded transform*, as explained in §B.21. Nonetheless, even the relatively easy setting of bounded operators leads to some technical complications we have to deal with. First, Definition 2.1 must be adjusted as follows:

#### Definition 4.1. *Let H be a Hilbert space.*


As shown in Corollary B.88, if *H* is finite-dimensional this notion of a spectrum reduces to the set of eigenvalues of *a*. Even if *H* is infinite-dimensional, the spectrum of a self-adjoint operator *a* is real (i.e., σ(*a*) ⊂ R); this is also true if *a* is unbounded (see Theorem B.93). For any *H*, unit vectors ψ still define special density matrices *e*<sub>ψ</sub>, as in (2.7); we will later see that these are pure states on *B*(*H*), although the set of pure states is no longer exhausted by such density matrices. Finally, quantum events *in H* still bijectively correspond with projections *on H*; see Proposition B.76. The Born rule as well as the correspondence between density matrices and states require a separate discussion, to which we now turn.

#### 4.1 The Born rule from Bohrification (II)

In this section we extend the characterization of the Born rule in §2.5, which was restricted to finite phase spaces *X* and finite-dimensional Hilbert spaces *H*, to the general case. Recall that a *probability space* is a measure space (*X*,Σ,μ) for which μ(*X*) = 1, and that, for compact *X*, a state on *C*(*X*) is a linear map ϕ : *C*(*X*) → C that is positive and satisfies ϕ(1<sub>*X*</sub>) = 1. Theorem B.15 and Corollary B.17 yield:

Theorem 4.2. *Let X be a compact Hausdorff space. There is a bijective correspondence between probability measures* μ *on X and states* ω *on C*(*X*)*, given by*

$$\omega(f) = \int\_X d\mu \, f, \quad f \in C(X). \tag{4.1}$$

More precisely, the correspondence in question is between complete regular probability spaces (*X*,Σ,μ) and states on *C*(*X*), and this is understood in what follows.

Second, we recall that if *H* is a Hilbert space and *a* ∈ *B*(*H*), then *C*<sup>∗</sup>(*a*) is the C\*-algebra generated by *a* and 1<sub>*H*</sub> (i.e., the norm-closure of the algebra of all polynomials in *a*). Theorems B.84, B.94, and B.93 give the following spectral theorem:

Theorem 4.3. *If a*<sup>∗</sup> = *a* ∈ *B*(*H*)*, then C*∗(*a*) *is commutative,* σ(*a*) ⊂ R *is compact, and there is an isomorphism of (commutative) C\*-algebras*

$$\mathcal{C}(\sigma(a)) \cong \mathcal{C}^\*(a),\tag{4.2}$$

*written f* → *f*(*a*)*, which is unique if it is subject to the following conditions:*


*Furthermore, this* continuous functional calculus *satisfies the rules*

$$(tf+g)(a) = tf(a) + g(a);\tag{4.3}$$

$$(fg)(a) = f(a)g(a);\tag{4.4}$$

$$f(a)^{\*} = f^{\*}(a). \tag{4.5}$$

Combining Theorems 4.2 and 4.3 gives a result of great importance:

Corollary 4.4. *Let H be a Hilbert space, let a*<sup>∗</sup> = *a* ∈ *B*(*H*)*, and let* ψ ∈ *H be a unit vector. There exists a unique probability measure* μψ *on the spectrum* σ(*a*) *such that*

$$
\langle \Psi, f(a)\Psi \rangle = \int\_{\sigma(a)} d\mu\_{\Psi} f, \ f \in C(\sigma(a)). \tag{4.6}
$$

*In terms of the spectral projections e*<sub>Δ</sub> = 1<sub>Δ</sub>(*a*) *(defined for Borel sets* Δ ⊆ σ(*a*)*) constructed in* (B.305) *-* (B.307) *and Theorem B.102, the Born measure is given by*

$$\mu\_{\Psi}(\Delta) = \left\| e\_{\Delta} \Psi \right\|^{2}. \tag{4.7}$$

*More generally, a density operator* ρ ∈ D(*H*) *induces a unique probability measure* μ<sub>ρ</sub> *on* σ(*a*) *for which*

$$
\operatorname{Tr}\,(\rho f(a)) = \int\_{\sigma(a)} d\mu\_{\rho}\, f, \ f \in C(\sigma(a)); \tag{4.8}
$$

$$
\mu\_{\rho}(\Delta) = \text{Tr}\,(\rho e\_{\Delta}).\tag{4.9}
$$

*This measure on* σ(*a*) *is called the* Born measure *(defined by a and* ψ *or* ρ*).*

*Proof.* The point is that the map *f* ↦ ⟨ψ, *f*(*a*)ψ⟩ defines a state on *C*(σ(*a*)):


To prove (4.7), use Lemma B.97 to approximate 1<sub>Δ</sub> by functions *f<sub>n</sub>* ∈ *C*(σ(*a*)) as stated. By Theorem B.13.2 (i.e., the Lebesgue Monotone Convergence Theorem), we have ∫<sub>σ(*a*)</sub> *d*μ<sub>ψ</sub> *f<sub>n</sub>* → ∫<sub>σ(*a*)</sub> *d*μ<sub>ψ</sub> 1<sub>Δ</sub> = μ<sub>ψ</sub>(Δ), whereas by (B.315) with *a<sub>n</sub>* = *f<sub>n</sub>*(*a*), one has ⟨ψ, *f<sub>n</sub>*(*a*)ψ⟩ → ⟨ψ, *e*<sub>Δ</sub>ψ⟩ = ‖*e*<sub>Δ</sub>ψ‖<sup>2</sup>. Hence (4.7) follows from (4.6).

The proof for density operators is analogous. □

Defining the mean value ⟨*a*⟩<sub>ψ</sub> of *a* with respect to the Born measure μ<sub>ψ</sub> by

$$
\langle a \rangle\_{\Psi} = \int\_{\sigma(a)} d\mu\_{\Psi}(x) \, x, \tag{4.10}
$$

and similarly for ρ, using Theorem 4.3.2 we easily obtain

$$
\langle a \rangle\_{\Psi} = \langle \Psi, a\Psi \rangle;\tag{4.11}
$$

$$
\langle a \rangle\_{\rho} = \text{Tr}(\rho a). \tag{4.12}
$$

As an important special case, suppose that σ(*a*) = σ*p*(*a*) (i.e., each λ ∈ σ(*a*) is an eigenvalue); this always happens if *H* is finite-dimensional. Eq. (A.57) then gives

$$\langle \Psi, f(a)\Psi \rangle = \sum\_{\lambda \in \sigma(a)} f(\lambda) \cdot \left\| e\_{\lambda}\Psi \right\|^{2},$$

where *e*<sub>λ</sub> is the projection onto the eigenspace *H*<sub>λ</sub> = {ψ ∈ *H* | *a*ψ = λψ}. Thus

$$\mu\_{\Psi}(\lambda) = \left\| e\_{\lambda} \Psi \right\|^{2},\tag{4.13}$$

and using the notation *P*ψ(*a* = λ) for μψ(λ), eq. (4.11) just becomes

$$\langle a \rangle\_{\Psi} = \sum\_{\lambda \in \sigma(a)} \lambda \cdot P\_{\Psi}(a = \lambda). \tag{4.14}$$
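The finite-dimensional formulas (4.13) and (4.14) are easy to evaluate by hand or by machine. The following Python sketch does so for an illustrative real symmetric 2×2 matrix with eigenvalues 1 and 3 and a real unit vector; both are assumptions chosen for the example. The spectral projections are obtained from the elementary identity *e*<sub>λ</sub> = (*a* − λ′)/(λ − λ′), which is valid for a 2×2 matrix with two distinct eigenvalues λ and λ′.

```python
A = [[2.0, 1.0], [1.0, 2.0]]     # illustrative observable; eigenvalues 1 and 3
EIGENVALUES = (1.0, 3.0)

def matvec(m, v):
    """Matrix-vector product for small dense matrices."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def inner(u, v):
    """Inner product of real vectors (no conjugation needed here)."""
    return sum(x * y for x, y in zip(u, v))

def spectral_projection(lam):
    """e_lam = (A - other * 1) / (lam - other) for the 2x2 case."""
    other = EIGENVALUES[1] if lam == EIGENVALUES[0] else EIGENVALUES[0]
    return [[(A[i][j] - (other if i == j else 0.0)) / (lam - other)
             for j in range(2)] for i in range(2)]

def born_measure(psi):
    """mu_psi(lam) = || e_lam psi ||^2, cf. (4.13)."""
    mu = {}
    for lam in EIGENVALUES:
        v = matvec(spectral_projection(lam), psi)
        mu[lam] = inner(v, v)
    return mu

def mean_value(mu):
    """Right-hand side of (4.14): sum over lam of lam * P(a = lam)."""
    return sum(lam * weight for lam, weight in mu.items())

def expectation(psi):
    """<psi, A psi>, cf. (4.11)."""
    return inner(psi, matvec(A, psi))

if __name__ == "__main__":
    psi = [0.6, 0.8]
    mu = born_measure(psi)
    print(mu, mean_value(mu), expectation(psi))
```

For ψ = (0.6, 0.8) the projections give *e*<sub>1</sub>ψ = (−0.1, 0.1) and *e*<sub>3</sub>ψ = (0.7, 0.7), so the Born probabilities are 0.02 and 0.98 and both sides of (4.14) equal 2.96, up to rounding.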

It is customary to extend the Born measure on σ(*a*) ⊂ R to a (probability) measure μ′<sub>ψ</sub> on all of R by simply stipulating that

$$
\mu\_{\Psi}'(\Delta) = \mu\_{\Psi}(\Delta \cap \sigma(a));
\tag{4.15}
$$

we will often assume this and omit the prime. This obviously implies that μψ(Δ) = 0 for any Borel set Δ ⊂ R disjoint from σ(*a*); in particular, if σ(*a*) is discrete, then μψ is concentrated on the eigenvalues λ of *a*, in that

$$\mu\_{\Psi}(\Delta) = \sum\_{\lambda \in \Delta \cap \sigma(a)} \mu\_{\Psi}(\lambda). \tag{4.16}$$

To state an interesting property of the Born measure we need Hausdorff's solution to the relevant special case of the famous *Hamburger Moment Problem*:

Theorem 4.5. *If K* ⊂ R *is compact, then any finite measure* μ *on K is determined by its* moments

$$\alpha\_n = \int\_K d\mu(x)\, x^n. \tag{4.17}$$

Using *f*(*x*) = *x<sup>n</sup>* in (4.6), we therefore obtain:

Corollary 4.6. *The Born measure* μψ *is determined by its moments*

$$
\alpha\_n = \langle \Psi, a^n \Psi \rangle. \tag{4.18}
$$

More precisely, we need to be sure that numbers (α*n*) of the kind (4.18) are the moments of some (probability) measure. This follows from the spectral theorem by running the above argument backwards, but one may also use the general solution of the Hamburger Moment Problem, which we here state without proof:

Theorem 4.7. *A sequence of real numbers* (α<sub>*n*</sub>) *forms the moments of some measure* μ *on* R *iff for all N* ∈ N *and* (β<sub>1</sub>,...,β<sub>*N*</sub>) ∈ C<sup>*N*</sup> *one has* ∑<sup>*N*</sup><sub>*n*,*m*=0</sub> β̄<sub>*n*</sub>β<sub>*m*</sub>α<sub>*n*+*m*</sub> ≥ 0*. Furthermore, if there are constants C and D such that* |α<sub>*n*</sub>| ≤ *CD<sup>n</sup>n*!*, then* μ *is uniquely determined by its moments* (α<sub>*n*</sub>)*.*

These conditions are easily checked from (4.18).
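A small numerical check may make this concrete. The Python sketch below computes the moments (4.18) for an illustrative 2×2 symmetric matrix with ‖*a*‖ = 3 and a real unit vector (both are assumptions chosen for the example), evaluates the quadratic form of Theorem 4.7 for one arbitrary complex vector β, and checks the growth bound with *C* = 1 and *D* = ‖*a*‖. The underlying reasons are that the quadratic form equals ‖∑<sub>*n*</sub> β<sub>*n*</sub>*a<sup>n</sup>*ψ‖<sup>2</sup> and that |α<sub>*n*</sub>| ≤ ‖*a*‖<sup>*n*</sup>.

```python
A = [[2.0, 1.0], [1.0, 2.0]]     # illustrative observable with norm 3
PSI = [0.6, 0.8]                 # illustrative real unit vector

def matvec(m, v):
    """Matrix-vector product for small dense matrices."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def moments(a, psi, count):
    """alpha_n = <psi, a^n psi> for n = 0, ..., count - 1, cf. (4.18).
    The matrix and the vector are assumed to be real."""
    alphas, v = [], list(psi)
    for _ in range(count):
        alphas.append(sum(x * y for x, y in zip(psi, v)))
        v = matvec(a, v)         # v now holds the next power a^n psi
    return alphas

def hankel_form(alphas, betas):
    """sum over n, m of conj(beta_n) * beta_m * alpha_{n+m}."""
    size = len(betas)
    return sum(betas[n].conjugate() * betas[m] * alphas[n + m]
               for n in range(size) for m in range(size))

if __name__ == "__main__":
    alphas = moments(A, PSI, 5)
    betas = [1 + 2j, -0.5j, 0.25 + 0j]
    print(alphas)
    print(hankel_form(alphas, betas))
```

With these choices the Born measure assigns weights 0.02 and 0.98 to the eigenvalues 1 and 3, so the moments are α<sub>*n*</sub> = 0.02 + 0.98 · 3<sup>*n*</sup>, and the quadratic form is real and non-negative up to rounding.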

If *a* is unbounded, but still assumed to be self-adjoint (in the sense appropriate for unbounded operators, cf. Definition B.70), the spectrum σ(*a*) remains real (see Theorem B.93) but it is no longer compact. Nonetheless, the Born measure on σ(*a*) may be constructed in almost exactly the same way as in the bounded case, this time invoking Corollary B.21 and Theorem B.158 instead of Theorems 4.2 and B.94, respectively. Corollary 4.4 then holds almost *verbatim* for the unbounded case:

Corollary 4.8. *Let H be a Hilbert space, let a*<sup>∗</sup> = *a, and let* ψ ∈ *H be a unit vector. There exists a unique probability measure* μψ *on the spectrum* σ(*a*) *such that*

$$
\langle \Psi, f(a)\Psi \rangle = \int\_{\sigma(a)} d\mu\_{\Psi} f, \ f \in \mathcal{C}\_0(\sigma(a)). \tag{4.19}
$$

*Also, eqs.* (4.7) *and* (4.9) *hold, as does* (4.8)*, with f* ∈ *C*0(σ(*a*))*.*

There is no need to worry about domains, since even if *a* is unbounded, *f*(*a*) is bounded for *f* ∈ *Cb*(σ(*a*)), and hence also for *f* ∈ *C*0(σ(*a*)).

The physical relevance of the Born measure is given by the *Born rule*:

*If an observable a is measured in a state* ρ*, then the probability P*<sub>ρ</sub>(*a* ∈ Δ) *that the outcome lies in* Δ ⊂ R *is given by the Born measure* μ<sub>ρ</sub> *defined by a and* ρ*, i.e.,*

$$P\_{\rho}(a \in \Delta) = \mu\_{\rho}(\Delta). \tag{4.20}$$

As in the finite-dimensional case, the Born measure may be generalized to families (*a*<sub>1</sub>,...,*a<sub>n</sub>*) of commuting self-adjoint operators. Assuming these are bounded, the C\*-algebra *C*<sup>∗</sup>(*a*<sub>1</sub>,...,*a<sub>n</sub>*) is defined in the obvious way, i.e., as the smallest C\*-algebra containing each *a<sub>i</sub>*, or, equivalently, as the norm-closure of the algebra of all finite polynomials in the (*a*<sub>1</sub>,...,*a<sub>n</sub>*). This C\*-algebra is commutative, as a simple approximation argument shows: polynomials in the *a<sub>i</sub>* obviously commute, and this property extends to the closure by continuity of multiplication. However, even in the bounded case, the correct notion of a joint spectrum is not obvious. In order to motivate the following definition, it helps to recall Definition 1.4, Theorem C.24, and especially the last sentence before the proof of the latter, making the point that the spectrum σ(*a*) of a single (bounded) self-adjoint operator coincides with the image of the Gelfand spectrum Σ(*C*<sup>∗</sup>(*a*)) in C under the map ω ↦ ω(*a*).

Definition 4.9. *1. The* joint spectrum σ(*a*) = σ(*a*<sub>1</sub>,...,*a<sub>n</sub>*) ⊂ R<sup>*n*</sup> *of a finite family a* = (*a*<sub>1</sub>,...,*a<sub>n</sub>*) *of commuting bounded self-adjoint operators is the image of the Gelfand spectrum* Σ(*C*<sup>∗</sup>(*a*<sub>1</sub>,...,*a<sub>n</sub>*)) = Σ(*C*<sup>∗</sup>(*a*)) *under the map*

$$\Sigma(C^\*(a\_1, \dots, a\_n)) \to \mathbb{R}^n, \quad \omega \mapsto (\omega(a\_1), \dots, \omega(a\_n)).\tag{4.21}$$

*Since* ω(*a<sub>i</sub>*) *only utilizes the restriction of* ω *to C*<sup>∗</sup>(*a<sub>i</sub>*) ⊂ *C*<sup>∗</sup>(*a*)*, we have* ω(*a<sub>i</sub>*) ∈ σ(*a<sub>i</sub>*) ⊂ R*, so that* Σ(*C*<sup>∗</sup>(*a*)) ⊆ σ(*a*<sub>1</sub>)×···×σ(*a<sub>n</sub>*) *is a compact subset of* R<sup>*n*</sup>*.*

To justify this definition, we note that:


$$\lim\_{k \to \infty} \left\|(a\_i - \lambda\_i)\Psi\_k\right\| = 0,\tag{4.22}$$

for each *i* = 1,...,*n*. The proof is similar.

One way to see the second claim is to use Proposition C.14 joined with the observation that, as in the case of *A* = *B*(*H*) for finite-dimensional *H*, any pure state on a finite-dimensional C\*-algebra *A* ⊂ *B*(*H*) is a vector state (2.42), too. To see this, we first specialize Theorem C.133 to the finite-dimensional case (where the proof becomes elementary), so that each state on *C*<sup>∗</sup>(*a*) takes the form (2.33). Subsequently, we use the spectral decomposition (2.6), and use the definition of purity: suppose ω(*b*) = Tr(ρ*b*) = ∑<sub>*i*</sub> *p<sub>i</sub>*⟨υ<sub>*i*</sub>,*b*υ<sub>*i*</sub>⟩ ≡ ∑<sub>*i*</sub> *p<sub>i</sub>*ω<sub>υ<sub>*i*</sub></sub>(*b*) is pure, where *b* ∈ *C*<sup>∗</sup>(*a*). Then ω<sub>υ<sub>*i*</sub></sub> = ω for each *i*, so that ω is a vector state, say ω(*b*) = ⟨ψ,*b*ψ⟩ where ψ is one of the υ<sub>*i*</sub>. Once we know this, suppose λ = (λ<sub>1</sub>,...,λ<sub>*n*</sub>) ∈ σ(*a*), with λ<sub>*i*</sub> = ω(*a<sub>i</sub>*). Multiplicativity of ω implies that for any finite polynomial *p* in *n* real variables we have ⟨ψ, *p*(*a*)ψ⟩ = *p*(λ), which easily gives *a<sub>i</sub>*ψ = λ<sub>*i*</sub>ψ for each *i*; for example, take *p*(*x*) = (*x<sub>i</sub>* − λ<sub>*i*</sub>)<sup>2</sup>, so that the previous equation gives ‖(*a<sub>i</sub>* − λ<sub>*i*</sub>)ψ‖<sup>2</sup> = 0.

Conversely, if λ is a joint eigenvalue of *a*, then by definition there exists a joint eigenvector ψ whose vector state ω(*b*) = ⟨ψ,*b*ψ⟩ on *C*<sup>∗</sup>(*a*) is multiplicative.

Using this (perhaps contrived) notion of a joint spectrum, Theorem 2.19 now holds by construction also if dim(*H*) = ∞, where the pertinent isomorphism *f* → *f*(*a*) is given as in the single operator case, that is, by starting with polynomials and using a continuity argument to pass to arbitrary continuous functions.

Theorem 2.18 and Corollary 4.4 then generalize to:

Theorem 4.10. *Let H be a Hilbert space, let a* = (*a*1,...,*an*) *be a finite family of commuting bounded self-adjoint operators, and let* ψ ∈ *H be a unit vector. There exists a unique probability measure* μψ *on the joint spectrum* σ(*a*) *such that*

$$\langle \Psi, f(\underline{a})\Psi \rangle = \int\_{\sigma(\underline{a})} d\mu\_{\Psi} f, \; f \in \mathcal{C}(\sigma(\underline{a})), \tag{4.23}$$

*or, equivalently, for special Borel sets* Δ = Δ<sub>1</sub> ×···×Δ<sub>*n*</sub> ⊆ σ(*a*)*, where* Δ<sub>*i*</sub> ⊂ σ(*a<sub>i</sub>*)*,*

$$\mu\_{\Psi}(\underline{\Delta}) = \left\| e\_{\Delta\_{1}} \cdots e\_{\Delta\_{n}} \Psi \right\|^{2},\tag{4.24}$$

*where the e*<sub>Δ<sub>*i*</sub></sub> = 1<sub>Δ<sub>*i*</sub></sub>(*a<sub>i</sub>*) *are the pertinent spectral projections (which commute).*

Similarly for density operators instead of pure states.
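In finite dimension the joint spectrum and the joint Born measure (4.24) can be read off directly once the operators are simultaneously diagonalized. The Python sketch below uses two commuting observables given by their diagonals in a common eigenbasis; the specific numbers and the vector ψ are illustrative assumptions. It shows in particular that the joint spectrum may be a proper subset of σ(*a*<sub>1</sub>) × σ(*a*<sub>2</sub>).

```python
from itertools import product

# Diagonals of two commuting observables in a common eigenbasis
# (illustrative values, not taken from the text).
a1 = [0.0, 0.0, 1.0]
a2 = [5.0, 7.0, 7.0]

def joint_spectrum(d1, d2):
    """Joint eigenvalues (lambda_1, lambda_2) of the diagonal pair."""
    return sorted(set(zip(d1, d2)))

def product_of_spectra(d1, d2):
    """sigma(a_1) x sigma(a_2), which contains the joint spectrum."""
    return sorted(product(set(d1), set(d2)))

def joint_born_measure(psi, d1, d2, delta1, delta2):
    """mu_psi(delta1 x delta2) = || e_delta1 e_delta2 psi ||^2, cf. (4.24).
    For diagonal operators the product of spectral projections keeps exactly
    those components whose joint eigenvalue lies in delta1 x delta2."""
    return sum(abs(c) ** 2 for c, x, y in zip(psi, d1, d2)
               if x in delta1 and y in delta2)

if __name__ == "__main__":
    psi = [0.6, 0.0, 0.8]
    print(joint_spectrum(a1, a2))
    print(product_of_spectra(a1, a2))
    print(joint_born_measure(psi, a1, a2, {0.0}, {5.0}))
```

Here the pair (1, 5) lies in the product of the individual spectra but is not a joint eigenvalue, and accordingly every unit vector assigns it probability zero.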

If (some of) the operators *a<sub>i</sub>* are unbounded, we use the trick of §B.21 and pass to their bounded transforms *b<sub>i</sub>*, see Theorem B.152. We say that the *a<sub>i</sub> commute* iff the corresponding bounded operators *b<sub>i</sub>* do; this is equivalent to commutativity of all spectral projections of the *a<sub>i</sub>*. We then define, in self-explanatory notation,

$$\sigma(\underline{a}) = \{ \underline{\lambda} (1 - \underline{\lambda}^2)^{-1/2} \mid \underline{\lambda} \in \sigma(\underline{b}) \cap (-1, 1)^n \}. \tag{4.25}$$

This leads to Born measures on σ(*a*) defined either as in (4.23), with *f* ∈ *C*(σ(*a*)) replaced by *f* ∈ *C*0(σ(*a*)), cf. (4.19), or as in (4.24).

For example, if *H* = *L*<sup>2</sup>(R<sup>*n*</sup>) and *a<sub>i</sub>*ψ(*x*) = *x<sub>i</sub>*ψ(*x*), defined on the domain

$$D(a\_i) = \{ \Psi \in L^2(\mathbb{R}^n) \mid \int\_{\mathbb{R}^n} d^n x \, x\_i^2 |\Psi(x)|^2 < \infty \},\tag{4.26}$$

as in (B.242), then *b<sub>i</sub>*ψ(*x*) = *x<sub>i</sub>*(1+*x*<sub>*i*</sub><sup>2</sup>)<sup>−1/2</sup>ψ(*x*), so that σ(*b*) = [−1,1]<sup>*n*</sup> and hence σ(*a*) = R<sup>*n*</sup>. For a measurable region Δ ⊂ R<sup>*n*</sup> we then have Pauli's famous formula

$$
\mu\_{\Psi}(\underline{\Delta}) = \int\_{\underline{\Delta}} d^n x \, |\Psi(x)|^2 \tag{4.27}
$$

for the probability of finding the particle in the region Δ, given that the system is in a pure state ψ.
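As a simple instance of (4.27), take *n* = 1 and a normalized Gaussian wave function, for which the probability of the interval [−1, 1] is erf(1). The Python sketch below evaluates the integral with the composite Simpson rule and compares it with this closed form; the choice of ψ, the interval, and the quadrature rule are illustrative assumptions.

```python
import math

def psi(x):
    """Normalized Gaussian wave function on R (illustrative choice)."""
    return math.pi ** -0.25 * math.exp(-0.5 * x * x)

def born_probability(lo, hi, steps=2000):
    """Approximate mu_psi([lo, hi]), the integral of |psi(x)|^2 over [lo, hi],
    cf. (4.27), using the composite Simpson rule."""
    if steps % 2:
        steps += 1                      # Simpson needs an even number of steps
    h = (hi - lo) / steps
    total = psi(lo) ** 2 + psi(hi) ** 2
    for k in range(1, steps):
        weight = 4 if k % 2 else 2
        total += weight * psi(lo + k * h) ** 2
    return total * h / 3.0

if __name__ == "__main__":
    print(born_probability(-1.0, 1.0), math.erf(1.0))
    print(born_probability(-10.0, 10.0))   # close to the total mass 1
```

Since |ψ(*x*)|<sup>2</sup> = π<sup>−1/2</sup> exp(−*x*<sup>2</sup>), the first two printed numbers agree to high accuracy, and the last is close to 1, as it must be for a probability measure.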

#### 4.2 Density operators and normal states

Definition 2.4 of a state still makes good sense in the infinite-dimensional case, as it simply specializes the general definition of a state on a C\*-algebra *A* to the case *A* = *B*(*H*). Thus we continue to say that a state on *B*(*H*) is a complex-linear map ω : *B*(*H*) → C satisfying ω(*b*∗*b*) ≥ 0 for each *b* ∈ *B*(*H*) and ω(1*H*) = 1. Despite this lack of novelty in the definition of a state (i.e., compared to finite-dimensional Hilbert spaces), Theorem 2.7 no longer holds if *H* is infinite-dimensional: although it (almost trivially) remains true that density operators ρ on *H* define states on *B*(*H*) through the fundamental correspondence ω(*a*) = Tr(ρ*a*), *a* ∈ *B*(*H*), cf. (2.33), there are (many) states that are *not* given in that way (see below). Fortunately, states that *do* arise through (2.33) can be characterized in a simple way.

Definition 4.11. *A state* ω : *B*(*H*) → C *is called* normal *if for each orthogonal family* (*e<sub>i</sub>*) *of projections (i.e., e*<sub>*i*</sub><sup>∗</sup> = *e<sub>i</sub> and e<sub>i</sub>e<sub>j</sub>* = δ<sub>*ij*</sub>*e<sub>i</sub>) one has*

$$\omega\left(\sum\_{i} e\_{i}\right) = \sum\_{i} \omega(e\_{i}).\tag{4.28}$$

*Here* ∑<sub>*i*</sub> *e<sub>i</sub> is defined as the projection on the smallest closed subspace K of H that contains each e<sub>i</sub>H (that is,* ∑<sub>*i*</sub> *e<sub>i</sub>* = ∨<sub>*i*</sub>*e<sub>i</sub>, i.e., the supremum in the poset* P(*H*) *of all projections on H with respect to the partial order e* ≤ *f iff eH* ⊆ *f H). Furthermore, the sum over i on the right-hand side is defined by* (B.11)*, i.e., as the supremum (in* R*) of the set of all sums* ∑<sub>*i*∈*F*</sub> ω(*e<sub>i</sub>*) *over finite subsets F* ⊂ *I of the index set I in which i takes values. It is finite because* ∑<sub>*i*∈*F*</sub> *e<sub>i</sub>* ≤ 1<sub>*H*</sub> *and hence, since* ω *is positive,*

$$\sum\_{i \in F} \omega(e\_i) \le \omega(1\_H) = 1.$$

For example, let (υ<sub>*i*</sub>) be a basis of *H* with associated one-dimensional projections

$$e\_i = |\upsilon\_i\rangle\langle\upsilon\_i|.\tag{4.29}$$

If ω is assumed to be a state, then the additivity condition (4.28) implies

$$\sum\_{i} \omega(e\_i) = 1,\tag{4.30}$$

or, equivalently, using Definition B.6 etc. as well as the notation *e<sub>F</sub>* ≡ ∑<sub>*i*∈*F*</sub> *e<sub>i</sub>*,

$$\lim\_{F} \omega(e\_F) = 1.\tag{4.31}$$

If *H* is separable, any orthogonal family (*ei*) of projections is necessarily countable, and (4.28) is analogous to the countable additivity condition defining a measure.

Theorem 4.12. *A state* ω *on B*(*H*) *takes the form* ω(*a*) = Tr(ρ*a*) *for some (unique) density operator* ρ ∈ D(*H*) *iff it is normal.*

*Proof.* First, eq. (2.33) implies (4.28). To see this, take the trace with respect to some basis (υ<sub>*j*</sub>) of *H* that is *adapted* to the family (*e<sub>i</sub>*) in the sense that for each *j*, either *e<sub>i</sub>*υ<sub>*j*</sub> = υ<sub>*j*</sub> (i.e., υ<sub>*j*</sub> ∈ *e<sub>i</sub>H*) for one value of *i*, or *e<sub>i</sub>*υ<sub>*j*</sub> = 0 for all *i*. Then

$$\omega\left(\sum\_{i} e\_i\right) = \text{Tr}\left(\rho \sum\_{i} e\_i\right) = \sum\_{j} \langle \upsilon\_j, \rho \sum\_{i} e\_i \upsilon\_j \rangle = \sum\_{j}' \langle \upsilon\_j, \rho \upsilon\_j \rangle,$$

where the sum ∑′<sub>*j*</sub> is over those *j* for which υ<sub>*j*</sub> ∈ *K* ≡ ∨<sub>*i*</sub>*e<sub>i</sub>H*. On the other hand, since the basis is adapted, we have υ<sub>*j*</sub> ∈ *K* iff there is an *i* for which *e<sub>i</sub>*υ<sub>*j*</sub> = υ<sub>*j*</sub> (since otherwise *e<sub>i</sub>*υ<sub>*j*</sub> = 0 and hence υ<sub>*j*</sub> ⊥ *e<sub>i</sub>H* for each *i*, so that υ<sub>*j*</sub> ∈ *K*<sup>⊥</sup>), so

$$\sum\_{i} \omega(e\_i) = \sum\_{i} \text{Tr} \left( \rho e\_i \right) = \sum\_{i} \sum\_{j} \langle \upsilon\_j, \rho e\_i \upsilon\_j \rangle = \sup\_{F \subset I} \sum\_{j \in J\_F} \langle \upsilon\_j, \rho \upsilon\_j \rangle = \sum\_{j}^{\prime} \langle \upsilon\_j, \rho \upsilon\_j \rangle,$$

where *J<sub>F</sub>* consists of those *j* for which υ<sub>*j*</sub> ∈ ∑<sub>*i*∈*F*</sub> *e<sub>i</sub>H*. This gives (4.28).

Conversely, assume ω is normal. For the *ei* in (4.28) we now take the projections (4.29) determined by some basis (υ*i*). For each *a* ∈ *B*(*H*) we then have

$$\omega(a) = \lim\_{F} \omega(e\_F a). \tag{4.32}$$

Indeed, using Cauchy–Schwarz for the positive semi-definite form (*a*,*b*) = ω(*a*<sup>∗</sup>*b*), as in (C.197), and using ∑<sub>*i*</sub> *e<sub>i</sub>* = 1<sub>*H*</sub> and hence ω(*a*) = ω(∑<sub>*i*</sub> *e<sub>i</sub>a*) we have

$$|\omega(a) - \omega(e\_F a)|^2 = |\omega(e\_{F^c} a)|^2 \le \omega(a^\* a)\omega(e\_{F^c}) \le \|a\|^2 \omega(e\_{F^c}),\tag{4.33}$$

since *e*<sub>*F<sup>c</sup>*</sub> ≡ ∑<sub>*i*∉*F*</sub> *e<sub>i</sub>* is a projection. Since ω(*e<sub>F</sub>*) + ω(*e*<sub>*F<sup>c</sup>*</sub>) = ω(1<sub>*H*</sub>) = 1, eq. (4.31) gives lim<sub>*F*</sub> ω(*e*<sub>*F<sup>c</sup>*</sub>) = 0, so that (4.33) gives (4.32). For each finite *F* ⊂ *I*, the operator *e<sub>F</sub>a* has finite rank and hence is compact. According to Theorem B.146, the restriction of ω : *B*(*H*) → C to the C\*-algebra *B*<sub>0</sub>(*H*) of compact operators on *H* is induced by a trace-class operator ρ, which (from the requirement that ω be a state) must be a density operator. Hence ω(*e<sub>F</sub>a*) = Tr(ρ*e<sub>F</sub>a*), and we finally have

$$\omega(a) = \lim\_{F} \omega(e\_F a) = \lim\_{F} \mathrm{Tr}(\rho e\_F a) = \mathrm{Tr}(\rho a). \tag{4.34}$$

To derive the final equality, we rewrite Tr(ρ*e<sub>F</sub>a*) = Tr(*e<sub>F</sub>a*ρ), cf. (A.78) and Proposition B.144, note that *a*ρ ∈ *B*<sub>1</sub>(*H*), as shown in Corollary B.147, and observe that for any *b* ∈ *B*<sub>1</sub>(*H*) we have lim<sub>*F*</sub> Tr(*e<sub>F</sub>b*) = Tr(*b*). To see this, simply compute the trace in the basis (υ<sub>*i*</sub>) defining the projections *e<sub>i</sub>* through (4.29), so that Tr(*e<sub>F</sub>b*) = ∑<sub>*i*∈*F*</sub> ⟨υ<sub>*i*</sub>,*b*υ<sub>*i*</sub>⟩, and note that by Definition B.6,

$$\lim\_{F} \sum\_{i \in F} \langle \upsilon\_i, b\upsilon\_i \rangle = \sum\_i \langle \upsilon\_i, b\upsilon\_i \rangle = \text{Tr} \,(b).$$

Finally, suppose ω(*a*) = Tr(ρ<sub>1</sub>*a*) = Tr(ρ<sub>2</sub>*a*) for each *a* ∈ *B*(*H*) and hence for each *a* ∈ *B*<sub>0</sub>(*H*). It follows from (B.476) that Tr(ρ*a*) = 0 for all *a* ∈ *B*<sub>0</sub>(*H*) iff ρ = 0. Hence ρ<sub>1</sub> = ρ<sub>2</sub>, i.e., a normal state ω uniquely determines a density operator ρ. □

If ω is normal, we may therefore use the spectral resolution (2.6) of the corresponding density operator ρ, i.e., ρ = ∑<sub>*i*</sub> *p<sub>i</sub>*|υ<sub>*i*</sub>⟩⟨υ<sub>*i*</sub>|, where (υ<sub>*i*</sub>) is some basis of *H* consisting of eigenvectors of ρ (which exists because ρ is compact and self-adjoint), and the corresponding eigenvalues satisfy *p<sub>i</sub>* ≥ 0 and ∑<sub>*i*</sub> *p<sub>i</sub>* = 1; see the explanation after Definition B.148. Computing the trace in the same basis gives

$$\operatorname{Tr}\left(\rho a\right) = \sum\_{i} p\_{i} \langle \upsilon\_{i}, a\upsilon\_{i} \rangle. \tag{4.35}$$
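The formula (4.35) and the tail estimate (4.33) used in the proof of Theorem 4.12 can be illustrated in a truncated model. The Python sketch below assumes a density operator with geometric eigenvalues *p<sub>i</sub>* ∝ 2<sup>−(*i*+1)</sup> and an observable that is diagonal in the same basis; the truncation dimension, the weights, and the diagonal entries are all illustrative assumptions, and the finite truncation only mimics the infinite-dimensional situation.

```python
import math

N = 60  # truncation dimension (assumption: finite model of a separable H)

# Eigenvalues p_i of a diagonal density operator, normalized to sum to 1.
_weights = [2.0 ** -(i + 1) for i in range(N)]
_total = sum(_weights)
p = [w / _total for w in _weights]

# Bounded observable, diagonal in the eigenbasis of rho (illustrative).
a_diag = [math.cos(i) for i in range(N)]
a_norm = max(abs(x) for x in a_diag)   # operator norm of a diagonal matrix

def omega(diag):
    """omega(b) = Tr(rho b) = sum_i p_i <v_i, b v_i>, cf. (4.35)."""
    return sum(pi * bi for pi, bi in zip(p, diag))

def omega_truncated(diag, F):
    """omega(e_F b), with e_F the projection onto span{v_i : i in F}."""
    return sum(p[i] * diag[i] for i in F)

def tail(F):
    """omega(e_{F^c}), the total weight outside the finite set F."""
    return sum(p[i] for i in range(N) if i not in F)

if __name__ == "__main__":
    for k in (1, 5, 20):
        F = set(range(k))
        gap = abs(omega(a_diag) - omega_truncated(a_diag, F))
        print(k, gap ** 2, a_norm ** 2 * tail(F))   # left side vs bound (4.33)
```

As *F* grows, the tail weight ω(*e*<sub>*F<sup>c</sup>*</sub>) shrinks geometrically, and with it the difference between ω(*a*) and ω(*e<sub>F</sub>a*), which is the mechanism behind (4.32).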

We may characterize normality in a number of other ways. First note that because of the duality *B*<sub>1</sub>(*H*)<sup>∗</sup> ≅ *B*(*H*) of Theorem B.146, cf. (B.477), we may equip *B*(*H*) with the *w*<sup>∗</sup>-topology in its role as the dual of the trace-class operators *B*<sub>1</sub>(*H*), see §B.9; this means that *a*<sub>λ</sub> → *a* iff Tr(ρ*a*<sub>λ</sub>) → Tr(ρ*a*) for each ρ ∈ *B*<sub>1</sub>(*H*), or, equivalently, for each ρ ∈ D(*H*), since each trace-class operator is a linear combination of at most four density operators, as follows from Lemma C.53 with (C.8) - (C.9). The *w*<sup>∗</sup>-topology on *B*(*H*), seen as the dual of *B*<sub>1</sub>(*H*), is called the σ*-weak topology*. By Proposition B.46, the σ-weakly continuous linear functionals ϕ on *B*(*H*) are just those given by ϕ(*a*) = Tr(*ba*) for some trace-class operator *b* ∈ *B*<sub>1</sub>(*H*).

Secondly, *B*(*H*) is *monotone complete*, in the sense that each net (*a*<sub>λ</sub>) of positive operators that is bounded (i.e., 0 ≤ *a*<sub>λ</sub> ≤ *c* · 1<sub>*H*</sub> for some *c* > 0 and all λ ∈ Λ) and increasing (in that *a*<sub>λ</sub> ≤ *a*<sub>λ′</sub> whenever λ ≤ λ′) has a supremum *a* with respect to the standard ordering ≤ on *B*(*H*)<sub>+</sub>, which supremum coincides with the strong limit of the net (i.e., lim<sub>λ</sub> *a*<sub>λ</sub>ψ = *a*ψ for each ψ ∈ *H*); the proof is the same as for Proposition B.98, and also here we write *a*<sub>λ</sub> ↗ *a* to describe this entire situation.

Corollary 4.13. *The following conditions on a state* ω ∈ *S*(*B*(*H*)) *are equivalent:*


*Proof.* We have seen 1 ↔ 3 ↔ 4, and 2 → 1 is obvious, so establishing 3 → 2 would complete the proof. To this effect, we first note that because the sum (4.35) is convergent, for ε > 0 we may find a finite subset *F* ⊂ *I* for which ∑<sub>*i*∉*F*</sub> *p<sub>i</sub>* < ε/(2‖*a*‖) (assuming *a* ≠ 0). Since 0 ≤ *a*<sub>λ</sub> ≤ *a* also implies *a*<sub>λ</sub> ≤ ‖*a*‖ · 1<sub>*H*</sub> (since *a* ≤ ‖*a*‖ · 1<sub>*H*</sub>), we therefore have |∑<sub>*i*∉*F*</sub> *p<sub>i</sub>*⟨υ<sub>*i*</sub>,(*a*<sub>λ</sub> − *a*)υ<sub>*i*</sub>⟩| < 2ε/3, uniformly in λ. Moreover, since *F* is finite and *a*<sub>λ</sub> → *a* strongly, we can find λ<sub>0</sub> such that for all λ ≥ λ<sub>0</sub> we have

$$|\sum\_{i \in F} p\_i \langle \upsilon\_i, (a\_\lambda - a)\upsilon\_i \rangle| < \varepsilon/3. \tag{4.36}$$

Consequently, for such λ,

$$|\mathrm{Tr}\left(\rho(a\_{\lambda}-a)\right)| \leq |\sum\_{i \in F} p\_i \langle \upsilon\_i, (a\_{\lambda}-a)\upsilon\_i \rangle| + |\sum\_{i \notin F} p\_i \langle \upsilon\_i, (a\_{\lambda}-a)\upsilon\_i \rangle| < \frac{1}{3}\varepsilon + \frac{2}{3}\varepsilon = \varepsilon.$$

This shows that lim<sub>λ</sub> |Tr(ρ(*a*<sub>λ</sub> − *a*))| = 0, so that assumption 3 implies no. 2. □

We denote the *normal state space* of *B*(*H*), i.e., the set of all normal states on *B*(*H*), by *S<sub>n</sub>*(*B*(*H*)). It is easy to see from Definition B.148 that *S<sub>n</sub>*(*B*(*H*)) is a convex (but not necessarily compact!) subset of the total state space *S*(*B*(*H*)).

Corollary 4.14. *The relation* ω(*a*) = Tr(ρ*a*) *induces an isomorphism*

$$S\_n(B(H)) \cong \mathcal{D}(H) \tag{4.37}$$

*of convex sets (i.e.,* ω ↔ ρ*). Furthermore, for the corresponding pure states we have*

$$P\_n(B(H)) \cong \mathcal{P}\_1(H),\tag{4.38}$$

*i.e., any pure state* ω *on B*<sub>0</sub>(*H*)*, as well as any normal pure state on B*(*H*)*, is given by* ω = ω<sub>ψ</sub> *for some unit vector* ψ ∈ *H, where* ω<sub>ψ</sub>(*a*) = ⟨ψ,*a*ψ⟩*, cf.* (2.42)*.*

The proof of (4.38) is practically the same as in the finite-dimensional case. From Theorem B.146 we obtain another characterization of *Sn*(*B*(*H*)) and hence of D(*H*):

Corollary 4.15. *If B*0(*H*) *is the C\*-algebra of compact operators on H, we have*

$$S(B\_0(H)) = S\_n(B(H));\tag{4.39}$$

$$P(B\_0(H)) = P\_n(B(H)),\tag{4.40}$$

*in the sense that any (pure) state* ω *on B*0(*H*) *has a unique* normal *extension to a (pure) state* ω *on B*(*H*)*, given by the same density operator* ρ *that yields* ω*.*

It can be shown that any state ω ∈ *S*(*B*(*H*)) has a convex decomposition

$$
\omega = t\omega\_n + (1 - t)\omega\_s, \tag{4.41}
$$

where *t* ∈ [0,1], ω*<sup>n</sup>* is a normal state, and ω*<sup>s</sup>* is called a *singular state*. In particular, since for *t* ∈ (0,1) the state ω is mixed, *a pure state is either normal or singular*.

Singular states are not as aberrant as the terminology may suggest: such states are routinely used in the physics literature and are typically denoted by |λ⟩, where λ lies in the continuous spectrum of some self-adjoint operator (that has to be maximal for this notation to even begin to make sense, see §4.3 below). Examples of such "improper eigenstates" are |*x*⟩ and |*p*⟩, which many physicists regard as idealizations. However, mathematically such states are at least defined, namely as singular pure states on *B*(*H*). The key to the existence of such states lies in Proposition C.15 and its proof, which should be reviewed now; we only need the case *a*<sup>∗</sup> = *a*.

Proposition 4.16. *Let a* = *a*<sup>∗</sup> ∈ *B*(*H*) *have non-empty continuous spectrum, so that there is some* λ ∈ σ(*a*) *that is not an eigenvalue of a. Then* ωλ (*f*(*a*)) = *f*(λ) *defines a pure state on A* = *C*∗(*a*)*, whose extension to B*(*H*) *by any pure state is singular.*

*Proof.* Normal pure states on *B*(*H*) take the form ω<sub>ψ</sub>(*b*) = ⟨ψ,*b*ψ⟩, where ψ ∈ *H* is a unit vector and *b* ∈ *B*(*H*). We know from Proposition C.14 that ω<sub>λ</sub> is multiplicative on *C*<sup>∗</sup>(*a*). However, if some multiplicative state ω on *C*<sup>∗</sup>(*a*) has the form ω = ω<sub>ψ</sub>, then ψ must be an eigenvector of *a*; cf. the proof of Proposition 2.3. Since λ is not an eigenvalue of *a*, no normal pure state can extend ω<sub>λ</sub>. □

#### 4.3 The Kadison–Singer Conjecture

To obtain deeper insight into singular pure states, and as a matter of independent interest, we return to the Kadison–Singer problem, cf. §2.6. Recall that this problem asks if some abelian unital C\*-algebra *A* ⊂ *B*(*H*) has the *Kadison–Singer property*, stating that a pure state ω<sub>*A*</sub> on *A* has a *unique* pure extension ω to *B*(*H*). Here the issue is uniqueness rather than existence, since at least one such extension exists: since *A* is necessarily unital (with 1<sub>*A*</sub> = 1<sub>*H*</sub>) and ω<sub>*A*</sub> is a state on *A*, so that in particular ω<sub>*A*</sub>(1<sub>*A*</sub>) = ‖ω<sub>*A*</sub>‖ = 1, Corollary B.41 gives the existence of a bounded extension ω satisfying ω(1<sub>*H*</sub>) = ‖ω‖ = 1, which by Proposition C.5 is a state on *B*(*H*). Proposition 2.22 then gives the existence of a *pure* extension ω. As in the finite-dimensional case, the Kadison–Singer property forces *A* to be maximal (in the poset C(*B*(*H*)) of all abelian unital C\*-subalgebras of *B*(*H*), ordered by inclusion):

Proposition 4.17. *If some abelian unital C\*-subalgebra A of B*(*H*) *has the Kadison– Singer property, then A is necessarily maximal.*

*Proof.* We use the Gelfand isomorphism *A* ≅ *C*(*P*(*A*)), where *P*(*A*) is the pure state space of *A*, cf. Theorem C.8 and Proposition C.14. If *A* has the Kadison–Singer property and *A* ⊆ *B* ⊂ *B*(*H*), where *B* is an abelian unital C\*-subalgebra of *B*(*H*), then each pure state ω<sub>*A*</sub> on *A* has a unique pure extension ω on *B*(*H*), which restricts to some state ω<sub>*B*</sub> on *B*. The same reasoning as in the proof of Proposition 2.22 shows that ω<sub>*B*</sub> is a pure state on *B*, so that we obtain a unique map

$$P(A) \to P(B);\tag{4.42}$$

$$\omega\_A \mapsto \omega\_B.\tag{4.43}$$

The inverse of this map is simply the pullback of the inclusion *A* ↪ *B*, i.e., ω<sub>*B*</sub> ∈ *P*(*B*) defines ω<sub>*A*</sub> ∈ *P*(*A*) by restriction, so that we have a bijection *P*(*A*) ≅ *P*(*B*), ω<sub>*A*</sub> ↔ ω<sub>*B*</sub>. Since for any pair of C\*-algebras *A* ⊆ *B* the pullback *S*(*B*) → *S*(*A*) is continuous (in the pertinent *w*<sup>∗</sup>-topology), the map ω<sub>*B*</sub> ↦ ω<sub>*A*</sub> is continuous. As in Lemma C.20, this implies that it is in fact a homeomorphism, so that *A* ≅ *B* through the inclusion *A* ↪ *B*. This gives *A* = *B*, and hence *A* is maximal. □

Maximality of *A* implies *A*′ = *A*, so that *A* is a von Neumann algebra, sharing the unit of *B*(*H*). To see the relevance of singular states for the Kadison–Singer problem, we first settle the normal case. We know what it means for a state on *B*(*H*) to be normal (cf. Definition 4.11 and Corollary 4.13); for arbitrary von Neumann algebras *A* ⊂ *B*(*H*) the situation is exactly the same: we *define* normality by (4.28) and *characterize* it by the equivalent properties in Corollary 4.13, where the σ-weak topology on *A* may be defined either as the one inherited from *B*(*H*), or, more intrinsically, as the *w*<sup>∗</sup>-topology from the duality *A* = (*A*<sub>∗</sub>)<sup>∗</sup>, where the Banach space *A*<sub>∗</sub> is the so-called predual of *A*, e.g., (ℓ<sup>∞</sup>)<sub>∗</sub> ≅ ℓ<sup>1</sup> and *L*<sup>∞</sup>(0,1)<sub>∗</sub> = *L*<sup>1</sup>(0,1), cf. §B.9.

Theorem 4.18. *Let H be a separable Hilbert space and let* ω*<sup>A</sup> be a* normal *pure state on a maximal commutative unital C\*-algebra A in B*(*H*)*. Then* ω*<sup>A</sup> has a unique extension to a state* ω *on B*(*H*)*, which is necessarily pure and normal.*

*Proof.* As noted after (4.41), a pure state on *B*(*H*) is either normal or singular. The possibility that ω<sub>*A*</sub> is normal whereas ω is singular is excluded by Corollary 4.13.3, so ω must be normal and hence given by a density operator. The proof of uniqueness is then the same as in the finite-dimensional case, cf. Theorem 2.21. □

We now recall the classification of maximal abelian ∗-algebras (and hence of maximal abelian von Neumann algebras) *A* in *B*(*H*) up to unitary equivalence (cf. Theorem B.118). This classification is the relevant one for the Kadison–Singer problem, since, as is easily seen, *A* ⊂ *B*(*H*) has the Kadison–Singer property iff *uAu*<sup>−1</sup> ⊂ *B*(*uH*) has it. The uniqueness of the finite-dimensional case will be lost:

Theorem 4.19. *If H is separable and infinite-dimensional, and A* ⊂ *B*(*H*) *is a maximal abelian* ∗*-algebra, then A is unitarily equivalent to exactly one of the following:*

*1. L*<sup>∞</sup>(0,1) ⊂ *B*(*L*<sup>2</sup>(0,1))*;*

*2.* ℓ<sup>∞</sup> ⊂ *B*(ℓ<sup>2</sup>)*;*

*3. L*<sup>∞</sup>(0,1) ⊕ ℓ<sup>∞</sup>(κ) ⊂ *B*(*L*<sup>2</sup>(0,1) ⊕ ℓ<sup>2</sup>(κ))*,*

*where* ℓ<sup>∞</sup> ≡ ℓ<sup>∞</sup>(ℕ)*,* ℓ<sup>2</sup> ≡ ℓ<sup>2</sup>(ℕ)*, and* κ *is either* {1,...,*n*}*, in which case* ℓ<sup>2</sup>(κ) = ℂ<sup>*n*</sup> *and* ℓ<sup>∞</sup>(κ) = *D<sub>n</sub>*(ℂ)*, or* κ = ℕ*, in which case* ℓ<sup>2</sup>(κ) = ℓ<sup>2</sup> *and* ℓ<sup>∞</sup>(κ) = ℓ<sup>∞</sup>*.*

This classification sheds some more light on Theorem 4.18. Since *L*<sup>∞</sup>(0,1) has no pure normal states and *D<sub>n</sub>*(ℂ) has been dealt with in Theorem 2.21, the interesting case is ℓ<sup>∞</sup>. Using Corollary 4.13.3 (or the analysis below), it is easy to check that the normal pure states on ℓ<sup>∞</sup> are given by ω<sub>*A*</sub>(*f*) = *f*(*x*) for some *x* ∈ ℕ; these are vector states of the kind ω<sub>*A*</sub>(*f*) = ⟨ψ,*m<sub>f</sub>*ψ⟩ with ψ = δ<sub>*x*</sub>, or, in other words, they are given by ω<sub>*A*</sub>(*f*) = Tr(ρ*m<sub>f</sub>*) with ρ = |δ<sub>*x*</sub>⟩⟨δ<sub>*x*</sub>|. We now invoke a fairly deep result:

Proposition 4.20. *A pure state* ω *on B*(*H*) *is singular iff one (and hence all) of the following equivalent conditions is satisfied:*


One direction is easy: a normal pure state certainly does not satisfy the condition in question. For example, given (2.42) one may take *a* = |ψ⟩⟨ψ|, which as a one-dimensional projection lies in *B*<sub>0</sub>(*H*), so that ω<sub>ψ</sub>(*a*) = 1. We omit the other direction of the proof. We conclude from this proposition that a pure singular state on *B*(ℓ<sup>2</sup>) cannot restrict to a normal pure state on ℓ<sup>∞</sup>, which reconfirms Theorem 4.18.

We now study the Kadison–Singer property for each of the three cases in Theorem 4.19 (where the third will be an easy corollary of the first and the second). Since the proofs of the first two cases are formidable, we just sketch the argument.

Theorem 4.21. • *There exist (necessarily singular) pure states on L*<sup>∞</sup>(0,1) *that do not have a unique extension to B*(*L*<sup>2</sup>(0,1))*, and similarly for L*<sup>∞</sup>(0,1) ⊕ ℓ<sup>∞</sup>(κ)*.*

• *Any pure state on* ℓ<sup>∞</sup> *has a unique extension to B*(ℓ<sup>2</sup>)*.*

The statement about ℓ<sup>∞</sup> is the *Kadison–Singer Conjecture*, which dates from 1959 but was only proved in 2013. The first claim (which was already known to Kadison and Singer themselves) is equally remarkable, however, as is the contrast between the two parts of Theorem 4.21. In particular, Dirac's notation |λ⟩ may be ambiguous.

The key to the proof of the first claim lies in the choice of a total countable family of normal states on *L*<sup>∞</sup>(0,1), from which all pure states may be constructed by a limiting operation. Here we call a (countable) family (ω<sub>*n*</sub>)<sub>*n*∈ℕ</sub> of states on some C\*-algebra *A total* if, for any self-adjoint *a* ∈ *A*, the conditions ω<sub>*n*</sub>(*a*) ≥ 0 for each *n* imply *a* ≥ 0 (the converse is trivial). For example, the well-known *Haar basis* (*h<sub>n</sub>*) of *L*<sup>2</sup>(0,1) provides such a family. The functions forming this basis are defined via some bijection β between the set of pairs (*k*,*l*) and ℕ, e.g., β(*k*,*l*) = 2<sup>*k*</sup> + *l*, by

$$h\_n = \chi\_{\beta^{-1}(n)}, \ (n \in \mathbb{N} = \{1, 2, \ldots\});\tag{4.44}$$

$$\chi\_{k,l}(x) = 2^{k/2} g(2^k x - l), \ (k \in \mathbb{N} \cup \{0\}, 0 \le l < 2^k);\tag{4.45}$$

$$g(x) = 1\_{[0,1/2)}(x) - 1\_{[1/2,1]}(x).\tag{4.46}$$

Basic analysis then shows that the Haar functions *hn* form a basis of *L*2(0,1) and that the associated vector states ω*<sup>n</sup>* on *L*∞(0,1) form a total set, where obviously

$$\omega\_n(f) = \langle h\_n, m\_f h\_n \rangle = \int\_0^1 h\_n^2 f. \tag{4.47}$$
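As a rough numerical sanity check of (4.45)–(4.47), the following sketch evaluates the functions χ<sub>*k*,*l*</sub> at the midpoints of a dyadic grid, on which they are piecewise constant, so that integrals over (0,1) reduce to finite sums. It only tests orthonormality of the first few χ<sub>*k*,*l*</sub> and evaluates two of the associated vector states; the helper names (`g`, `chi`, `integrate`, `inner`, `omega`) and the grid resolution are illustrative assumptions, not notation from the text.

```python
import math

def g(x):
    """Function (4.46): +1 on [0, 1/2), -1 on [1/2, 1], 0 elsewhere."""
    if 0 <= x < 0.5:
        return 1.0
    if 0.5 <= x <= 1:
        return -1.0
    return 0.0

def chi(k, l, x):
    """Function (4.45): 2^(k/2) * g(2^k x - l), for 0 <= l < 2^k."""
    return 2 ** (k / 2) * g(2 ** k * x - l)

def integrate(func, m=6):
    """Midpoint rule on 2^m equal cells of (0, 1)."""
    n = 2 ** m
    return sum(func((j + 0.5) / n) for j in range(n)) / n

def inner(p, q):
    """L^2(0,1) inner product of chi_p and chi_q, with p and q pairs (k, l)."""
    return integrate(lambda x: chi(*p, x) * chi(*q, x))

def omega(p, f):
    """Vector state as in (4.47): integral of chi_p^2 * f."""
    return integrate(lambda x: chi(*p, x) ** 2 * f(x))

# All pairs (k, l) with k < 4; the grid (m = 6) is finer than their step size.
pairs = [(k, l) for k in range(4) for l in range(2 ** k)]
gram_ok = all(
    math.isclose(inner(p, q), 1.0 if p == q else 0.0, abs_tol=1e-12)
    for p in pairs for q in pairs
)
print(gram_ok)
print(omega((0, 0), lambda x: x), omega((1, 1), lambda x: x))
```

The state attached to χ<sub>1,1</sub> averages *f* over [1/2, 1) only, which illustrates how the states ω<sub>*n*</sub> localize on ever smaller dyadic intervals as *k* grows.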

The relevance of total sets to the conjecture is explained by the following lemma.

Lemma 4.22. *If T* ⊂ *S*(*A*) *is a total set of states on a unital C\*-algebra A, then*

$$S(A) = \text{co}(T)^{-};\tag{4.48}$$

$$P(A) \subseteq T^-,\tag{4.49}$$

*where* co(*T*)− *is the w*∗*-closure of the convex hull of T in A*∗ *or in S*(*A*)*.*

*Proof.* The inclusion co(*T*)<sup>−</sup> ⊆ *S*(*A*) is obvious, since *T* ⊆ *S*(*A*) and *S*(*A*) is a compact (and hence a closed) convex set. To prove the converse inclusion, suppose *a* = *a*<sup>∗</sup> ∈ *A* and *s* ∈ ℝ are such that ω(*a*) ≥ *s* for each ω ∈ *T*. Then ω(*a*−*s*· 1<sub>*A*</sub>) ≥ 0 for each ω ∈ *T*, so that *a*−*s*· 1<sub>*A*</sub> ≥ 0 because *T* is total, and hence ω(*a*) ≥ *s* for each ω ∈ *S*(*A*). Using Theorem B.43 (of Hahn–Banach type), this property would lead to a contradiction if *S*(*A*) were not contained in co(*T*)<sup>−</sup>.

The second claim, which is the one we will use, follows from the first through a corollary of the Krein–Milman Theorem B.50, stating that if *T* ⊂ *K* is any subset of a compact convex set *K* such that *K* = co(*T*)<sup>−</sup>, then ∂<sub>*e*</sub>*K* ⊆ *T*<sup>−</sup>. This corollary may be proved (by contradiction) from Theorem B.43 in a similar way. □

Our next aim is to get rid of the closure in (4.49). The Haar basis yields a map

$$h: \mathbb{N} \to S(L^{\infty}(0, 1)); \tag{4.50}$$

$$n \mapsto \omega\_n,\tag{4.51}$$

with image *T*, i.e., the set of Haar states. Since *S*(*A*) is a compact Hausdorff space (in its *w*<sup>∗</sup>-topology), the universal property (B.135) of the Čech–Stone compactification βℕ of ℕ implies that *h* extends (uniquely) to a continuous map

$$
\beta h: \beta\mathbb{N} \to S(A),
$$

whose image is compact and hence closed (since βN is compact). Since *T* = *h*(N) ⊂ *S*(*A*) we have *T* ⊆ β*h*(βN) and hence *T* <sup>−</sup> ⊆ β*h*(βN), so that, from (4.49),

$$P(L^{\infty}(0,1)) \subseteq \beta h(\beta\mathbb{N}).\tag{4.52}$$

Hence each pure state ω<sub>*c*</sub> ≡ ω<sub>*L*<sup>∞</sup>(0,1)</sub> on *L*<sup>∞</sup>(0,1) takes the form ω<sub>*c*</sub> = ω<sub>*c*</sub><sup>(*U*)</sup>, where

$$\omega\_c^{(U)}(f) = \lim\_{U} \omega\_n(f) = \bigcap\_{A \in U} \{ \omega\_n(f) \mid n \in A \}^-, \ f \in L^\infty(0, 1), \tag{4.53}$$

and *U* ∈ βℕ is some ultrafilter on ℕ, cf. (B.136). The point of this analysis, then, is that ω<sub>*c*</sub><sup>(*U*)</sup> can immediately be extended to *B*(*L*<sup>2</sup>(0,1)) by the same formula, i.e.,

$$\omega^{(U)}(a) = \lim\_{U} \omega\_n(a) = \bigcap\_{A \in U} \{ \omega\_n(a) \mid n \in A \}^{-}, \ a \in B(L^2(0, 1)), \tag{4.54}$$

where ω<sub>*n*</sub>(*a*) = ⟨*h<sub>n</sub>*,*ah<sub>n</sub>*⟩. If *L*<sup>∞</sup>(0,1) had the Kadison–Singer property, this would be the unique extension of ω<sub>*c*</sub><sup>(*U*)</sup>, and we will show that this leads to a contradiction.

Apart from the use of ultrafilters, the technically most challenging part of the argument disproving the Kadison–Singer property for *L*<sup>∞</sup>(0,1) is as follows. If *A* = *C*([0,1]), for any *f* ∈ *A* and any pure state ω ∈ *P*(*A*) there is some *x* ∈ [0,1] such that ω(*f*) = *f*(*x*); see Propositions C.14 and C.19. For *A* = *L*<sup>∞</sup>(0,1) the situation is not that simple due to measure zero complications. Nonetheless, it is easy to show that for each *positive f* ∈ *L*<sup>∞</sup>(0,1), each ω<sub>*c*</sub> ∈ *P*(*L*<sup>∞</sup>(0,1)), and each ε > 0 one has

$$\mu(\{x \in (0,1) \mid f(x) \in [\omega\_c(f) - \varepsilon, \omega\_c(f) + \varepsilon]\}) > 0, \tag{4.55}$$

where μ is Lebesgue measure on (0,1). Taking the projection

$$e = 1\_{\{x \in (0,1) \mid f(x) \in [\omega\_c(f) - \varepsilon/2, \omega\_c(f) + \varepsilon/2]\}},$$

it follows that for each positive *f* ∈ *L*<sup>∞</sup>(0,1), ω<sub>*c*</sub> ∈ *P*(*L*<sup>∞</sup>(0,1)) and ε > 0 there exists a projection *e* ∈ P(*L*<sup>∞</sup>(0,1)) with ω<sub>*c*</sub>(*e*) = 1 and ‖*ef* − *e*ω<sub>*c*</sub>(*f*)‖ < ε. Hard analysis then generalizes this property from *L*<sup>∞</sup>(0,1) to *B*(*L*<sup>2</sup>(0,1)), as follows:

Lemma 4.23. *If* ω<sub>*c*</sub> ∈ *P*(*L*<sup>∞</sup>(0,1)) *has a unique extension* ω *to B*(*L*<sup>2</sup>(0,1)) *(which is necessarily pure if it is unique), then for each a* ∈ *B*(*L*<sup>2</sup>(0,1)) *and* ε > 0 *there exists a projection e* ∈ P(*L*<sup>∞</sup>(0,1)) *with* ω<sub>*c*</sub>(*e*) = 1 *and*

$$\|ea - e\omega(a)\| < \varepsilon.\tag{4.56}$$

To derive a contradiction between (4.54) and (4.56), we use a bijection *b* : ℕ → ℕ that cyclically permutes the ordered subsets (2<sup>*k*</sup> +1,...,2<sup>*k*+1</sup>), *k* = 0,1,..., that is, (1,2), (3,4), (5,6,7,8), (9,...,16), etc. This bijection induces a unitary operator

$$
u: L^2(0,1) \to L^2(0,1);
\tag{4.57}$$

$$
u h\_n = h\_{b(n)},\tag{4.58}$$

which is easily shown to have the following properties:

$$\omega\_n(u) = 0, \ n \in \mathbb{N};\tag{4.59}$$

$$\|eue\| = 1, \ e \in \mathcal{P}(L^{\infty}(0,1)), e \neq 0. \tag{4.60}$$

To show that *L*<sup>∞</sup>(0,1) fails to have the Kadison–Singer property, suppose it does, so that any ω<sub>*c*</sub> ∈ *P*(*L*<sup>∞</sup>(0,1)) has a unique extension ω ∈ *P*(*B*(*L*<sup>2</sup>(0,1))). As already noted, we may then assume that ω<sub>*c*</sub> = ω<sub>*c*</sub><sup>(*U*)</sup>, as in (4.53), whilst ω = ω<sup>(*U*)</sup>, as in (4.54). Taking *a* = *u* then gives ω(*u*) = 0, see (4.59), so that ‖*eu*‖ < ε by (4.56). But this contradicts (4.60), since ‖*eue*‖ ≤ ‖*eu*‖, finishing the sketch of the proof of the first claim in Theorem 4.21. The remark about *L*<sup>∞</sup>(0,1) ⊕ ℓ<sup>∞</sup>(κ) follows from the one about *L*<sup>∞</sup>(0,1).

We now pass to the (even) more difficult case of ℓ<sup>∞</sup> ⊂ *B*(ℓ<sup>2</sup>). Although this will not be used in the proof, it gives some insight to know which states on ℓ<sup>∞</sup> we are actually talking about, i.e., the singular pure states, and compare this with (4.53).
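Property (4.59) is a purely combinatorial fact about *b*: by orthonormality of the Haar basis and (4.58), ⟨*h<sub>n</sub>*, *uh<sub>n</sub>*⟩ equals 1 if *b*(*n*) = *n* and 0 otherwise, so it vanishes for all *n* exactly when *b* has no fixed points. The sketch below builds *b* on the first few blocks, taken literally as they are listed above, and checks this; the function names and the cut-off `levels` are assumptions made for illustration.

```python
def blocks(levels):
    """Blocks (1,2), (3,4), (5..8), (9..16), ... as listed in the text."""
    out = [(1, 2)]
    for k in range(1, levels):
        out.append(tuple(range(2 ** k + 1, 2 ** (k + 1) + 1)))
    return out

def cyclic_shift(levels):
    """Map b sending each index to its cyclic successor within its block."""
    b = {}
    for blk in blocks(levels):
        for pos, n in enumerate(blk):
            b[n] = blk[(pos + 1) % len(blk)]
    return b

def omega_n_of_u(b, n):
    """<h_n, u h_n> = <h_n, h_b(n)>, which is 1 if b(n) = n and 0 otherwise."""
    return 1 if b[n] == n else 0

b = cyclic_shift(5)  # indices 1..32
values = [omega_n_of_u(b, n) for n in sorted(b)]
print(all(v == 0 for v in values))
```

Every block has at least two elements, so the cyclic shift moves each index; this is all that (4.59) uses, whereas (4.60) requires genuine analysis of the Haar functions and is not captured by this sketch.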

Theorem 4.24. *There is a bijective correspondence*

$$\omega\_d(f) = \int\_{\mathbb{N}} d\mu \, f \tag{4.61}$$

*between states* ω<sub>*d*</sub> *on* ℓ<sup>∞</sup> *and* finitely additive *probability measures* μ *on* ℕ*, where:*

	- ω*<sup>d</sup> is normal iff U is principal (and hence singular iff U is free).*

This follows from case no. 5 in §B.9, notably eqs. (B.153)–(B.154). In other words, the pure states ω<sub>*d*</sub> on ℓ<sup>∞</sup> are given by ultrafilters *U* on ℕ through

$$\omega\_d^{(U)}(f) = \beta f(U) = \lim\_{U} f(n);\tag{4.62}$$

the analogy with (4.53) is even clearer if we write *f*(*n*) = ⟨δ<sub>*n*</sub>,*m<sub>f</sub>*δ<sub>*n*</sub>⟩ ≡ ω<sub>*n*</sub>(*f*). If *U* = *U<sub>n</sub>* is a principal ultrafilter, *n* ∈ ℕ, we thus recover the normal pure states

$$\omega\_d^{(U\_n)}(f) = f(n). \tag{4.63}$$

As in (4.54), we find at least one natural extension ω<sup>(*U*)</sup> of ω<sub>*d*</sub><sup>(*U*)</sup> to *B*(ℓ<sup>2</sup>), namely

$$\omega^{(U)}(a) = \lim\_{U} \omega\_n(a). \tag{4.64}$$

We now show that ℓ<sup>∞</sup> has the Kadison–Singer property, making ω<sup>(*U*)</sup> the *only* extension of ω<sub>*d*</sub><sup>(*U*)</sup>. The proof relies on an extremely difficult lemma from linear algebra (formerly known as a *paving conjecture*). We first define a linear map *D* : *M<sub>n</sub>*(ℂ) → *D<sub>n</sub>*(ℂ) by *D*(*a*)<sub>*ii*</sub> = *a<sub>ii</sub>*, *i* = 1,...,*n*, and *D*(*a*)<sub>*ij*</sub> = 0 whenever *i* ≠ *j*.

Lemma 4.25. *For any* ε > 0 *there exists l* ∈ ℕ *such that for all n* ∈ ℕ *and a* ∈ *M<sub>n</sub>*(ℂ) *with D*(*a*) = 0*, there are l projections* (*e*<sub>1</sub>,..., *e<sub>l</sub>*) *in D<sub>n</sub>*(ℂ) *such that*

$$\sum\_{k=1}^{l} e\_k = 1\_n;\tag{4.65}$$

$$\|e\_i a e\_i\| \le \varepsilon \|a\|, \ i = 1, \ldots, l. \tag{4.66}$$

Since this estimate is uniform in *n*, the lemma extends to ℓ<sup>2</sup>, where *D* : *B*(ℓ<sup>2</sup>) → ℓ<sup>∞</sup> is defined analogously, i.e., *D*(*a*) is diagonal in the canonical basis (δ<sub>*n*</sub>) of ℓ<sup>2</sup> with

$$D(a)\delta\_n = \omega\_n(a)\delta\_n, \ n \in \mathbb{N}.\tag{4.67}$$

Lemma 4.26. *For any* ε > 0 *there exists l* ∈ ℕ *such that for all a* ∈ *B*(ℓ<sup>2</sup>) *with D*(*a*) = 0*, there are l projections* (*e*<sub>1</sub>,..., *e<sub>l</sub>*) *in* ℓ<sup>∞</sup> *such that*

$$\sum\_{k=1}^{l} e\_k = 1\_H;\tag{4.68}$$

$$\|e\_i a e\_i\| \le \varepsilon \|a\|, \ i = 1, \ldots, l. \tag{4.69}$$

Now suppose that ω<sub>*d*</sub> ∈ *P*(ℓ<sup>∞</sup>), that ω ∈ *S*(*B*(ℓ<sup>2</sup>)) extends ω<sub>*d*</sub>, and that *a* ∈ *B*(ℓ<sup>2</sup>) has *D*(*a*) = 0. Let *e<sub>i</sub>* be one of the projections in Lemma 4.26. Using Cauchy–Schwarz for the sesquilinear form (*a*,*b*) = ω(*a*<sup>∗</sup>*b*), we obtain (using *e<sub>i</sub>*<sup>2</sup> = *e<sub>i</sub>*<sup>∗</sup> = *e<sub>i</sub>*)

$$|\omega(e\_i a e\_j)|^2 \le \omega(e\_i)\omega(e\_j a^\* a e\_j);\tag{4.70}$$

$$|\omega(e\_i a e\_j)|^2 \le \omega(e\_i a a^\* e\_i)\omega(e\_j). \tag{4.71}$$

Since ω(*e<sub>i</sub>*) = ω<sub>*d*</sub>(*e<sub>i</sub>*) and ω<sub>*d*</sub> is a pure state (and hence is multiplicative), we have ω(*e<sub>i</sub>*) ∈ {0,1}, since *e<sub>i</sub>* is a projection. Moreover, in view of (4.68) and the normalization ω(1<sub>*H*</sub>) = 1, there must be exactly one value of *i* = 1,...,*l*, say *i* = *i*<sub>0</sub>, such that ω(*e*<sub>*i*<sub>0</sub></sub>) = 1, and ω(*e<sub>i</sub>*) = 0 for all *i* ≠ *i*<sub>0</sub>. Eqs. (4.70)–(4.71) therefore imply that ω(*e<sub>i</sub>ae<sub>j</sub>*) = 0 unless *i* = *j* = *i*<sub>0</sub>. Using (4.68) once more, we see that ω(*a*) = ∑<sub>*i*,*j*</sub> ω(*e<sub>i</sub>ae<sub>j</sub>*) = ω(*e*<sub>*i*<sub>0</sub></sub>*ae*<sub>*i*<sub>0</sub></sub>), so that |ω(*a*)| ≤ ‖ω‖ ‖*e*<sub>*i*<sub>0</sub></sub>*ae*<sub>*i*<sub>0</sub></sub>‖ ≤ 1 · ε‖*a*‖ by (4.69). Letting ε → 0, we proved:

Lemma 4.27. *If* ω ∈ *S*(*B*(ℓ<sup>2</sup>)) *extends* ω<sub>*d*</sub> ∈ *P*(ℓ<sup>∞</sup>)*, and D*(*a*) = 0*, then* ω(*a*) = 0*.*

Since *D*<sup>2</sup> = *D*, we have *D*(*a*−*D*(*a*)) = 0, so that for any *a* ∈ *B*(ℓ<sup>2</sup>), we have

$$\omega(a) = \omega(D(a)) = \omega\_d(D(a)),\tag{4.72}$$

provided that ω extends ω*d*, as before. This shows that ω is determined by ω*<sup>d</sup>* and hence is unique, completing the proof (sketch) of Theorem 4.21.
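To make the objects in this argument concrete, the sketch below implements the diagonal map *D* on small matrices, checks the identity *D*(*a* − *D*(*a*)) = 0 used for (4.72), and exhibits a deliberately trivial instance of paving: for the cyclic shift on an even number of basis vectors, the two diagonal projections onto even and odd indices already give *e<sub>i</sub>ae<sub>i</sub>* = 0. This is only an illustration under these special assumptions and says nothing about the general (hard) lemma; all function names are hypothetical.

```python
def diag_part(a):
    """D(a): keep the diagonal entries of a square matrix, zero the rest."""
    n = len(a)
    return [[a[i][j] if i == j else 0 for j in range(n)] for i in range(n)]

def subtract(a, b):
    """Entrywise difference a - b."""
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def compress(a, idx):
    """e a e for the diagonal projection e onto the basis vectors in idx."""
    n = len(a)
    return [[a[i][j] if i in idx and j in idx else 0 for j in range(n)]
            for i in range(n)]

def cyclic_shift_matrix(n):
    """Matrix sending basis vector j to basis vector (j + 1) mod n."""
    return [[1 if i == (j + 1) % n else 0 for j in range(n)] for i in range(n)]

def is_zero(a):
    return all(x == 0 for row in a for x in row)

# D(c - D(c)) = 0 for an arbitrary matrix c, because D is idempotent.
c = [[1, 2], [3, 4]]
off_diag = subtract(c, diag_part(c))

# A trivial paving: the shift maps even indices to odd ones and vice versa.
n = 6
a = cyclic_shift_matrix(n)
evens, odds = set(range(0, n, 2)), set(range(1, n, 2))
paved = [compress(a, evens), compress(a, odds)]
print(is_zero(diag_part(off_diag)), is_zero(diag_part(a)),
      all(is_zero(m) for m in paved))
```

Here two projections suffice for every ε > 0; the content of Lemma 4.25 is that some finite number *l*, depending only on ε, works for all matrices with vanishing diagonal.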

#### 4.4 Gleason's Theorem in arbitrary dimension

To a large extent the thrust and difficulty of the proof of Gleason's Theorem 2.28 already lies in its finite-dimensional version, but some care is needed in the general case, and also Corollary 2.29 needs to be refined. A major point here is that Definition 2.23 has no unambiguous generalization to arbitrary Hilbert spaces.

Definition 4.28. *Let H be an arbitrary Hilbert space with unit sphere H*1*.*

*1. A* probability distribution *on* P(*H*) *is a map p* : *H*<sub>1</sub> → [0,1] *that satisfies*

$$\sum\_{i \in I} p(\upsilon\_i) = 1,\text{ for any basis } (\upsilon\_i) \text{ of } H,\tag{4.73}$$

*where, as in* §*B.12, the sum (over a possibly uncountable index set) is meant as in Definition B.6. In particular, if H is separable and the basis is labeled and ordered by I* = ℕ*, then it is an ordinary convergent sum of the kind* ∑<sup>∞</sup><sub>*i*=1</sub> ···*.*

*2. A map P* : P(*H*) → [0,1] *that satisfies P*(1<sub>*H*</sub>) = 1 *is called a:*

*a.* finitely additive probability measure *if*

$$P\left(\sum\_{j\in J} e\_j\right) = \sum\_{j\in J} P(e\_j) \tag{4.74}$$

*for any* finite *collection* (*e<sub>j</sub>*)<sub>*j*∈*J*</sub> *of mutually orthogonal projections on H (i.e., e<sub>j</sub>H* ⊥ *e<sub>k</sub>H, or equivalently, e<sub>j</sub>e<sub>k</sub>* = 0*, whenever j* ≠ *k); this is equivalent to the condition P*(*e*+ *f*) = *P*(*e*) +*P*(*f*) *whenever ef* = 0*, cf. Definition 2.23.2.*


Thus a probability measure is by definition σ-additive in the usual sense of measure theory; the other two cases are unusual from that perspective. However, if *H* is separable, then *J* can be at most countable, so that complete additivity is the same as σ-additivity and hence any probability measure is completely additive. Surprisingly, assuming the *Continuum Hypothesis* (CH) of set theory, it can be shown that this is even the case for arbitrary Hilbert spaces. The fundamental distinction, then, is between *finitely* additive probability measures and probability measures (which by definition are *countably* additive). As we shall see, this reflects the distinction between *arbitrary* and *normal* states on *B*(*H*), respectively, cf. §4.2. In what follows, in dealing with non-separable Hilbert spaces we assume CH, in which case probability distributions on *H* are equivalent to probability measures on P(*H*).

The proof is the same as in finite dimension (taking into account that infinite sums over projections are defined strongly). Even without CH, Gleason's Theorem still holds for non-separable Hilbert spaces if we assume *P* to be completely additive, and probability distributions are equivalent to completely additive probability measures on P(*H*). For separable Hilbert spaces, CH is irrelevant and unnecessary altogether.

We then have the following generalization (and bifurcation) of Theorem 2.28.

Theorem 4.29. *Let H be a Hilbert space of dimension* > 2*.*

*1. Each probability measure P on* P(*H*) *is induced by a unique normal state on B*(*H*) *via* (2.122)*, i.e.,*

$$P(e) = \text{Tr}\left(\rho e\right),\tag{4.75}$$

*where* ρ *is a density operator on H uniquely determined by P. Equivalently, each probability distribution p on* P(*H*) *is given by* (2.123)*, or*

$$p(\upsilon) = \langle \upsilon, \rho\upsilon \rangle. \tag{4.76}$$

*Conversely, each density operator* ρ *on H defines a probability measure P on* P(*H*) *via* (4.75)*, as well as a probability distribution p on* P(*H*) *via* (4.76)*.*

*2. Each finitely additive probability measure P on* P(*H*) *is induced by a unique state* ω *on B*(*H*) *via*

$$P(e) = \omega(e),\tag{4.77}$$

*and similarly each probability distribution p on* P(*H*) *is given by*

$$p(\upsilon) = \omega(e\_{\upsilon}).\tag{4.78}$$

*Conversely, each state* ω *on B*(*H*) *defines a finitely additive probability measure P on* P(*H*) *via* (4.77)*, as well as a probability distribution p on* P(*H*) *via* (4.78)*.*

*Proof.* The proof of part 1 is practically the same as in finite dimension, except for the fact that in the proof of Lemma 2.33 the reference to Proposition A.23 should be replaced by Proposition B.79, upon which one obtains a bounded positive operator ρ for which (2.123) holds. The normalization condition (2.110) then yields Tr(ρ) = 1 if the trace is taken over any basis of *H*, and since ρ is positive this implies ρ ∈ *B*1(*H*), see §B.20 (complete additivity of *P* is just necessary to relate it to *p*).

Unfortunately, the proof of part 2 exceeds the scope of this book (see Notes). □
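The easy (converse) direction of part 1 can be checked by hand in a small example: for a density matrix ρ, the numbers ⟨υ, ρυ⟩ add up to Tr(ρ) = 1 over every orthonormal basis. The sketch below does this with exact rational arithmetic for one real 3 × 3 density matrix and three orthogonal bases; the matrix, the bases, and the convention of passing unnormalized vectors (normalized inside `p`) are assumptions chosen purely for illustration.

```python
from fractions import Fraction as F

# A real symmetric, positive matrix with trace 1 (an assumed example of rho).
rho = [[F(1, 2), F(1, 4), F(0)],
       [F(1, 4), F(1, 3), F(0)],
       [F(0), F(0), F(1, 6)]]

def p(v):
    """<v, rho v> / <v, v>: formula (4.76) for the unit vector v / |v| (real case)."""
    rho_v = [sum(rho[i][j] * v[j] for j in range(3)) for i in range(3)]
    return sum(v[i] * rho_v[i] for i in range(3)) / sum(x * x for x in v)

# Three orthogonal bases of R^3, given by unnormalized integer vectors.
bases = [
    [(1, 0, 0), (0, 1, 0), (0, 0, 1)],
    [(1, 1, 0), (1, -1, 0), (0, 0, 1)],
    [(1, 1, 1), (1, -1, 0), (1, 1, -2)],
]
totals = [sum(p(v) for v in basis) for basis in bases]
print(totals)
```

Gleason's Theorem is the far deeper converse statement that, in dimension greater than 2, every assignment *p* satisfying (4.73) arises in this way.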

In infinite dimension, Corollary 2.29 becomes more complicated, too; for one thing, Definition 2.26 of a quasi-state bifurcates into two possibilities. The one given still makes perfect sense and is natural from the point of view of Bohrification; to avoid confusion we call a map ω : *B*(*H*) → C satisfying the conditions in Definition 2.26 a *strong quasi-state*. In the context of Gleason's Theorem, a slightly different notion is appropriate: a *weak quasi-state* on *B*(*H*) satisfies Definition 2.26, except that linearity is only required on commutative C\*-algebras in *B*(*H*) of the form *C*∗(*a*), where *a* = *a*<sup>∗</sup> ∈ *B*(*H*) (these are *singly generated*). Since commutative unital C\*-subalgebras of *B*(*H*) are not necessarily singly generated, and a specific counterexample exists, weak quasi-states are not necessarily strong quasi-states.

Proposition 4.30. *The map* ω ↦ ω|<sub>P(*H*)</sub> *gives a bijective correspondence between weak quasi-states* ω *on B*(*H*) *and finitely additive probability measures on* P(*H*)*.*

*Proof.* For some finite family (*e*<sub>1</sub>,..., *e<sub>n</sub>*) of mutually orthogonal projections on *H*, add *e*<sub>0</sub> = 1<sub>*H*</sub> − ∑<sub>*j*</sub> *e<sub>j</sub>* if necessary and let *a* = ∑<sup>*n*</sup><sub>*j*=0</sub> λ<sub>*j*</sub>*e<sub>j</sub>*, with all λ<sub>*j*</sub> ∈ ℝ different. Then σ(*a*) = {λ<sub>0</sub>,...,λ<sub>*n*</sub>}, so that *C*<sup>∗</sup>(*a*) ≅ *C*(σ(*a*)) ≅ ℂ<sup>*n*+1</sup> (cf. Theorem B.94) coincides with the linear span of the projections *e<sub>j</sub>*. If ω is a weak quasi-state, then it is linear on *C*<sup>∗</sup>(*a*) and hence also on the *e<sub>j</sub>*, so that ω|<sub>P(*H*)</sub> is finitely additive.

Conversely, let μ be a finitely additive probability measure on P(*H*). If *a* = *a*<sup>∗</sup> ∈ *B*(*H*) is given, using the notation (B.328) we symbolically define ω on *a* by

$$
\omega(a) = \int\_{\sigma(a)} d\mu(e\_{\lambda}) \,\lambda. \tag{4.79}
$$

More precisely, for any ε > 0 we use Corollary B.104 to define ω<sub>ε</sub>(*a*) = ∑<sup>*n*</sup><sub>*i*=1</sub> λ<sub>*i*</sub>μ(*e*<sub>*A<sub>i</sub>*</sub>) and let ω(*a*) = lim<sub>ε→0</sub> ω<sub>ε</sub>(*a*); it follows from Lemma B.103 (or the theory underlying the Riemann–Stieltjes integral (4.79)) that this limit exists. Now let *b*, *c* ∈ *C*<sup>∗</sup>(*a*), so that *b* = *f*(*a*) and *c* = *g*(*a*) for certain *f*,*g* ∈ *C*(σ(*a*)), and *b*+*c* = (*f* +*g*)(*a*), cf. Theorem B.94. By (B.325) we therefore have ω<sub>ε</sub>(*b* + *c*) = ∑<sup>*n*</sup><sub>*i*=1</sub> (*f* + *g*)(λ<sub>*i*</sub>)μ(*e*<sub>*A<sub>i</sub>*</sub>), which, since (*f* + *g*)(λ<sub>*i*</sub>) = *f*(λ<sub>*i*</sub>) + *g*(λ<sub>*i*</sub>), again by (B.325) equals ω<sub>ε</sub>(*b*) + ω<sub>ε</sub>(*c*). Since this holds for every ε > 0, letting ε → 0 we obtain ω(*b*+*c*) = ω(*b*) +ω(*c*), making ω linear on *C*<sup>∗</sup>(*a*). It is clear that the quasi-state ω thus obtained, on restriction to P(*H*), reproduces μ, making the map ω ↦ ω|<sub>P(*H*)</sub> surjective. Finally, injectivity of this map follows from Corollary B.104. □

Corollary 4.31. *If* dim(*H*) > 2*, then each weak quasi-state on B*(*H*) *(and* a fortiori *each strong quasi-state) is linear and hence is actually a state.*

This is immediate from Theorem 4.29.2 and Proposition 4.30.

Another corollary of Gleason's Theorem is the *Kochen–Specker Theorem*, which we will explain in detail in Chapter 6, where it will also be proved in a different way.

Theorem 4.32. *If* dim(*H*) > 2*, there are no weak quasi-states* ω : *B*(*H*) → C *whose restriction to each C\*-subalgebra C*∗(*a*) ⊂ *B*(*H*) *is pure (where a* = *a*<sup>∗</sup> ∈ *B*(*H*)*).*

Equivalently, there are no nonzero maps ω : *B*(*H*)sa → R that are:


Cf. Definitions 6.1 and 6.3. To see that these conditions are equivalent to those stated in Theorem 4.32 (despite the impression that linearity on all commuting self-adjoint operators seems stronger than linearity on each *C*<sup>∗</sup>(*a*)), extend ω to ω : *B*(*H*) → ℂ by complex linearity, as in Definition 2.26.1, and note that dispersion-freeness implies positivity and hence continuity on each subalgebra *C*<sup>∗</sup>(*a*) (cf. Theorem C.52 and Lemma C.4). We then see that the two conditions just stated imply that ω is multiplicative on *C*<sup>∗</sup>(*a*), and hence pure, see Proposition C.14, which conversely implies that pure states on *C*<sup>∗</sup>(*a*) are dispersion-free. We now prove Theorem 4.32.

*Proof.* If *e* is a projection, then *e*<sup>2</sup> = *e*, so that ω(*e*<sup>2</sup>) = ω(*e*). Since ω is dispersion-free (as just explained), we also have ω(*e*<sup>2</sup>) = ω(*e*)<sup>2</sup>, whence ω(*e*)<sup>2</sup> = ω(*e*) and hence ω(*e*) ∈ {0,1}. Furthermore, since ω is a state by Corollary 4.31, we may apply the GNS-construction, see Theorem C.88 (whose notation we use). In particular, for any projection *e*, using the fact that π<sub>ω</sub>(*e*) = π<sub>ω</sub>(*e*)<sup>∗</sup>π<sub>ω</sub>(*e*), by (C.196) we have

$$\omega(e) = \langle \Omega\_{\omega}, \pi\_{\omega}(e)\Omega\_{\omega}\rangle = \|\pi\_{\omega}(e)\Omega\_{\omega}\|^2. \tag{4.80}$$

If ω(*e*) = 0, then π<sub>ω</sub>(*e*)Ω<sub>ω</sub> = 0 from the second equality. If ω(*e*) = 1, then π<sub>ω</sub>(*e*)Ω<sub>ω</sub> = Ω<sub>ω</sub> from the first equality and Cauchy–Schwarz (in which we have equality, so that π<sub>ω</sub>(*e*)Ω<sub>ω</sub> = *z*Ω<sub>ω</sub> for some *z* ∈ 𝕋, upon which (4.80) forces *z* = 1).

By the spectral theorem (e.g. in the form Corollary B.104) or the theory of von Neumann algebras, the linear span of P(*H*) is norm-dense in *B*(*H*). Since Ω<sub>ω</sub> is cyclic for π<sub>ω</sub>(*B*(*H*)) by the GNS-construction, it must be that *H*<sub>ω</sub> = ℂ · Ω<sub>ω</sub>, and hence π<sub>ω</sub>(*a*) = ω(*a*)· 1<sub>*H*<sub>ω</sub></sub> for any *a* ∈ *B*(*H*). Since π<sub>ω</sub>(*ab*) = π<sub>ω</sub>(*a*)π<sub>ω</sub>(*b*) by the GNS-construction, this gives ω(*ab*) = ω(*a*)ω(*b*) for all *a*,*b* ∈ *B*(*H*). However, such multiplicative states ω on *B*(*H*) cannot exist if dim(*H*) > 1. This is clear if ω is normal, cf. Proposition 2.10, so that the following argument (which also covers the normal case) is especially meant for the case where ω is singular.

1. If dim(*H*) = *n* < ∞, there are *n* one-dimensional projections (*e*1,..., *en*) such that ∑*<sup>j</sup> ej* = 1*H*. (indeed, we may assume that *B*(*H*) = *Mn*(C) and take diagonal matrices *e*<sup>1</sup> = diag(1,0,...,0), etc.). Now for any pair (*ei*, *e <sup>j</sup>*) there is some *v* ∈ *B*(*H*) (which by definition is a partial isometry) such that *ei* = *vv*∗, *ej* = *v*∗*v* (in the above case *ei* and *ej* are thus related if *vi j* = 1 and *vi <sup>j</sup>* = 0 otherwise). Hence

$$\mathfrak{o}(e\_l) = \mathfrak{o}(\nu \boldsymbol{\nu}^\*) = \mathfrak{o}(\nu)\mathfrak{o}(\boldsymbol{\nu}^\*) = \mathfrak{o}(\boldsymbol{\nu}^\* \boldsymbol{\nu}) = \mathfrak{o}(e\_j), \tag{4.81}$$

since ω is multiplicative. But ω is also additive, which implies

$$\sum\_{j=1}^{n} \omega(e\_j) = \omega\left(\sum\_{j=1}^{n} e\_j\right) = \omega(1\_H) = 1. \tag{4.82}$$

Since also ω(*e*<sub>*i*</sub>) ∈ {0,1}, eqs. (4.81)–(4.82) are clearly contradictory: by (4.81) all ω(*e*<sub>*j*</sub>) coincide, so their sum is either 0 or *n* > 1.
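The finite-dimensional step can be checked concretely. The following is a minimal sketch in plain Python; the helper names (`matrix_unit`, `matmul`, `dagger`, `consistent_assignments`) are ours and purely illustrative. It builds the matrix unit *v* with a single entry 1 in position (*i*, *j*), verifies *vv*<sup>∗</sup> = *e*<sub>*i*</sub> and *v*<sup>∗</sup>*v* = *e*<sub>*j*</sub>, and enumerates all {0,1}-valued assignments that are constant, as (4.81) demands, and sum to 1, as (4.82) demands.

```python
from itertools import product

def matrix_unit(n, i, j):
    """n x n matrix with a 1 in position (i, j) and zeros elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)] for r in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[c][r].conjugate() for c in range(n)] for r in range(n)]

def consistent_assignments(n):
    """Values (w(e_1), ..., w(e_n)) in {0,1} that are all equal and sum to 1."""
    return [w for w in product((0, 1), repeat=n)
            if len(set(w)) == 1 and sum(w) == 1]

n, i, j = 3, 0, 2
v = matrix_unit(n, i, j)
e_i = matmul(v, dagger(v))   # projection onto the i-th basis vector
e_j = matmul(dagger(v), v)   # projection onto the j-th basis vector
```

For *n* = 1 the list of consistent assignments is non-empty, matching the fact that the argument needs dim(*H*) > 1.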

2. If dim(*H*) = ∞, separable or not, a similar contradiction arises from the *halving lemma*, which states that there is a projection *e* and an operator *v* such that *e* = *vv*<sup>∗</sup>, 1<sub>*H*</sub> − *e* = *v*<sup>∗</sup>*v*. For example, in the separable case assume *H* = ℓ<sup>2</sup> and take *e* the projection onto the closed linear span ℓ<sup>2</sup><sub>*e*</sub> of the basis vectors (δ<sub>*x*</sub>) with *x* ∈ N even, so that 1<sub>*H*</sub> − *e* projects onto the closed linear span ℓ<sup>2</sup><sub>*o*</sub> of the basis vectors (δ<sub>*x*</sub>) with *x* ∈ N odd. Then ℓ<sup>2</sup> = ℓ<sup>2</sup><sub>*e*</sub> ⊕ ℓ<sup>2</sup><sub>*o*</sub>; take *v* = 0 on ℓ<sup>2</sup><sub>*e*</sub> and *v* : ℓ<sup>2</sup><sub>*o*</sub> → ℓ<sup>2</sup><sub>*e*</sub> any unitary operator. In general, a similar method works, for if *I* is a set indexing some basis of *H* one may find a subset *E* ⊂ *I* that has the same cardinality as its complement *I*\*E*, upon which ℓ<sup>2</sup>(*E*) ≅ ℓ<sup>2</sup>(*I*\*E*), cf. Theorem B.63.

Multiplicativity of ω then leads to a similar contradiction between the properties ω(*e*) = ω(1<sub>*H*</sub> − *e*), as in (4.81), and ω(*e*) + ω(1<sub>*H*</sub> − *e*) = ω(1<sub>*H*</sub>) = 1, as in (4.82): if ω(*e*) = 0 one finds 0 = 1, whereas ω(*e*) = 1 implies 2 = 1. □

#### Notes

## §4.1. The Born rule from Bohrification (II)

The Born measure (and its construction along the lines of this section) is well known in functional analysis, cf. Pedersen (1989), §4.5. For the Hamburger Moment Problem see, for example, Reed, M. & Simon, B. (1975), *Methods of Modern Mathematical Physics. Vol II. Fourier Analysis, Self-adjointness* (New York: Academic Press), Theorem X.4, p. 145 and Example 4, p. 205. In fact, the proof uses spectral theory! Corollary 4.6 was suggested by the treatment of the Born rule in Hall (2013). Definition 4.9 of the joint spectrum goes back (at least) to Arens (1961) and Hörmander (1966), §3.1.13.

## §4.2. Density operators and normal states

These are really results about von Neumann algebras and come from the pertinent literature; our proofs derive from Li (1992), §1.8 and Takesaki (2002), Ch. III.

## §4.3. The Kadison–Singer Conjecture

As already mentioned in the notes to §2.6, the Kadison–Singer Conjecture was first discussed in Kadison & Singer (1959) and was finally proved by Marcus, Spielman, & Srivastava (2014ab), following important intermediate contributions by e.g. Anderson (1979) and Weaver (2004). For an introduction including a complete proof see Stevens (2016), and for applications of the conjecture and its proof to other areas of mathematics see Casazza et al (2005) as well as Casazza & Tremain (2016). Proposition 4.20 is due to Glimm (1960).

## §4.4. Gleason's Theorem in arbitrary dimension

The extension of Gleason's Theorem to non-separable Hilbert space assuming complete additivity of *P* is due to Maeda (1980). Maeda (1990) generalizes this result to von Neumann algebras without summands of type I<sub>2</sub>. The proof that, assuming CH, countable additivity implies complete additivity (and hence Gleason's Theorem) was given by Eilers & Horst (1975). Proposition 4.30 is due to Aarens (1970), whose Theorem 1 is wrong: see Aarens (1991). The proof of Theorem 4.32 is due to Döring (2004), using results of Hamhalter (1993).

## Chapter 5 Symmetry in quantum mechanics

Roughly speaking, a *symmetry* of some mathematical object is an invertible transformation that leaves all relevant structure as it is. Thus a symmetry of a set is just a bijection (as sets have no further structure, whence invertibility is the only demand on a symmetry), a symmetry of a topological space is a homeomorphism, a symmetry of a Banach space is a linear isometric isomorphism, and, crucially important for this chapter, a symmetry of a Hilbert space *H* is a *unitary operator*, i.e., a linear map *u* : *H* → *H* satisfying one and hence all of the following equivalent conditions:


The discussion of symmetries in quantum physics is based on the above idea, but the mathematically obvious choices need not be the physically relevant ones. Even in elementary quantum mechanics, where *A* = *B*(*H*), i.e., the C\*-algebra of all bounded operators on some Hilbert space *H*, the concept of a symmetry is already diverse. The main structures whose symmetries we shall study in this chapter are:


Each of these structures comes with its own notion of a symmetry, but the main point of this chapter will be to show that these notions are equivalent, corresponding in all cases to either unitary or, surprisingly, *anti-unitary* operators, both merely defined up to a phase. The latter subtlety will open the world of *projective* unitary group representations to quantum mechanics (without which the existence of spin-½ particles such as electrons, and therewith also of ourselves, would be impossible).

#### 5.1 Six basic mathematical structures of quantum mechanics

We first recall the objects just described in a bit more detail. We have:

$$\mathcal{P}\_1(H) = \{ e \in B(H) \mid e^2 = e^\* = e, \operatorname{Tr}(e) = \dim(eH) = 1\};\tag{5.1}$$

$$\mathcal{D}(H) = \{ \rho \in B(H) \mid \rho \ge 0, \operatorname{Tr}(\rho) = 1 \};\tag{5.2}$$

$$B(H)\_{\rm sa} = \{ a \in B(H) \mid a^\* = a \};\tag{5.3}$$

$$\mathcal{E}(H) = \{ a \in B(H) \mid 0 \le a \le 1\_H \};\tag{5.4}$$

$$\mathcal{P}(H) = \{ e \in B(H) \mid e^2 = e^\* = e \};\tag{5.5}$$

$$\mathcal{C}(B(H)) = \{ C \subset B(H) \mid C \text{ commutative C}^\*\text{-algebra}, 1\_H \in C \}. \tag{5.6}$$

The point is that each of these sets has some additional structure that defines what it means to be a symmetry of it, as we now spell out in detail.

Definition 5.1. *Let H be a Hilbert space (not necessarily finite-dimensional).*

*1. A* Wigner symmetry *(of H) is a bijection*

$$\mathsf{W} \colon \mathcal{P}\_1(H) \to \mathcal{P}\_1(H) \tag{5.7}$$

*that satisfies*

$$\operatorname{Tr}(\mathsf{W}(e)\mathsf{W}(f)) = \operatorname{Tr}(ef),\ e,f \in \mathcal{P}\_1(H). \tag{5.8}$$

*2. A* Kadison symmetry *is an* affine *bijection*

$$
\mathsf{K}: \mathcal{D}(H) \to \mathcal{D}(H),
\tag{5.9}
$$

*i.e. a bijection* K *that preserves convex sums: for t* ∈ (0,1) *and* ρ<sub>1</sub>, ρ<sub>2</sub> ∈ D(*H*)*,*

$$
\mathsf{K}(t\rho\_1 + (1-t)\rho\_2) = t\mathsf{K}\rho\_1 + (1-t)\mathsf{K}\rho\_2.\tag{5.10}
$$

*3. a. A* Jordan symmetry *is an invertible* Jordan map

$$\mathsf{J} : B(H)\_{\text{sa}} \to B(H)\_{\text{sa}},\tag{5.11}$$

*i.e., an* R*-linear bijection that satisfies the equivalent conditions*

$$\mathbb{J}(a \circ b) = \mathbb{J}(a) \circ \mathbb{J}(b);\tag{5.12}$$

$$\mathbb{J}(a^2) = \mathbb{J}(a)^2. \tag{5.13}$$

*Here*

$$a \circ b = \frac{1}{2}(ab + ba) \tag{5.14}$$

*is the* Jordan product *on B*(*H*)sa*, which turns the (real) vector space B*(*H*)sa *into a* Jordan algebra*, cf.* §*C.25.*

*b. A* weak Jordan symmetry *is an invertible* weak Jordan map*, i.e., a bijection* (5.11) *of which the restriction* J|<sub>*C*<sub>sa</sub></sub> *is a Jordan map for each C* ∈ C(*B*(*H*))*.*

*4. A* Ludwig symmetry *is an affine order isomorphism*

$$
\mathsf{L}: \mathscr{E}(H) \to \mathscr{E}(H). \tag{5.15}
$$

*5. A* von Neumann symmetry *is an order isomorphism*

$$\mathbb{N}: \mathcal{P}(H) \to \mathcal{P}(H) \tag{5.16}$$

*preserving orthocomplementation, i.e.* N(1 − *e*) = 1 − N(*e*) *for each e* ∈ P(*H*)*.*

*6. A* Bohr symmetry *is an order isomorphism*

$$\mathcal{B}: \mathcal{C}(B(H)) \to \mathcal{C}(B(H)).\tag{5.17}$$

In nos. 4–6, an *order isomorphism* O of the given poset is a bijection that preserves the partial order ≤ (i.e., if *x* ≤ *y*, then O(*x*) ≤ O(*y*)) and whose inverse O<sup>−1</sup> does so, too; cf. §D.1. The names in question have been chosen for historical reasons and (except perhaps for the first and third) are not standard.
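The equivalence of (5.12) and (5.13) rests on the polarization identity *a* ∘ *b* = ½((*a* + *b*)<sup>2</sup> − *a*<sup>2</sup> − *b*<sup>2</sup>). As a small illustration (a sketch in plain Python on 2 × 2 matrices; the sample matrices `a`, `b` and all helper names are our own choices), the code below checks this identity and shows that transposition, a standard example of a Jordan map, preserves the Jordan product while reversing ordinary products.

```python
def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[a[r][c] + sign * b[r][c] for c in range(2)] for r in range(2)]

def scale(t, a):
    return [[t * a[r][c] for c in range(2)] for r in range(2)]

def jordan(a, b):
    """Jordan product (5.14): a o b = (ab + ba) / 2."""
    return scale(0.5, add(matmul(a, b), matmul(b, a)))

def transpose(a):
    return [[a[c][r] for c in range(2)] for r in range(2)]

# Two self-adjoint 2x2 matrices that do not commute.
a = [[1, 2 - 1j], [2 + 1j, 3]]
b = [[0, 1j], [-1j, 2]]

# Polarization: a o b = ((a + b)^2 - a^2 - b^2) / 2, so preserving squares
# (together with linearity) already forces preservation of the Jordan product.
s = add(a, b)
polar = scale(0.5, add(add(matmul(s, s), matmul(a, a), -1), matmul(b, b), -1))

# Transposition respects the Jordan product ...
jordan_preserved = transpose(jordan(a, b)) == jordan(transpose(a), transpose(b))
# ... but is not multiplicative: (ab)^T = b^T a^T, not a^T b^T.
multiplicative = transpose(matmul(a, b)) == matmul(transpose(a), transpose(b))
```

The entries are Gaussian integers, so all comparisons here are exact; for generic floating-point input one would compare up to a tolerance instead.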

Let us note that any Jordan map has a unique extension to a C-linear map

$$\mathbb{J}\_{\mathbb{C}}: \mathcal{B}(H) \to \mathcal{B}(H);\tag{5.18}$$

$$\mathsf{J}\_{\mathbb{C}}(a^\*) = \mathsf{J}\_{\mathbb{C}}(a)^\*,\tag{5.19}$$

which satisfies (5.12) for all *a*,*b*, as well as

$$\mathbb{J}\_{\mathbb{C}}(a+ib) = \mathbb{J}(a) + i\mathbb{J}(b),\tag{5.20}$$

with notation as in Proposition 2.6. Conversely, such a Jordan map (5.18) defines a real Jordan map (5.11) by J = J<sub>C</sub>|<sub>*B*(*H*)<sub>sa</sub></sub>. Similarly, a weak Jordan symmetry is equivalent to a map (5.18) that satisfies (5.19), preserves squares as in (5.13), and is linear on each subspace *C* of *B*(*H*), with *C* ∈ C(*B*(*H*)). In other words (in the spirit of Bohrification), J<sub>C</sub> is a homomorphism of C\*-algebras on each commutative unital C\*-subalgebra *C* ⊂ *B*(*H*). Therefore, either way J and J<sub>C</sub> are essentially the same thing, and if no confusion may arise we call it J. Note that a weak Jordan map J *a priori* satisfies (5.12) only for *commuting* self-adjoint *a* and *b*. It follows that weak (and hence ordinary) Jordan symmetries are unital: since

$$\mathbb{J}(b) = \mathbb{J}(1\_H \circ b) = \mathbb{J}(1\_H) \circ \mathbb{J}(b) \tag{5.21}$$

for any *b*, we may pick *b* = J<sup>−1</sup>(1<sub>*H*</sub>) to find, reading (5.21) from right to left,

$$\mathbb{J}(1\_H) = \mathbb{J}(1\_H) \circ 1\_H = 1\_H. \tag{5.22}$$

The special role of unitary operators *u* now emerges: each such operator defines the relevant symmetry in the obvious way, namely, in order of appearance:

$$\mathsf{W}(e) = ueu^\*;\tag{5.23}$$

$$\mathsf{K}(\rho) = u\rho u^\*;\tag{5.24}$$

$$\mathsf{L}(a) = uau^\*; \tag{5.25}$$

$$\mathsf{J}(a) = uau^\*;\tag{5.26}$$

$$\mathsf{N}(e) = ueu^\*;\tag{5.27}$$

$$\mathsf{B}(C) = uCu^\*,\tag{5.28}$$

where *a*<sup>∗</sup> = *a* in (5.26). If not, this formula remains valid also for the map J<sub>C</sub>. Furthermore, in (5.28) the notation *uCu*<sup>∗</sup> is shorthand for the set {*uau*<sup>∗</sup> | *a* ∈ *C*}, which is easily seen to be a member of C(*B*(*H*)). Here, as well as in the other cases, it is easy to verify that the right-hand side belongs to the required set, that is,

$$ueu^\* \in \mathcal{P}\_1(H), \ u\rho u^\* \in \mathcal{D}(H), \ uau^\* \in \mathcal{E}(H),\tag{5.29}$$

$$uau^\* \in B(H)\_{\text{sa}}, \ ueu^\* \in \mathcal{P}(H), \ uCu^\* \in \mathcal{C}(B(H)), \tag{5.30}$$

respectively, provided, of course, that

$$e \in \mathcal{P}\_1(H), \ \rho \in \mathcal{D}(H), \ a \in \mathcal{E}(H), \ a \in B(H)\_{\text{sa}}, \ e \in \mathcal{P}(H), \ C \in \mathcal{C}(B(H)).$$

Indeed, if, in (5.23), *e* = *e*<sub>ψ</sub> = |ψ⟩⟨ψ| for some unit vector ψ ∈ *H*, then

$$
u e\_{\psi} u^\* = e\_{u\psi}.\tag{5.31}
$$

If ρ ≥ 0 in that ⟨ψ, ρψ⟩ ≥ 0 for each ψ ∈ *H*, then clearly also *u*ρ*u*<sup>∗</sup> ≥ 0, and if Tr(ρ) = 1, then also Tr(*u*ρ*u*<sup>∗</sup>) = 1. If *a*<sup>∗</sup> = *a*, then

$$(uau^\*)^\* = u^{\*\*}a^\*u^\* = uau^\*.\tag{5.32}$$
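Relation (5.31) is easy to test numerically. The sketch below (plain Python; the unitary `u` and the unit vector `psi` are arbitrary sample choices of ours) compares *ue*<sub>ψ</sub>*u*<sup>∗</sup> with the projection onto *u*ψ entry by entry.

```python
def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def dagger(a):
    return [[a[c][r].conjugate() for c in range(2)] for r in range(2)]

def apply(m, psi):
    """Matrix-vector product m psi."""
    return [sum(m[r][c] * psi[c] for c in range(2)) for r in range(2)]

def outer(psi):
    """Rank-one projection e_psi = |psi><psi| for a unit vector psi."""
    return [[a * b.conjugate() for b in psi] for a in psi]

u = [[0, 1], [1j, 0]]     # sample unitary: its columns are orthonormal
psi = [0.6, 0.8j]         # sample unit vector in C^2

lhs = matmul(matmul(u, outer(psi)), dagger(u))   # u e_psi u*
rhs = outer(apply(u, psi))                       # e_{u psi}
max_deviation = max(abs(lhs[r][c] - rhs[r][c]) for r in range(2) for c in range(2))
```

Up to rounding, `max_deviation` vanishes, and the trace of `lhs` stays equal to 1.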

However, one may also choose *u* in these formulae to be *anti-unitary*, as follows:

Definition 5.2. *1. A real-linear operator u* : *H* → *H is* anti-linear *if*

$$
u(z\psi) = \overline{z}\,u\psi \quad (z \in \mathbb{C},\ \psi \in H).\tag{5.33}
$$

*2. An anti-linear operator u* : *H* → *H is* anti-unitary *if it is invertible, and*

$$
\langle u\varphi, u\psi \rangle = \overline{\langle \varphi, \psi \rangle} \quad (\varphi, \psi \in H). \tag{5.34}
$$

*The adjoint u*<sup>∗</sup> *of a (bounded) anti-linear operator u is defined by the property*

$$\langle u^\*\varphi, \psi \rangle = \overline{\langle \varphi, u\psi \rangle} \quad (\varphi, \psi \in H), \tag{5.35}$$

in which case *u*<sup>∗</sup> is anti-linear, too. Hence we may equally well say that an anti-linear operator is anti-unitary if *uu*<sup>∗</sup> = *u*<sup>∗</sup>*u* = 1<sub>*H*</sub>. The simplest example is the map


$$J: \mathbb{C}^n \to \mathbb{C}^n;$$

$$Jz = \overline{z},\tag{5.36}$$

i.e., if *z* = (*z*<sub>1</sub>,..., *z*<sub>*n*</sub>) ∈ C<sup>*n*</sup>, then (*Jz*)<sub>*i*</sub> = z̄<sub>*i*</sub>. Similarly, one may define

$$J: \ell^2 \to \ell^2;$$

$$J\Psi = \overline{\Psi},\tag{5.37}$$

and likewise on *L*2, where complex conjugation is defined pointwise, that is,

$$(J\Psi)(\mathbf{x}) = \overline{\Psi(\mathbf{x})}.\tag{5.38}$$

For any Hilbert space one may pick a basis (υ<sub>*i*</sub>) and define *J* relative to this basis by

$$J\left(\sum\_{i} c\_{i}\upsilon\_{i}\right) = \sum\_{i} \overline{c}\_{i}\upsilon\_{i}.\tag{5.39}$$

For future use, we state two obvious facts.

Proposition 5.3. *1. The product of two anti-unitary operators is unitary.*

*2. Any anti-unitary operator u* : *H* → *H takes the form u* = *Jv, where v is unitary and J is an anti-unitary operator on H of the kind constructed above.*
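The defining properties (5.33)–(5.34) and the first part of Proposition 5.3 can be illustrated on C<sup>2</sup>. This is a minimal sketch in plain Python; the inner product is taken to be linear in its second entry (an assumption on conventions that does not affect the checks), and the vectors, the scalar `z` and the unitary `v` are arbitrary sample data.

```python
def inner(phi, psi):
    """Inner product on C^n, taken linear in the second entry (assumed convention)."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

def J(psi):
    """Complex conjugation (5.36): (J z)_i = conj(z_i)."""
    return [c.conjugate() for c in psi]

def apply(m, psi):
    return [sum(m[r][c] * psi[c] for c in range(len(psi))) for r in range(len(m))]

phi = [1 + 2j, 3 - 1j]
psi = [2 - 1j, 1j]
z = 2 + 3j

anti_linear = J([z * c for c in psi]) == [z.conjugate() * c for c in J(psi)]  # (5.33)
anti_unitary = inner(J(phi), J(psi)) == inner(phi, psi).conjugate()           # (5.34)
involutive = J(J(psi)) == psi                                                 # J^2 = 1

# A second anti-unitary map of the form u = J v, with v unitary (a coordinate swap).
v = [[0, 1], [1, 0]]

def u(vec):
    return J(apply(v, vec))

# The composition u J is complex-linear, as Proposition 5.3.1 asserts.
product_linear = u(J([z * c for c in psi])) == [z * c for c in u(J(psi))]
```

All sample entries are Gaussian integers, so the equalities are exact rather than approximate.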

It is an easy verification that (5.23)–(5.28) still define symmetries if *u* is anti-unitary. Note that in terms of the complexification J<sub>C</sub>, eq. (5.26) should read

$$\mathbb{J}\_{\mathbb{C}}(a) = u a^\* u^\*. \tag{5.40}$$

The goal of the following sections is to show that these are the only possibilities:

Theorem 5.4. *Let H be a Hilbert space, with* dim(*H*) > 1*.*


*where in all cases the operator u is either unitary or anti-unitary, and is uniquely determined by the symmetry in question up to a phase (that is, u and u*′ *implement the same symmetry by conjugation iff u*′ = *zu, where z* ∈ T*).*

As we shall see, the reason why the case *H* = C<sup>2</sup> is exceptional with regard to weak Jordan symmetries, von Neumann symmetries, and Bohr symmetries is that in those cases the proof relies on Gleason's Theorem, which fails for *H* = C<sup>2</sup>.

To see this more explicitly, and also to prove the positive cases (i.e., nos. 1–4a) in a simple situation without invoking higher principles, before proving Theorem 5.4 in general it is instructive to first illustrate it in the two-dimensional case *H* = C2.

## 5.2 The case *H* = C<sup>2</sup>

We start with some background. Any complex 2×2 matrix *a* can be written as

$$a = a(x\_0, x\_1, x\_2, x\_3) = \frac{1}{2} \sum\_{\mu=0}^{3} x\_{\mu} \sigma\_{\mu} \ (x\_{\mu} \in \mathbb{C});\tag{5.41}$$

$$
\sigma\_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \ \sigma\_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \ \sigma\_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \ \sigma\_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{5.42}
$$

i.e., the *Pauli matrices*. Furthermore, if we equip the vector space *M*<sub>2</sub>(C) of complex 2 × 2 matrices with the canonical inner product (2.34), then the rescaled matrices σ′<sub>μ</sub> = σ<sub>μ</sub>/√2 form a basis (≡ orthonormal basis) of the ensuing Hilbert space.

Writing **x** = (*x*<sub>1</sub>, *x*<sub>2</sub>, *x*<sub>3</sub>), some interesting special cases are:
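The coefficients in (5.41) can be recovered as *x*<sub>μ</sub> = Tr(σ<sub>μ</sub>*a*), because Tr(σ<sub>μ</sub>σ<sub>ν</sub>) = 2δ<sub>μν</sub>. A short sketch in plain Python (the function names and the sample matrix `a` are ours) decomposes a matrix and reassembles it:

```python
SIGMA = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def pauli_coefficients(a):
    """Coefficients x_mu of a = (1/2) sum_mu x_mu sigma_mu, via x_mu = Tr(sigma_mu a)."""
    return [trace(matmul(s, a)) for s in SIGMA]

def from_coefficients(x):
    """Rebuild a = (1/2) sum_mu x_mu sigma_mu as in (5.41)."""
    return [[sum(x[m] * SIGMA[m][r][c] for m in range(4)) / 2 for c in range(2)]
            for r in range(2)]

a = [[1 + 1j, 2], [3j, 4 - 2j]]      # arbitrary complex 2x2 matrix
x = pauli_coefficients(a)
gram = [[trace(matmul(s, t)) for t in SIGMA] for s in SIGMA]   # Tr(sigma_mu sigma_nu)
```

The Gram matrix `gram` is twice the identity, which is why dividing each σ<sub>μ</sub> by √2 yields an orthonormal basis.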


The first case follows because *SU*(2) consists of all matrices of the form

$$\begin{pmatrix} \alpha & \beta \\ -\overline{\beta} & \overline{\alpha} \end{pmatrix}, \ \alpha, \beta \in \mathbb{C}, \ |\alpha|^2 + |\beta|^2 = 1. \tag{5.43}$$

The second case is obvious, and the third follows from Proposition 2.9.

Assume the third case, so that *a* = *e* with *e*<sup>2</sup> = *e*<sup>∗</sup> = *e* and Tr(*e*) = 1. If a linear map *u* : C<sup>2</sup> → C<sup>2</sup> is unitary, then simple computations show that *e*′ = *ueu*<sup>∗</sup> is a one-dimensional projection, too, given by *e*′ = ½ ∑<sub>μ=0</sub><sup>3</sup> *x*′<sub>μ</sub>σ<sub>μ</sub> with *x*′<sub>0</sub> = 1, **x**′ ∈ R<sup>3</sup>, and ‖**x**′‖ = 1. Writing **x**′ = *R***x** for some map *R* : *S*<sup>2</sup> → *S*<sup>2</sup>, we have

$$
u(\mathbf{x} \cdot \boldsymbol{\sigma})u^\* = (R\mathbf{x}) \cdot \boldsymbol{\sigma}, \tag{5.44}
$$

where **x** · σ = ∑<sub>*j*=1</sub><sup>3</sup> *x*<sub>*j*</sub>σ<sub>*j*</sub>. This also shows that *R* extends to a linear isometry *R* : R<sup>3</sup> → R<sup>3</sup>. Using the formula Tr(σ<sub>*i*</sub>σ<sub>*j*</sub>) = 2δ<sub>*ij*</sub>, the matrix form of *R* follows as

$$R\_{ij} = \frac{1}{2} \operatorname{Tr} \left( u \sigma\_i u^\* \sigma\_j \right). \tag{5.45}$$

Define *U*(2) as the (connected) group of all unitary 2×2 matrices (whose connected subgroup *SU*(2) of elements with unit determinant has just been mentioned). Also, recall that *O*(3) is the group of all real orthogonal 3×3 matrices *M*, a condition that may be expressed in (at least) four equivalent ways (like unitarity):

$$\bullet \quad MM^T = M^T M = 1\_3;$$


This implies det(*M*) = ±1, since det(*M*)<sup>2</sup> = det(*M*) det(*M*<sup>T</sup>) = det(*MM*<sup>T</sup>) = 1. Thus *O*(3) breaks up into two parts *O*<sub>±</sub>(3) = {*R* ∈ *O*(3) | det(*R*) = ±1}, of which *O*<sub>+</sub>(3) ≡ *SO*(3) consists of rotations. Using an explicit parametrization of *SO*(3), e.g., through Euler angles, or using surjectivity of the exponential map (from the Lie algebra of *SO*(3), which consists of anti-symmetric real matrices), it follows that *O*<sub>±</sub>(3) are precisely the two connected components of *O*(3), the identity of course lying in *O*<sub>+</sub>(3).

Proposition 5.5. *The map u* → *R defined by* (5.44) *is a homomorphism from U*(2) *onto SO*(3)*. In terms of SU*(2) ⊂ *U*(2)*, this map restricts to a two-fold covering*

$$\tilde{\pi}: SU(2) \to SO(3),\tag{5.46}$$

*with discrete kernel*

$$\ker(\tilde{\pi}) = \{1\_2, -1\_2\}.\tag{5.47}$$

*Proof.* As a finite-dimensional linear isometry, *R* is automatically invertible (this also follows from unitarity and hence invertibility of *u*), hence *R* ∈ *O*(3). It is obvious from (5.44) that *u* ↦ *R* is a continuous homomorphism (of groups). Since *U*(2) is connected and *u* ↦ *R* is continuous, *R* must lie in the connected component of *O*(3) containing the identity, whence *R* ∈ *SO*(3). To show surjectivity of π̃, take some unit vector **u** ∈ R<sup>3</sup> and define *u* = cos(½θ) + *i* sin(½θ) **u** · σ. The corresponding rotation *R*<sub>θ</sub>(**u**) is the one around **u** by an angle θ, and such rotations generate *SO*(3).

Finally, it follows from (5.44) that *u* ∈ ker(π̃) iff *u* commutes with each σ<sub>*i*</sub> and hence, by (5.41), with all matrices. Therefore, *u* = *z* · 1<sub>2</sub> for some *z* ∈ C, upon which the condition det(*u*) = 1 (in that *u* ∈ *SU*(2)) enforces *z* = ±1. □
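The content of Proposition 5.5 can be probed numerically. The sketch below (plain Python; function names, the sample axis and the sample angle are ours) builds *u* = cos(½θ) + *i* sin(½θ) **u** · σ, extracts the 3 × 3 matrix *R* for which (5.44) holds, with columns given by the images *u*σ<sub>*j*</sub>*u*<sup>∗</sup> (our indexing choice, which may differ from other conventions by a transposition), and prepares the data needed to check that *R* is orthogonal with determinant 1, fixes the axis, and is unchanged under *u* ↦ −*u*.

```python
import math

SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def dagger(a):
    return [[complex(a[c][r]).conjugate() for c in range(2)] for r in range(2)]

def su2(theta, axis):
    """u = cos(theta/2) 1 + i sin(theta/2) (axis . sigma) for a unit vector axis."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    n = [[sum(axis[k] * SIGMA[k][r][col] for k in range(3)) for col in range(2)]
         for r in range(2)]
    return [[(c if r == col else 0) + 1j * s * n[r][col] for col in range(2)]
            for r in range(2)]

def rotation(u):
    """Real 3x3 matrix R with u sigma_j u* = sum_i R[i][j] sigma_i, cf. (5.44)."""
    ud = dagger(u)
    images = [matmul(matmul(u, s), ud) for s in SIGMA]   # u sigma_j u*
    def coeff(i, j):
        m = matmul(SIGMA[i], images[j])
        return 0.5 * (m[0][0] + m[1][1]).real            # (1/2) Tr(sigma_i u sigma_j u*)
    return [[coeff(i, j) for j in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

axis = (0.0, 0.6, 0.8)                    # sample unit vector
u = su2(1.1, axis)                        # sample angle theta = 1.1
R = rotation(u)
R_minus = rotation([[-z for z in row] for row in u])     # image of -u
gram = [[sum(R[k][i] * R[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
fixed = [sum(R[i][j] * axis[j] for j in range(3)) for i in range(3)]
```

Here `gram` approximates the identity (orthogonality), `det3(R)` is close to 1, `fixed` reproduces `axis`, and `R_minus` agrees with `R`, in line with ker(π̃) = {1<sub>2</sub>, −1<sub>2</sub>}.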

Note that the covering (5.46) is topologically nontrivial (i.e., *SU*(2) ≠ *SO*(3) × Z<sub>2</sub>), since *SU*(2) ≅ *S*<sup>3</sup> is simply connected, whereas *SO*(3) is doubly connected: a closed path *t* ↦ *R*<sub>2π*t*</sub>(**u**), *t* ∈ [0,1], in *SO*(3) (starting and ending at 1<sub>3</sub>) lifts to a path

$$t \mapsto \cos(\pi t) + i \sin(\pi t) \mathbf{u} \cdot \boldsymbol{\sigma}$$

in *SU*(2) that starts at the unit matrix 1<sub>2</sub> and ends at −1<sub>2</sub>.

To incorporate *O*<sub>−</sub>(3), let *U*<sub>*a*</sub>(2) be the set of all anti-unitary 2 × 2 matrices. These do not form a group, as the product of two anti-unitaries is unitary, but the union *U*(2) ∪ *U*<sub>*a*</sub>(2) is a disconnected Lie group with identity component *U*(2).

Proposition 5.6. *The map u* → *R defined by* (5.44) *is a surjective homomorphism*

$$\tilde{\pi}': U(2) \cup U\_a(2) \to O(3), \tag{5.48}$$

*with kernel U*(1)*, seen as the diagonal matrices z* · 1<sub>2</sub>*, z* ∈ T*. Moreover,* π̃′ *maps U*(2) *onto SO*(3) *and maps U*<sub>*a*</sub>(2) *onto O*<sub>−</sub>(3)*.*

*Proof.* The map *u* ↦ *R* in (5.44) sends the anti-unitary operator *u* = *J* on C<sup>2</sup> to *R* = diag(1,−1,1) ∈ *O*<sub>−</sub>(3). Since *U*<sub>*a*</sub>(2) = *J* · *U*(2) and similarly *O*<sub>−</sub>(3) = *R* · *SO*(3), the last claim follows. The computation of the kernel may now be restricted to *U*(2), and then follows as in the last step of the proof of the previous proposition. □

We now return to Theorem 5.4 and go through its special cases one by one.

Part 1 of Theorem 5.4 is *Wigner's Theorem*, which in the case at hand reads:

Theorem 5.7. *Each bijection* W : P<sub>1</sub>(C<sup>2</sup>) → P<sub>1</sub>(C<sup>2</sup>) *that satisfies*

$$\operatorname{Tr}\left(\mathsf{W}(e)\mathsf{W}(f)\right) = \operatorname{Tr}\left(ef\right) \tag{5.49}$$

*for each e*, *f* ∈ P<sub>1</sub>(C<sup>2</sup>) *takes the form* W(*e*) = *ueu*<sup>∗</sup>*, where u is either unitary or anti-unitary, and is uniquely determined by* W *up to a phase.*

To prove this, we transfer the whole situation to the two-sphere, where it is easy:

Proposition 5.8. *The pure state space* P<sub>1</sub>(C<sup>2</sup>) *corresponds bijectively to the sphere*

$$S^2 = \{(x, y, z) \in \mathbb{R}^3 \mid x^2 + y^2 + z^2 = 1\},$$

*in that each one-dimensional projection e* ∈ P<sub>1</sub>(C<sup>2</sup>) *may be expressed uniquely as*

$$e(x, y, z) = \frac{1}{2} \begin{pmatrix} 1+z & x - iy \\ x + iy & 1-z \end{pmatrix},\tag{5.50}$$

*where* (*x*, *y*, *z*) ∈ R<sup>3</sup> *and x*<sup>2</sup> + *y*<sup>2</sup> + *z*<sup>2</sup> = 1*. Under the ensuing bijection*

$$\mathcal{P}\_1(\mathbb{C}^2) \cong S^2,\tag{5.51}$$

*Wigner symmetries* W *of* C<sup>2</sup> *turn into orthogonal maps R* ∈ *O*(3)*, restricted to S*<sup>2</sup>*.*

*Proof.* The first claim restates Proposition 2.9. If ψ and ψ′ are unit vectors in C<sup>2</sup> with corresponding one-dimensional projections *e*<sub>ψ</sub>(*x*, *y*, *z*) and *e*<sub>ψ′</sub>(*x*′, *y*′, *z*′) then, as one easily verifies, the corresponding transition probability takes the form

$$\operatorname{Tr}(e\_{\psi}e\_{\psi'}) = \frac{1}{2}(1 + \langle \mathbf{x}, \mathbf{x}' \rangle) = \cos^2(\tfrac{1}{2}\theta(\mathbf{x}, \mathbf{x}')),\tag{5.52}$$

where θ(**x**, **x**′) is the arc (i.e., geodesic) distance between **x** and **x**′. Consequently, if W : P<sub>1</sub>(C<sup>2</sup>) → P<sub>1</sub>(C<sup>2</sup>) satisfies (5.8), then the corresponding map *R* : *S*<sup>2</sup> → *S*<sup>2</sup> (defined through the above identification P<sub>1</sub>(C<sup>2</sup>) ≅ *S*<sup>2</sup>) satisfies

$$
\langle R(\mathbf{x}), R(\mathbf{x}') \rangle = \langle \mathbf{x}, \mathbf{x}' \rangle \quad (\mathbf{x}, \mathbf{x}' \in S^2).\tag{5.53}
$$
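The "easy verification" behind (5.52) can be delegated to a few lines of code. In this sketch (plain Python; the points `p` and `q` on *S*<sup>2</sup> are arbitrary samples and the function names are ours), `projection` implements (5.50) and `transition_probability` computes Tr(*ee*′) directly from the matrices.

```python
import math

def projection(x, y, z):
    """One-dimensional projection (5.50) attached to a point (x, y, z) of S^2."""
    return [[(1 + z) / 2, (x - 1j * y) / 2], [(x + 1j * y) / 2, (1 - z) / 2]]

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)] for r in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def transition_probability(p, q):
    """Tr(e e') for the projections attached to the unit vectors p and q."""
    return trace(matmul(projection(*p), projection(*q))).real

p = (0.0, 0.6, 0.8)
q = (0.6, 0.0, 0.8)
e = projection(*p)
dot = sum(a * b for a, b in zip(p, q))     # Euclidean inner product <p, q>
angle = math.acos(dot)                     # arc distance between p and q on S^2
prob = transition_probability(p, q)
```

Both expressions on the right-hand side of (5.52), namely `(1 + dot) / 2` and `math.cos(angle / 2) ** 2`, agree with `prob` up to rounding; antipodal points give probability 0.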

Lemma 5.9. *If some bijection R* : *S*<sup>2</sup> → *S*<sup>2</sup> *satisfies* (5.53)*, then R extends (uniquely) to an orthogonal linear map (for simplicity also called) R* : R<sup>3</sup> → R<sup>3</sup>*.*

*Proof.* With (**u**<sub>1</sub>, **u**<sub>2</sub>, **u**<sub>3</sub>) the standard basis of R<sup>3</sup>, define a 3 × 3 matrix by

$$R\_{kl} = \langle \mathbf{u}\_k, R(\mathbf{u}\_l) \rangle. \tag{5.54}$$

It follows from (5.53) that *R*<sup>−1</sup>(**u**<sub>*j*</sub>)<sub>*k*</sub> = *R*<sub>*jk*</sub>, which implies ⟨*R*<sup>−1</sup>(**u**<sub>*j*</sub>), **x**⟩ = ∑<sub>*k*</sub> *R*<sub>*jk*</sub>*x*<sub>*k*</sub>, or, once again using (5.53), *R*(**x**)<sub>*j*</sub> = ∑<sub>*k*</sub> *R*<sub>*jk*</sub>*x*<sub>*k*</sub>. Hence the map **x** ↦ ∑<sub>*j*,*k*</sub> *R*<sub>*jk*</sub>*x*<sub>*k*</sub>**u**<sub>*j*</sub>, i.e., the usual linear map defined by the matrix (5.54), extends the given bijection *R*. Orthogonality of this linear map is, of course, equivalent to (5.53). □

Wigner's Theorem then follows by combining Propositions 5.6 and 5.8: given the linear map *R* just constructed, read (5.44) from right to left, where *u* exists by surjectivity of the map (5.48), and the precise lack of uniqueness of *u* as claimed in Theorem 5.4 is just a restatement of the fact that (5.48) has *U*(1) as its kernel. □

*Kadison's Theorem* is part 2 of Theorem 5.4. Explicitly, for *H* = C<sup>2</sup> we have:

Theorem 5.10. *Each affine bijection* K : D(C<sup>2</sup>) → D(C<sup>2</sup>) *is given as* K(ρ) = *u*ρ*u*<sup>∗</sup>*, where u is unitary or anti-unitary, and is uniquely determined by* K *up to a phase.*

*Proof.* We once again invoke Proposition 2.9, implying that any density matrix ρ on C<sup>2</sup> takes the form

$$\rho = \frac{1}{2} \left( 1\_2 + \sum\_{\mu=1}^{3} x\_{\mu} \sigma\_{\mu} \right), \tag{5.55}$$

with ‖**x**‖ ≤ 1. Moreover, the ensuing bijection D(C<sup>2</sup>) ≅ *B*<sup>3</sup>, ρ ↦ **x**, is clearly affine, in that convex sums *t*ρ + (1−*t*)ρ′ of density matrices correspond to convex sums *t***x** + (1−*t*)**x**′ of the corresponding vectors in R<sup>3</sup>.

Lemma 5.11. *Any affine bijection* K *of the unit ball B*<sup>3</sup> *in* R<sup>3</sup> *is given by an orthogonal linear map R* ∈ *O*(3)*.*

*Proof.* First, K must map the boundary ∂<sub>*e*</sub>*B*<sup>3</sup> = *S*<sup>2</sup> to itself (necessarily bijectively): if **x** ∈ *S*<sup>2</sup> and K(**x**) = *t***x**′ + (1−*t*)**x**″ with *t* ∈ (0,1), then **x** = *t*K<sup>−1</sup>(**x**′) + (1−*t*)K<sup>−1</sup>(**x**″), whence

$$\mathsf{K}^{-1}(\mathbf{x}') = \mathsf{K}^{-1}(\mathbf{x}''),\tag{5.56}$$

since **x** is pure, whence **x**′ = **x**″, so that also K(**x**) is pure.

Second, the basis of all further steps is the property

$$\mathbb{K}(\mathbf{0}) = \mathbf{0}.\tag{5.57}$$

This is because **0** is intrinsic to the convex structure of *B*<sup>3</sup>: it is the unique point with the property that for any **x** ∈ *S*<sup>2</sup> there exists a unique **x**′ such that ½**x** + ½**x**′ = **0**, namely **x**′ = −**x**. Thus **0** must be preserved under affine bijections. For a formal proof (by contradiction), suppose K(**0**) ≠ **0**, and define **y** = K(**0**)/‖K(**0**)‖ ∈ *S*<sup>2</sup>. Then K(**0**) has an extremal decomposition K(**0**) = *t***y** + (1−*t*)**y**′, with **y**′ = −**y** and *t* = ½(1 + ‖K(**0**)‖). Applying the affine map K<sup>−1</sup> then gives

$$\|\mathsf{K}^{-1}(\mathbf{y}')\| = \|\mathsf{K}^{-1}(\mathbf{y})\| \cdot \frac{1 + \|\mathsf{K}(\mathbf{0})\|}{1 - \|\mathsf{K}(\mathbf{0})\|}.$$

Now **y** ∈ *S*<sup>2</sup> and hence K<sup>−1</sup>(**y**) ∈ *S*<sup>2</sup> by part one of this proof (applied to K<sup>−1</sup>), so that ‖K<sup>−1</sup>(**y**)‖ = 1. But this implies ‖K<sup>−1</sup>(**y**′)‖ > 1, which is impossible because **y**′ ∈ *S*<sup>2</sup> and hence ‖K<sup>−1</sup>(**y**′)‖ = 1.

Third, for **x** ∈ *B*<sup>3</sup> and *t* ∈ [0,1] the preceding point implies that

$$\mathsf{K}(t\mathbf{x}) = \mathsf{K}(t\mathbf{x} + (1 - t)\mathbf{0}) = t\mathsf{K}(\mathbf{x}) + (1 - t)\mathsf{K}(\mathbf{0}) = t\mathsf{K}(\mathbf{x}).\tag{5.58}$$

The same then holds for **x** ∈ *B*<sup>3</sup> and all *t* ≥ 0 as long as *t***x** ∈ *B*<sup>3</sup>: for take *t* > 1, so that *t*<sup>−1</sup> ∈ (0,1), and use the previous step with *t***x** in place of **x** and *t*<sup>−1</sup> in place of *t* to compute

$$
\mathsf{K}(t\mathbf{x}) = t t^{-1}\mathsf{K}(t\mathbf{x}) = t\mathsf{K}(t^{-1}t\mathbf{x}) = t\mathsf{K}(\mathbf{x}).
$$

Also, (5.58) and affinity imply that for any **x**, **y** ∈ *B*<sup>3</sup> for which **x** + **y** ∈ *B*<sup>3</sup>, we have

$$\mathsf{K}(\mathbf{x} + \mathbf{y}) = 2\mathsf{K}(\tfrac{1}{2}\mathbf{x} + \tfrac{1}{2}\mathbf{y}) = 2 \cdot (\tfrac{1}{2}\mathsf{K}(\mathbf{x}) + \tfrac{1}{2}\mathsf{K}(\mathbf{y})) = \mathsf{K}(\mathbf{x}) + \mathsf{K}(\mathbf{y}).\tag{5.59}$$

With our earlier result (5.57), this also gives

$$
\mathbb{K}(-\mathbf{x}) = -\mathbb{K}(\mathbf{x}).\tag{5.60}
$$

For some nonzero **x** ∈ R<sup>3</sup>, take *s* ≥ ‖**x**‖ and *t* ≥ ‖**x**‖. Then by (5.58) we have

$$s\mathsf{K}(\mathbf{x}/s) = s\mathsf{K}\left(\frac{t}{s}\frac{\mathbf{x}}{t}\right) = t\mathsf{K}(\mathbf{x}/t).$$

We may therefore define a map *R* : R<sup>3</sup> → R<sup>3</sup> by

$$R(\mathbf{0}) = \mathbf{0};\tag{5.61}$$

$$R(\mathbf{x}) = s \cdot \mathsf{K}(\mathbf{x}/s) \ (\mathbf{x} \neq \mathbf{0}),\tag{5.62}$$

for any choice of *s* ≥ ‖**x**‖. For **x** ∈ *B*<sup>3</sup> we may take *s* = 1, so that *R* extends K.

To prove that *R* is linear, for **x** ∈ R<sup>3</sup> and *t* ≥ 0 pick some *s* ≥ *t*‖**x**‖ and compute

$$R(t\mathbf{x}) = s\mathbb{K}\left(\frac{t}{s}\mathbf{x}\right) = s\mathbb{K}\left(\|\mathbf{x}\|\frac{t}{s}\frac{\mathbf{x}}{\|\mathbf{x}\|}\right) = s \cdot \|\mathbf{x}\|\frac{t}{s}\mathbb{K}\left(\frac{\mathbf{x}}{\|\mathbf{x}\|}\right) = tR(\mathbf{x}).\tag{5.63}$$

For *t* < 0, we first show from (5.60) and (5.62) that

$$R(-\mathbf{x}) = -R(\mathbf{x}),\tag{5.64}$$

upon which (5.63) gives

$$R(t\mathbf{x}) = R(|t| \cdot (-\mathbf{x})) = |t|R(-\mathbf{x}) = -|t|R(\mathbf{x}) = -tR(\mathbf{x}).\tag{5.65}$$

Furthermore, for given **x**, **y** ∈ R<sup>3</sup>, pick *s* > 0 such that *s* ≥ ‖**x**‖ and *s* ≥ ‖**y**‖, so that *s*′ = 2*s* ≥ ‖**x** + **y**‖ by the triangle inequality, and use (5.59) to compute

$$\begin{split} R(\mathbf{x} + \mathbf{y}) &= s'\mathsf{K}\left(\frac{\mathbf{x} + \mathbf{y}}{s'}\right) = s'\mathsf{K}\left(\frac{\mathbf{x}}{s'} + \frac{\mathbf{y}}{s'}\right) = s'\mathsf{K}(\mathbf{x}/s') + s'\mathsf{K}(\mathbf{y}/s') \\ &= R(\mathbf{x}) + R(\mathbf{y}). \end{split} \tag{5.66}$$

Finally, *R* is an isometry by (5.62) and step one of the proof. Being also linear and invertible, *R* must therefore be an orthogonal transformation. □

Given step one, an alternative proof derives this lemma from Proposition 5.18 below, which shows that the transition probabilities (5.52) on *S*<sup>2</sup> are determined by the convex structure of *B*<sup>3</sup>, so that affine bijections must preserve them. In other words, the boundary map *S*<sup>2</sup> → *S*<sup>2</sup> defined by K preserves transition probabilities and hence satisfies the conditions of Lemma 5.9. This reasoning effectively reduces Kadison's Theorem to Wigner's Theorem, a move we will later examine in general.

In any case, Theorem 5.10 now follows from Lemma 5.11 in exactly the same way as Theorem 5.7 followed from the corresponding Lemma 5.9. □

We have given this proof in some detail, because step 3 will recur on other occasions where a given affine bijection is to be extended to some linear map.
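Step 3 can be sketched as follows. The map `K` below is only a stand-in: a sample affine bijection of the unit ball (a rotation about the third axis followed by a reflection, our choice), deliberately refusing arguments outside *B*<sup>3</sup> so that the extension really has to go through (5.62). The function `extend` then implements *R*(**x**) = *s* · K(**x**/*s*) for an arbitrary admissible scale *s*.

```python
import math

def norm(x):
    return math.sqrt(sum(c * c for c in x))

def K(x):
    """Sample affine bijection of the unit ball B^3 (illustrative assumption)."""
    if norm(x) > 1 + 1e-12:
        raise ValueError("K is only defined on the unit ball")
    c, s = math.cos(0.3), math.sin(0.3)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1], -x[2])

def extend(K, x, s=None):
    """R(x) = s K(x / s) for any s >= |x|, with R(0) = 0, cf. (5.61)-(5.62)."""
    r = norm(x)
    if r == 0:
        return (0.0, 0.0, 0.0)
    s = r if s is None else s
    if s < r:
        raise ValueError("need s >= |x|")
    return tuple(s * c for c in K(tuple(xi / s for xi in x)))

x = (3.0, -4.0, 12.0)      # a vector well outside the unit ball
y = (0.5, 2.0, -1.0)
Rx_default = extend(K, x)              # uses s = |x|
Rx_rescaled = extend(K, x, s=50.0)     # any larger s gives the same value
```

That `Rx_default` and `Rx_rescaled` agree up to rounding reflects the independence of *s* established just before (5.61); additivity and homogeneity of `extend(K, .)` can be tested in the same way.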

*Ludwig's Theorem* is part 3 of Theorem 5.4. For *H* = C2, we have:

Theorem 5.12. *Each affine order isomorphism* <sup>L</sup> : <sup>E</sup> (C2) <sup>→</sup> <sup>E</sup> (C2) *reads* <sup>L</sup>(*a*) = *uau*∗*, where u is unitary or anti-unitary, and is uniquely fixed by* L *up to a phase.*

*Proof.* Using the parametrization (5.41), we have *a*(*x*<sub>0</sub>, *x*<sub>1</sub>, *x*<sub>2</sub>, *x*<sub>3</sub>) ∈ E(C<sup>2</sup>) iff each *x*<sub>μ</sub> is real and 0 ≤ *x*<sub>0</sub> ± ‖**x**‖ ≤ 2. In particular, we have 0 ≤ *x*<sub>0</sub> ≤ 2. This easily follows from (2.38), noting that *a* ∈ E(C<sup>2</sup>) just means that *a*<sup>∗</sup> = *a* and that both eigenvalues of *a* lie in [0,1]. Thus E(C<sup>2</sup>) is isomorphic as a convex set to a convex subset *C* of R<sup>4</sup> that is fibered over the *x*<sub>0</sub>-interval [0,2], where the fiber *C*<sub>*x*<sub>0</sub></sub> of *C* over *x*<sub>0</sub> is the three-ball *B*<sup>3</sup><sub>*x*<sub>0</sub></sub> with radius ‖**x**‖ = *x*<sub>0</sub> as long as 0 ≤ *x*<sub>0</sub> ≤ 1, whereas for 1 ≤ *x*<sub>0</sub> ≤ 2 the fiber is *B*<sup>3</sup><sub>2−*x*<sub>0</sub></sub>, so at *x*<sub>0</sub> = 1 the fiber is *C*<sub>1</sub> = *B*<sup>3</sup> ≡ *B*<sup>3</sup><sub>1</sub> (in one dimension less, this convex body is easily visualizable as a double cone in R<sup>3</sup>, where the fibers are disks). The partial order on *C* induced from the one on E(C<sup>2</sup>) is given by

$$(x\_{0}, \mathbf{x}) \le (x\_{0}^{\prime}, \mathbf{x}^{\prime}) \text{ iff } x\_{0}^{\prime} - x\_{0} \ge \|\mathbf{x}^{\prime} - \mathbf{x}\|,\tag{5.67}$$

which follows from (5.41) and (2.38), noting that for matrices one has *a* ≤ *a*′ iff *a*′ − *a* has positive eigenvalues. A similar argument to the one proving (5.57) then shows that any affine bijection L of *C* must map the base space [0,2] to itself (as an affine bijection), and hence either *x*<sub>0</sub> ↦ *x*<sub>0</sub> or *x*<sub>0</sub> ↦ 2 − *x*<sub>0</sub>. The latter fails to preserve order, so L must fix *x*<sub>0</sub>. Similarly, L maps each three-ball *C*<sub>*x*<sub>0</sub></sub> to itself by an affine bijection, which, by the same proof as for Kadison's Theorem above, must be induced by some element *R*<sub>*x*<sub>0</sub></sub> of *O*(3). Finally, the order-preserving condition

$$x\_{0}^{\prime} - x\_{0} \ge \|\mathbf{x}^{\prime} - \mathbf{x}\| \;\Rightarrow\; x\_{0}^{\prime} - x\_{0} \ge \|R\_{x\_{0}^{\prime}}\mathbf{x}^{\prime} - R\_{x\_{0}}\mathbf{x}\|$$

obtained from (5.67) and the property L(*x*<sub>0</sub>) = *x*<sub>0</sub> just found can only be met if *R*<sub>*x*<sub>0</sub></sub> is independent of *x*<sub>0</sub>. □
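The double-cone description of E(C<sup>2</sup>) and the order (5.67) can be checked numerically. The following is a minimal sketch, assuming that the parametrization (5.41), which is not reproduced in this section, reads *a* = ½(*x*<sub>0</sub>·1<sub>2</sub> + ∑<sub>*j*</sub> *x*<sub>*j*</sub>σ<sub>*j*</sub>); all function names are hypothetical.

```python
import math
import random


def effect_matrix(x0, x1, x2, x3):
    """Assumed form of (5.41): a = (x0*1 + x1*s1 + x2*s2 + x3*s3) / 2."""
    return [[(x0 + x3) / 2, complex(x1, -x2) / 2],
            [complex(x1, x2) / 2, (x0 - x3) / 2]]


def eigenvalues(a):
    """Both eigenvalues of a Hermitian 2x2 matrix, via trace and determinant."""
    tr = (a[0][0] + a[1][1]).real
    det = (a[0][0] * a[1][1] - a[0][1] * a[1][0]).real
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return tr / 2 - disc, tr / 2 + disc


def is_effect(x, tol=1e-12):
    """Direct test: both eigenvalues of a(x) lie in [0, 1]."""
    low, high = eigenvalues(effect_matrix(*x))
    return low >= -tol and high <= 1 + tol


def in_double_cone(x, tol=1e-12):
    """Criterion from the proof: 0 <= x0 - |x| and x0 + |x| <= 2."""
    r = math.sqrt(x[1] ** 2 + x[2] ** 2 + x[3] ** 2)
    return x[0] - r >= -tol and x[0] + r <= 2 + tol


def leq_matrix(x, y, tol=1e-12):
    """a(x) <= a(y) as operators: a(y) - a(x) has no negative eigenvalue."""
    diff = effect_matrix(*(q - p for p, q in zip(x, y)))
    return eigenvalues(diff)[0] >= -tol


def leq_cone(x, y, tol=1e-12):
    """Order (5.67): y0 - x0 >= |y_vec - x_vec|."""
    r = math.sqrt(sum((q - p) ** 2 for p, q in zip(x[1:], y[1:])))
    return y[0] - x[0] - r >= -tol


def random_point(rng):
    """A random point of R^4 in a box around the double cone."""
    return (rng.uniform(-0.5, 2.5),) + tuple(rng.uniform(-1.5, 1.5) for _ in range(3))
```

On random samples the two membership tests and the two order relations should agree, which is what the proof asserts under the stated assumption on (5.41).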

Part 3 of Theorem 5.4 does not carry an official name; it may be attributed to Kadison, too, but the hard part of the proof was given earlier by Jacobson and Rickart. Rather than a contrived (though historically justified) name like "Jacobson–Rickart– Kadison Theorem", we will simply speak of *Jordan's Theorem* (for *H* = C2):

Theorem 5.13. *Each linear bijection* J : *M*<sub>2</sub>(C)<sub>sa</sub> → *M*<sub>2</sub>(C)<sub>sa</sub> *that satisfies* (5.13) *and hence* (5.12) *takes the form* J(*a*) = *uau*<sup>∗</sup>*, where u is either unitary or anti-unitary, and is uniquely determined by* J *up to a phase.*

*Proof.* First, any Jordan map (and hence *a fortiori* any Jordan automorphism) trivially maps projections into projections, as it preserves the defining conditions *e*<sup>2</sup> = *e*<sup>∗</sup> = *e*. Second, any Jordan automorphism J maps *one-dimensional* projections into *one-dimensional* projections: if *e* ∈ P<sub>1</sub>(*H*), then J(*e*) ≠ 0 and J(*e*) ≠ 1<sub>2</sub>, both because J is injective in combination with J(0) = 0 and J(1<sub>2</sub>) = 1<sub>2</sub>, respectively. Hence J(*e*) ∈ P<sub>1</sub>(*H*), since this is the only remaining possibility (a more sophisticated argument shows that this is even true for any Hilbert space *H*). From (5.41) and subsequent text, as in (5.44), by linearity of J we therefore have

$$\mathbb{J}\left(\sum\_{j=1}^{3} x\_j \sigma\_j\right) = \sum\_{j=1}^{3} (\mathbf{R} \mathbf{x})\_j \sigma\_j,\tag{5.68}$$

for some map *R* : *S*<sup>2</sup> → *S*<sup>2</sup>, which is bijective because J is. Linearity of J then allows us to extend *R* to a linear map R<sup>3</sup> → R<sup>3</sup>, with matrix

$$R\_{jk} = \frac{1}{2} \text{Tr} \left( \sigma\_j \mathsf{J} (\sigma\_k) \right), \tag{5.69}$$

cf. (5.45). By (5.69), this linear map restricts to the given bijection *R* : *S*<sup>2</sup> → *S*<sup>2</sup>, which also shows that it is isometric. Thus we have a linear isometry on R<sup>3</sup>, which therefore lies in *O*(3). The proof may then be completed as in Theorem 5.7. □
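As an illustration of the passage from J to *R*, the sketch below computes the matrix with entries ½Tr(σ<sub>*j*</sub>J(σ<sub>*k*</sub>)) for two sample maps: conjugation by a unitary (here a rotation about the third axis) and matrix transposition, which on self-adjoint matrices coincides with conjugation by the anti-unitary operator of complex conjugation. The helper names are hypothetical, and the index convention is the one that makes J(σ<sub>*k*</sub>) = ∑<sub>*j*</sub> *R*<sub>*jk*</sub>σ<sub>*j*</sub> agree with (5.68).

```python
import cmath
import math

# Pauli matrices sigma_1, sigma_2, sigma_3.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def conjugation(u):
    """The map a -> u a u* for a unitary 2x2 matrix u."""
    return lambda a: matmul(matmul(u, a), dagger(u))


def transposition(a):
    """The map a -> a^T (an anti-automorphism, Jordan on self-adjoint matrices)."""
    return [[a[j][i] for j in range(2)] for i in range(2)]


def z_rotation_unitary(theta):
    """u = exp(-i * theta * sigma_3 / 2)."""
    return [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]


def rotation_matrix(J):
    """R_jk = Tr(sigma_j J(sigma_k)) / 2, so that J(sigma_k) = sum_j R_jk sigma_j."""
    def half_trace(m):
        return complex(m[0][0] + m[1][1]).real / 2
    return [[half_trace(matmul(SIGMA[j], J(SIGMA[k]))) for k in range(3)] for j in range(3)]


def is_orthogonal(R, tol=1e-12):
    """Check R^T R = 1 entrywise."""
    return all(
        abs(sum(R[k][i] * R[k][j] for k in range(3)) - (1.0 if i == j else 0.0)) <= tol
        for i in range(3) for j in range(3)
    )


def det3(R):
    return (R[0][0] * (R[1][1] * R[2][2] - R[1][2] * R[2][1])
            - R[0][1] * (R[1][0] * R[2][2] - R[1][2] * R[2][0])
            + R[0][2] * (R[1][0] * R[2][1] - R[1][1] * R[2][0]))
```

In this sketch the unitary case yields a proper rotation (determinant +1), whereas transposition yields the reflection diag(1, −1, 1) with determinant −1; both lie in *O*(3).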

The case *H* = C<sup>2</sup> was already exceptional in the context of Gleason's Theorem, and it remains so as far as weak Jordan symmetries and Bohr symmetries are concerned.

Proposition 5.14. *The poset* C(*M*<sub>2</sub>(C)) *is isomorphic to* {⊥} ∪ RP<sup>2</sup>*, where the real projective plane* RP<sup>2</sup> *is the quotient S*<sup>2</sup>/∼ *under the equivalence relation* **x** ∼ −**x***, and the only nontrivial ordering is* ⊥ ≤ *p for any p* ∈ RP<sup>2</sup>*.*

*Proof.* It is elementary that *M*<sub>2</sub>(C) has a single one-dimensional unital ∗-subalgebra, namely C·1, the multiples of the unit; this gives the singleton ⊥ in C(*M*<sub>2</sub>(C)).

Furthermore, any two-dimensional unital ∗-subalgebra *C* of *M*<sub>2</sub>(C) is generated by a one-dimensional projection *e*, in that *C* is the linear span of *e* and 1<sub>2</sub>. Hence *C* is also the linear span of (the projection) 1<sub>2</sub> − *e* and 1<sub>2</sub>. In our parametrization of all one-dimensional projections *e* on C<sup>2</sup> by *S*<sup>2</sup> (cf. Proposition 2.9), if *e* corresponds to **x**, then 1<sub>2</sub> − *e* corresponds to −**x**. This yields the remainder RP<sup>2</sup> of C(*M*<sub>2</sub>(C)).

Finally, commutative unital ∗-subalgebras *D* of *M*<sub>2</sub>(C) of dimension > 2 do not exist. For any such algebra *D* would contain some two-dimensional *C* just defined, but a simple computation (for example, in a basis where *C* consists of all diagonal matrices) shows that the only matrices that commute with all elements of *C* already lie in *C* (i.e., are diagonal). Hence no commutative extension of *C* exists. □

Bohr symmetries B for C<sup>2</sup> therefore correspond to bijections of RP<sup>2</sup>. Similarly, weak Jordan symmetries J for C<sup>2</sup> correspond to bijections of *S*<sup>2</sup> (the difference with Bohr symmetries lies in the fact that J may also map *C* = span(*e*, 1<sub>2</sub>) to itself nontrivially, i.e., by sending *e* to 1<sub>2</sub> − *e*, which for B would yield the identity map). In both cases, few of these bijections are (anti-)unitarily implemented.

#### 5.3 Equivalence between the six symmetry theorems

If dim(*H*) > 1, the first three claims of Theorem 5.4 are equivalent; if dim(*H*) > 2, all claims are. We will show this in some detail, if only because the proofs of the various equivalences relate the six symmetry concepts stated in Definition 5.1 in an instructive way. We will do this in the sequence Wigner ↔ Kadison ↔ Jordan, and subsequently Jordan ↔ Ludwig, Jordan ↔ von Neumann, and Jordan ↔ Bohr. Consequently, in principle only one part of Theorem 5.4 requires a proof. Although redundant, we will, in fact, prove both Wigner's Theorem and Jordan's (indeed, no independent proof of the other parts of Theorem 5.4 seems to be known!). The most transparent way to state the various equivalences is to note that in each case the set of symmetries of some given kind (i.e., Wigner, ...) forms a group. In all cases, the nontrivial part of the proof is the establishment of a "natural" bijection, from which the group homomorphism property is trivial (and hence will not be proved).

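Proposition 5.15. *There is an isomorphism of groups between:*

- *the group of affine bijections* K : D(*H*) → D(*H*);
- *the group of bijections* W : P<sub>1</sub>(*H*) → P<sub>1</sub>(*H*) *that satisfy* (5.8)*,*

*given by*
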
$$\mathcal{W} = \mathsf{K}\_{|\mathcal{P}\_1(H)};\tag{5.70}$$

$$\mathsf{K}\left(\sum\_{i} \lambda\_{i} e\_{\upsilon\_{i}}\right) = \sum\_{i} \lambda\_{i} \mathsf{W}(e\_{\upsilon\_{i}}),\tag{5.71}$$

*where* ρ = ∑<sub>*i*</sub> λ<sub>*i*</sub>*e*<sub>υ<sub>*i*</sub></sub> *is some (not necessarily unique) expansion of* ρ ∈ D(*H*) *in terms of a basis of eigenvectors* υ<sub>*i*</sub> *with eigenvalues* λ<sub>*i*</sub>*, where* λ<sub>*i*</sub> ≥ 0 *and* ∑<sub>*i*</sub> λ<sub>*i*</sub> = 1*. In particular,* (5.70) *and* (5.71) *are well defined.*

*Proof.* It is conceptually important to distinguish between *B*(*H*)<sub>sa</sub> as a Banach space in the usual operator norm ‖·‖, and *B*<sub>1</sub>(*H*)<sub>sa</sub>, the Banach space of trace-class operators in its intrinsic norm ‖·‖<sub>1</sub>. Of course, if dim(*H*) < ∞, then *B*(*H*)<sub>sa</sub> = *B*<sub>1</sub>(*H*)<sub>sa</sub> as vector spaces, but even in that case the two norms do not coincide (although they are equivalent). The proof below has the additional advantage of immediately generalizing to the infinite-dimensional case. We start with (5.70).

1. a. Put K<sub>1</sub>(0) = 0 and for *b* ≥ 0, *b* ∈ *B*<sub>1</sub>(*H*), i.e. *b* ∈ *B*<sub>1</sub>(*H*)<sub>+</sub>, and *b* ≠ 0, define

$$\mathsf{K}\_{1}(b) = \|b\|\_{1}\, \mathsf{K}(b/\|b\|\_{1}).\tag{5.72}$$

By construction, K<sub>1</sub> is isometric and preserves positivity. For *b* ∈ *B*<sub>1</sub>(*H*)<sub>+</sub> we have Tr(*b*) = ‖*b*‖<sub>1</sub>, hence *b*/‖*b*‖<sub>1</sub> ∈ D(*H*), on which K is defined.

Linearity of K<sub>1</sub> with positive coefficients (as a consequence of the affine property of K) is verified as in the proof of Lemma 5.11; this time, use

$$a + b = \left( \|a\|\_1 + \|b\|\_1 \right) \cdot \left( t \frac{a}{\|a\|\_1} + (1 - t) \frac{b}{\|b\|\_1} \right),\tag{5.73}$$

with *t* = ‖*a*‖<sub>1</sub>/(‖*a*‖<sub>1</sub> + ‖*b*‖<sub>1</sub>). Note that if *a*, *b* ∈ *B*<sub>1</sub>(*H*)<sub>+</sub>, then *a* + *b* ∈ *B*<sub>1</sub>(*H*)<sub>+</sub>.

   b. For *b* ∈ *B*<sub>1</sub>(*H*)<sub>sa</sub>, decompose *b* = *b*<sub>+</sub> − *b*<sub>−</sub>, where *b*<sub>±</sub> ≥ 0; see Proposition A.24 (this remains valid in general Hilbert spaces). We then define

$$
\mathsf{K}\_{\mathsf{I}}(\mathsf{b}) = \mathsf{K}\_{\mathsf{I}}(\mathsf{b}\_{+}) - \mathsf{K}\_{\mathsf{I}}(\mathsf{b}\_{-}).\tag{5.74}
$$

To show that this makes K<sub>1</sub> linear on all of *B*<sub>1</sub>(*H*)<sub>sa</sub>, suppose also *b* = *b*′<sub>+</sub> − *b*′<sub>−</sub> with *b*′<sub>±</sub> ≥ 0. Then *b*′<sub>+</sub> + *b*<sub>−</sub> = *b*<sub>+</sub> + *b*′<sub>−</sub>, and since each term is positive,

$$
\mathsf{K}\_1(b'\_+ + b\_-) = \mathsf{K}\_1(b'\_+) + \mathsf{K}\_1(b\_-) = \mathsf{K}\_1(b\_+ + b'\_-) = \mathsf{K}\_1(b\_+) + \mathsf{K}\_1(b'\_-),
$$

by the previous step. Hence K<sub>1</sub>(*b*′<sub>+</sub>) − K<sub>1</sub>(*b*′<sub>−</sub>) = K<sub>1</sub>(*b*<sub>+</sub>) − K<sub>1</sub>(*b*<sub>−</sub>), so that (5.74) is actually independent of the choice of the decomposition of *b* as long as the operators are positive. Hence for *a*, *b* ∈ *B*<sub>1</sub>(*H*)<sub>sa</sub> we may compute

$$\begin{aligned} \mathsf{K}\_{1}(a+b) &= \mathsf{K}\_{1}(a\_{+}+b\_{+}-(a\_{-}+b\_{-})) = \mathsf{K}\_{1}(a\_{+}+b\_{+}) - \mathsf{K}\_{1}(a\_{-}+b\_{-}) \\ &= \mathsf{K}\_{1}(a\_{+}) + \mathsf{K}\_{1}(b\_{+}) - \mathsf{K}\_{1}(a\_{-}) - \mathsf{K}\_{1}(b\_{-}) = \mathsf{K}\_{1}(a) + \mathsf{K}\_{1}(b), \end{aligned}$$

since *a*<sub>+</sub> + *b*<sub>+</sub> and *a*<sub>−</sub> + *b*<sub>−</sub> are both positive.

2. The key point in verifying isometry of K<sub>1</sub> is the property |*b*| = *b*<sub>+</sub> + *b*<sub>−</sub>, which follows from (A.76) or Theorem B.94. Using this property, we have

$$\begin{aligned} \|\mathsf{K}\_{1}(b)\|\_{1} &= \mathrm{Tr}\left(|\mathsf{K}\_{1}b|\right) = \mathrm{Tr}\left(|\mathsf{K}\_{1}(b\_{+}) - \mathsf{K}\_{1}(b\_{-})|\right) = \mathrm{Tr}\left(\mathsf{K}\_{1}(b\_{+}) + \mathsf{K}\_{1}(b\_{-})\right) \\ &= \mathrm{Tr}\left(b\_{+} + b\_{-}\right) = \mathrm{Tr}\left(|b\_{+} - b\_{-}|\right) = \mathrm{Tr}\left(|b|\right) = \|b\|\_{1} .\end{aligned}$$

3. For any two unit vectors ψ,ϕ in *H* we have the formula

$$\|e\_{\psi} - e\_{\phi}\|\_{1} = 2\sqrt{1 - \text{Tr}\left(e\_{\psi}e\_{\phi}\right)},\tag{5.75}$$

which can easily be proved by a calculation with 2×2 matrices (since everything takes place in the two-dimensional subspace spanned by ψ and ϕ, except when ϕ = *z*ψ, *z* ∈ T, in which case (5.75) reads 0 = 0 and hence is true also). Since K<sub>1</sub> is linear as well as isometric with respect to the trace-norm, we have

$$\|\mathsf{K}\_{1}(e\_{\psi}) - \mathsf{K}\_{1}(e\_{\phi})\|\_{1} = \|\mathsf{K}\_{1}(e\_{\psi} - e\_{\phi})\|\_{1} = \|e\_{\psi} - e\_{\phi}\|\_{1},$$

and hence, by (5.75), Tr(K1(*e*ψ)K1(*e*ϕ)) = Tr(*e*ψ*e*ϕ). Eq. (5.70) then gives (5.8).
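Formula (5.75) is easy to test numerically in C<sup>2</sup>. The sketch below is only an illustration with hypothetical helper names: it computes the trace norm of the self-adjoint matrix *e*<sub>ψ</sub> − *e*<sub>ϕ</sub> as the sum of the absolute values of its two eigenvalues and compares it with the right-hand side of (5.75).

```python
import math


def projection(psi):
    """Rank-one projection e_psi = |psi><psi| for a unit vector psi in C^2."""
    return [[psi[i] * psi[j].conjugate() for j in range(2)] for i in range(2)]


def trace_norm_hermitian(a):
    """Trace norm of a Hermitian 2x2 matrix: sum of absolute eigenvalues."""
    tr = complex(a[0][0] + a[1][1]).real
    det = complex(a[0][0] * a[1][1] - a[0][1] * a[1][0]).real
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return abs(tr / 2 + disc) + abs(tr / 2 - disc)


def transition_probability(psi, phi):
    """Tr(e_psi e_phi) = |<psi, phi>|^2 for unit vectors."""
    inner = sum(p.conjugate() * q for p, q in zip(psi, phi))
    return abs(inner) ** 2


def both_sides(psi, phi):
    """Return (left-hand side, right-hand side) of (5.75)."""
    e, f = projection(psi), projection(phi)
    diff = [[e[i][j] - f[i][j] for j in range(2)] for i in range(2)]
    lhs = trace_norm_hermitian(diff)
    rhs = 2 * math.sqrt(max(1 - transition_probability(psi, phi), 0.0))
    return lhs, rhs
```

For orthogonal unit vectors both sides equal 2, and for vectors differing by a phase both sides vanish, in line with the remark that (5.75) then reads 0 = 0.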

We move on to (5.71). The main concern is that this expression be well defined, since in case some eigenvalue λ > 0 of ρ is degenerate (necessarily with finite multiplicity, even in infinite dimension, since ρ is compact), the basis of the eigenspace *H*<sub>λ</sub> that takes part in the sum ∑<sub>*i*</sub> λ<sub>*i*</sub>*e*<sub>υ<sub>*i*</sub></sub> is far from unique. This is settled as follows:

Lemma 5.16. *Let* W : P<sub>1</sub>(*H*) → P<sub>1</sub>(*H*) *be a bijection that satisfies* (5.8)*, let L* ⊂ *H be a (finite-dimensional) subspace, and let* (υ<sub>*j*</sub>) *and* (υ′<sub>*i*</sub>) *be bases of L. Then*

$$\sum\_{j} \mathsf{W}(e\_{\mathsf{v}\_{j}}) = \sum\_{i} \mathsf{W}(e\_{\mathsf{v}\_{i}'}).\tag{5.76}$$

*Proof.* As usual, for projections *e* and *f* on *H* we write *e* ≤ *f* iff *eH* ⊆ *fH*. From (B.212) and (B.214) we have ∑<sub>*j*</sub> |⟨υ<sub>*j*</sub>, ψ⟩|<sup>2</sup> ≤ 1 for any unit vector ψ ∈ *H*, with equality iff ψ ∈ *L*. In other words, *e*<sub>ψ</sub> ≤ *e*<sub>*L*</sub> iff ∑<sub>*j*</sub> Tr(*e*<sub>υ<sub>*j*</sub></sub>*e*<sub>ψ</sub>) = 1. Furthermore, by (5.8) the images W(*e*<sub>υ<sub>*j*</sub></sub>) remain orthogonal; hence ∑<sub>*j*</sub> W(*e*<sub>υ<sub>*j*</sub></sub>) is a projection, and *e* ≤ ∑<sub>*j*</sub> W(*e*<sub>υ<sub>*j*</sub></sub>) iff ∑<sub>*j*</sub> Tr(W(*e*<sub>υ<sub>*j*</sub></sub>)*e*) = 1. By (5.8), this condition is satisfied for *e* = W(*e*<sub>υ′<sub>*i*</sub></sub>), so that W(*e*<sub>υ′<sub>*i*</sub></sub>) ≤ ∑<sub>*j*</sub> W(*e*<sub>υ<sub>*j*</sub></sub>) for each *i*. Since also the projections W(*e*<sub>υ′<sub>*i*</sub></sub>) are orthogonal, this gives ∑<sub>*i*</sub> W(*e*<sub>υ′<sub>*i*</sub></sub>) ≤ ∑<sub>*j*</sub> W(*e*<sub>υ<sub>*j*</sub></sub>). Interchanging the roles of the two bases gives the converse, yielding (5.76). □

Finally, to prove bijectivity of the correspondence K ↔ W, we need the property

$$\mathsf{K}\left(\sum\_{i} \lambda\_{i} e\_{\upsilon\_{i}}\right) = \sum\_{i} \lambda\_{i} \mathsf{K}(e\_{\upsilon\_{i}}),\tag{5.77}$$

since this implies that K is determined by its action on P<sub>1</sub>(*H*) ⊂ D(*H*). In finite dimension this follows from the affine property of K, and we are done. In infinite dimension, we in addition need continuity of K, as well as convergence of the sum ∑<sub>*i*</sub> λ<sub>*i*</sub>*e*<sub>υ<sub>*i*</sub></sub> not only in the operator norm (as follows from the spectral theorem for self-adjoint compact operators), but also in the trace norm: for finite *n*, *m*,

$$\|\sum\_{i=n}^{m} \lambda\_i e\_{\upsilon\_i} \|\_1 \le \sum\_{i=n}^{m} |\lambda\_i| \|e\_{\upsilon\_i} \|\_1 = \sum\_{i=n}^{m} \lambda\_i,$$

since ‖*e*<sub>υ<sub>*i*</sub></sub>‖<sub>1</sub> = 1. Because ∑<sub>*i*</sub> λ<sub>*i*</sub> = 1, the above expression vanishes as *n*, *m* → ∞, whence ρ<sub>*n*</sub> = ∑<sub>*i*=1</sub><sup>*n*</sup> λ<sub>*i*</sub>*e*<sub>υ<sub>*i*</sub></sub> is a Cauchy sequence in *B*<sub>1</sub>(*H*), which by completeness of the latter converges (to an element of D(*H*), as one easily verifies).

The proof is completed by noting that K is continuous with respect to the trace norm, for it is isometric and hence bounded (see step 2 above). □

It is enlightening to give a rather more conceptual proof that K|<sub>P<sub>1</sub>(*H*)</sub> satisfies (5.8), which is based on a result to be used more often in the future. In what follows, for any convex set *C*, the notation *A*<sub>*b*</sub>(*C*) stands for the real vector space of *bounded* affine functions *f* : *C* → R, that is, bounded functions satisfying

$$f(tx + (1-t)y) = t f(x) + (1-t)f(y), \; x, y \in C, t \in (0,1). \tag{5.78}$$

It is easily checked that *A*<sub>*b*</sub>(*C*) with the supremum-norm is a real Banach space.

Proposition 5.17. *For any Hilbert space H we have an isometric isomorphism*

$$A\_b(\mathcal{D}(H)) \cong B(H)\_{\text{sa}},\tag{5.79}$$

$$f \leftrightarrow a;\tag{5.80}$$

$$f(\rho) = \text{Tr}(\rho a),\tag{5.81}$$

*which preserves the unit (i.e.,* 1<sub>D(*H*)</sub> ↔ 1<sub>*H*</sub>*) as well as the order (i.e., f* ≥ 0 *iff a* ≥ 0*).*

Note that under the identification D(*H*) ≅ *S*<sub>*n*</sub>(*B*(*H*)) (where in finite dimension the normal state space *S*<sub>*n*</sub>(*B*(*H*)) simply coincides with the state space *S*(*B*(*H*))), where ρ ↔ ω as in (2.33), i.e., ω(*a*) = Tr(ρ*a*), the above isomorphism simply reads

$$A\_b(S\_n(B(H))) \cong B(H)\_{\text{sa}},\tag{5.82}$$

$$
\hat{a} \leftrightarrow a;\tag{5.83}
$$

$$
\hat{a}(\omega) = \omega(a). \tag{5.84}
$$

*Proof.* It is clear that for each *a* ∈ *B*(*H*)<sub>sa</sub> the function *f* : ρ ↦ Tr(ρ*a*) (or, equivalently, *â* : ω ↦ ω(*a*)) is affine as well as real-valued, and is bounded by (A.100) (supplemented, if dim(*H*) = ∞, by Lemma B.142), noting that ‖ρ‖<sub>1</sub> = 1 for ρ ∈ D(*H*), and in fact (B.483) yields the equality ‖*f*‖<sub>∞</sub> = ‖*a*‖ (or ‖*â*‖<sub>∞</sub> = ‖*a*‖).

Conversely, *f* ∈ *Ab*(D(*H*)) defines a function *Q* : *H* → R by

$$\mathcal{Q}(0) = 0;\tag{5.85}$$

$$\mathcal{Q}(\Psi) = ||\Psi||^2 f(e\_{\Psi/\|\Psi\|}) \ (\Psi \neq 0). \tag{5.86}$$

This function is clearly bounded on the unit ball of *H*, as in

$$|\mathcal{Q}(\Psi)| \le \|f\|\_{\infty} \|\Psi\|^2. \tag{5.87}$$

To check that *Q* in fact defines a quadratic form on *H*, we verify the properties (A.8) - (A.9). The first is trivial. The second follows from the easily verified identity

$$
t e\_{\frac{v+w}{\|v+w\|}} + (1-t)e\_{\frac{v-w}{\|v-w\|}} = s e\_{\frac{v}{\|v\|}} + (1-s)e\_{\frac{w}{\|w\|}},\tag{5.88}
$$

where *v*, *w* ≠ 0, *v* ≠ *w*, and the coefficients *s*, *t* are given by

$$t = \frac{\|v + w\|^2}{2(\|v\|^2 + \|w\|^2)};\tag{5.89}$$

$$s = \frac{\|v\|^2}{\|v\|^2 + \|w\|^2}. \tag{5.90}$$

The affine property (5.78) then immediately yields (A.9). According to Proposition B.79, we obtain a unique operator *a* ∈ *B*(*H*)<sub>sa</sub> such that *Q*(ψ) = ⟨ψ, *a*ψ⟩, i.e.,

$$
\langle \psi, a\psi \rangle = f(e\_{\psi}), \quad \psi \in H,\ \|\psi\| = 1.\tag{5.91}
$$

Since also ⟨ψ, *a*ψ⟩ = Tr(*e*<sub>ψ</sub>*a*), we have established (5.81) for each ρ = *e*<sub>ψ</sub>, where ψ ∈ *H*, ‖ψ‖ = 1. To extend this result to general density operators ρ = ∑<sub>*i*</sub> λ<sub>*i*</sub>*e*<sub>υ<sub>*i*</sub></sub>, we use (A.100) as well as convergence of the above sum in the trace norm ‖·‖<sub>1</sub>, cf. the proof of Lemma 5.16; the details are analogous to the proof of Theorem B.146. □
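The identity (5.88), which carries the weight of the argument that *Q* is a quadratic form, follows from the parallelogram law and can be checked numerically. A minimal sketch with hypothetical helper names, valid in any finite dimension and assuming *v* ≠ ±*w* so that both normalizations on the left-hand side are defined:

```python
def norm_sq(v):
    """Squared Hilbert-space norm of a vector given as a list of complex numbers."""
    return sum(abs(c) ** 2 for c in v)


def unit_projection(v):
    """Projection onto the line through v, i.e. e_{v/|v|} = |v><v| / |v|^2."""
    n = norm_sq(v)
    size = len(v)
    return [[v[i] * complex(v[j]).conjugate() / n for j in range(size)] for i in range(size)]


def convex_mix(t, e, f):
    """The convex combination t*e + (1-t)*f of two matrices."""
    return [[t * x + (1 - t) * y for x, y in zip(row_e, row_f)] for row_e, row_f in zip(e, f)]


def sides_of_5_88(v, w):
    """Return (left-hand side, right-hand side) of (5.88) for v != +-w, v, w != 0."""
    plus = [a + b for a, b in zip(v, w)]
    minus = [a - b for a, b in zip(v, w)]
    total = norm_sq(v) + norm_sq(w)
    t = norm_sq(plus) / (2 * total)   # coefficient (5.89)
    s = norm_sq(v) / total            # coefficient (5.90)
    lhs = convex_mix(t, unit_projection(plus), unit_projection(minus))
    rhs = convex_mix(s, unit_projection(v), unit_projection(w))
    return lhs, rhs


def max_abs_diff(a, b):
    """Largest entrywise deviation between two matrices."""
    return max(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
```

Applying an affine function *f* to both sides and multiplying by 2(‖*v*‖<sup>2</sup> + ‖*w*‖<sup>2</sup>) gives *Q*(*v* + *w*) + *Q*(*v* − *w*) = 2*Q*(*v*) + 2*Q*(*w*), which is how (5.88) enters the proof.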

Proposition 5.18. *For any unit vectors* ψ,ϕ ∈ *H we have*

$$\operatorname{Tr}(e\_{\psi}e\_{\phi}) = \inf \{ f(e\_{\psi}) \mid f \in A\_b(\mathcal{D}(H)), 0 \le f \le 1, f(e\_{\phi}) = 1 \}. \tag{5.92}$$

The virtue of this formula is that the expression on the left-hand side, which defines the transition probabilities on ∂<sub>*e*</sub>D(*H*) = P<sub>1</sub>(*H*), is intrinsically given by the convex structure of D(*H*). Consequently, any affine bijection of this convex set (which already preserves the boundary) must preserve these probabilities.

*Proof.* By the previous proposition, eq. (5.92) is equivalent to

$$\operatorname{Tr}(e\_{\Psi}e\_{\Phi}) = \inf \{ \langle \Psi, a\Psi \rangle \mid a \in B(H)\_{\mathrm{sa}}, 0 \le a \le 1, \langle \Phi, a\Phi \rangle = 1 \}. \tag{5.93}$$

Since Tr(*e*<sub>ψ</sub>*e*<sub>ϕ</sub>) = ⟨ψ, *e*<sub>ϕ</sub>ψ⟩, we are ready if we can show that the infimum is reached at *a* = *e*<sub>ϕ</sub>. Therefore, we prove that for any *a* as specified we must have ⟨ψ, *a*ψ⟩ ≥ Tr(*e*<sub>ψ</sub>*e*<sub>ϕ</sub>) = |⟨ϕ, ψ⟩|<sup>2</sup>. To do so, we are going to find a contradiction if

$$
\langle \Psi, a\Psi \rangle < \text{Tr} \left( e\_{\Psi} e\_{\Phi} \right), \tag{5.94}
$$

for some such *a*. Indeed, ⟨ϕ, *a*ϕ⟩ = 1 with ‖*a*‖ ≤ 1 (which follows from 0 ≤ *a* ≤ 1) and ‖ϕ‖ = 1 imply, by Cauchy–Schwarz, that *a*ϕ = ϕ. Since *a*<sup>∗</sup> = *a* (by positivity of *a*), we also have *a* : (C·ϕ)<sup>⊥</sup> → (C·ϕ)<sup>⊥</sup>, so we may write *a* = *e*<sub>ϕ</sub> + *a*′, with *a*′ϕ = 0 and *a*′ mapping (C·ϕ)<sup>⊥</sup> to itself. Then *a* ≥ 0 implies *a*′ ≥ 0. If (5.94) holds, then ⟨ψ, *a*′ψ⟩ < 0, which contradicts positivity of *a*′ (and hence of *a*). □
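For *H* = C<sup>2</sup> the argument can be made fully explicit, since (C·ϕ)<sup>⊥</sup> is one-dimensional: every admissible *a* in (5.93) has the form *a* = *e*<sub>ϕ</sub> + *c*·*e*<sub>ϕ⊥</sub> with 0 ≤ *c* ≤ 1. The following sketch (hypothetical names, restricted to this two-dimensional case) evaluates ⟨ψ, *a*ψ⟩ over that family and confirms that the minimum is attained at *c* = 0.

```python
def inner(x, y):
    """Inner product on C^2, conjugate-linear in the first entry."""
    return sum(a.conjugate() * b for a, b in zip(x, y))


def orthogonal_complement(phi):
    """A unit vector spanning the orthogonal complement of the unit vector phi in C^2."""
    return (-phi[1].conjugate(), phi[0].conjugate())


def transition_probability(psi, phi):
    """Tr(e_psi e_phi) = |<phi, psi>|^2."""
    return abs(inner(phi, psi)) ** 2


def expectation(psi, phi, c):
    """<psi, a psi> for a = e_phi + c * e_perp, where 0 <= c <= 1."""
    perp = orthogonal_complement(phi)
    return transition_probability(psi, phi) + c * abs(inner(perp, psi)) ** 2


def infimum_over_family(psi, phi, steps=101):
    """Minimum of <psi, a psi> over a grid of admissible values of c."""
    return min(expectation(psi, phi, k / (steps - 1)) for k in range(steps))
```

The second term in `expectation` is exactly ⟨ψ, *a*′ψ⟩ from the proof; its nonnegativity is what rules out (5.94).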

We now turn to the equivalence between Jordan's Theorem and Kadison's Theorem.

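Proposition 5.19. *There is an isomorphism of groups between:*

- *the group of affine bijections* K : D(*H*) → D(*H*);
- *the group of Jordan symmetries* J : *B*(*H*)<sub>sa</sub> → *B*(*H*)<sub>sa</sub>*,*
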
*such that for any a* ∈ *B*(*H*)sa *one has*

$$\operatorname{Tr}(\mathsf{K}(\rho)a) = \operatorname{Tr}(\rho \mathsf{J}(a)) \ (\rho \in \mathcal{D}(H)).\tag{5.95}$$

This immediately follows from the following lemma (of independent interest):

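Lemma 5.20. *1. There is a bijective correspondence between:*

- *affine bijections* K : D(*H*) → D(*H*);
- *unital positive linear bijections* α : *B*(*H*)<sub>sa</sub> → *B*(*H*)<sub>sa</sub>*,*
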
*such that for any a* ∈ *B*(*H*)sa *one has* (5.95)*.*

*2. A map* α : *B*(*H*) → *B*(*H*) *is a unital positive linear bijection iff it is a Jordan automorphism.*

*Proof.* 1. An affine bijection K : D(*H*) → D(*H*) induces an isomorphism

$$\mathsf{K}^\* : A\_b(\mathcal{D}(H)) \to A\_b(\mathcal{D}(H)); \tag{5.96}$$

$$f \mapsto f \circ \mathsf{K},\tag{5.97}$$

which is evidently unital, positive, and isometric. Consequently, by Proposition 5.17, K<sup>∗</sup> corresponds to some isomorphism α : *B*(*H*)sa → *B*(*H*)sa, which necessarily shares the properties of being unital, positive, and isometric; this follows abstractly from the proposition, but may also be verified directly from (5.95). Conversely, such a map α yields a map K directly by (5.95); to see this, we identify D(*H*) with the normal state space of *B*(*H*) through ρ ↔ ω, as usual, cf. (2.33), and note that Kω is the state defined by (Kω)(*a*) = ω(α(*a*)), or briefly Kω = ω ◦α. This is often written as K = α∗, and for future reference we write

$$
\alpha^\* \omega(a) = \omega(\alpha(a)).\tag{5.98}
$$

	- a. Unital positive linear maps on *B*(*H*)<sub>sa</sub> preserve P(*H*), cf. (2.164).
	- b. Any two projections *e* and *f* are orthogonal (*ef* = 0) iff *e* + *f* ≤ 1<sub>*H*</sub> (easy).
	- c. Any *a* ∈ *B*(*H*)<sub>sa</sub> is a norm-limit of finite sums of the kind ∑<sub>*i*</sub> λ<sub>*i*</sub>*e*<sub>*i*</sub>, where λ<sub>*i*</sub> ∈ R and the *e*<sub>*i*</sub> are mutually orthogonal projections (this follows from the spectral theorem for bounded self-adjoint operators in the form of Theorem B.104).
	- d. Any unital positive linear map α : *B*(*H*)<sub>sa</sub> → *B*(*H*)<sub>sa</sub> is continuous. Since

$$-\|a\| \cdot 1\_H \le a \le \|a\| \cdot 1\_H \ (a \in B(H)\_{\text{sa}}),\tag{5.99}$$

by (C.83), applying the positive map α and using α(1<sub>*H*</sub>) = 1<sub>*H*</sub> yields

$$-\|a\| \cdot 1\_H \le \alpha(a) \le \|a\| \cdot 1\_H.$$

This is possible only if ‖α(*a*)‖ ≤ ‖*a*‖, and hence α is continuous with norm bounded by ‖α‖ ≤ 1. In fact, since α is unital we have ‖α‖ = 1.

Therefore, any unital positive linear map α preserves orthogonality of projections, so if *a* = ∑*<sup>i</sup>* λ*iei* (finite sum), then

$$\alpha(a^2) = \alpha\left(\sum\_{i} \lambda\_i^2 e\_i\right) = \sum\_{i} \lambda\_i^2 \alpha(e\_i) = \sum\_{i,j} \lambda\_i \lambda\_j \alpha(e\_i) \alpha(e\_j) = \alpha(a)^2,\quad(5.100)$$

since *e*<sub>*i*</sub>*e*<sub>*j*</sub> = δ<sub>*ij*</sub>*e*<sub>*j*</sub> and by the above comment also α(*e*<sub>*i*</sub>)α(*e*<sub>*j*</sub>) = δ<sub>*ij*</sub>α(*e*<sub>*j*</sub>). By continuity of α, this property extends to arbitrary *a* ∈ *B*(*H*)<sub>sa</sub>. Finally, since

$$a \circ b = \frac{1}{2}((a+b)^2 - a^2 - b^2),\tag{5.101}$$

preserving squares as in (5.100) implies preserving the Jordan product ∘. □
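The last step, from squares to the Jordan product, can be illustrated on 2×2 matrices. The sketch below (hypothetical names) checks (5.101) for the product *a* ∘ *b* = ½(*ab* + *ba*) and uses matrix transposition as a sample map that preserves the Jordan product while reversing, rather than preserving, the ordinary product.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def add(a, b, sign=1):
    """Entrywise a + sign * b."""
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]


def scale(t, a):
    return [[t * a[i][j] for j in range(2)] for i in range(2)]


def jordan(a, b):
    """Jordan product a o b = (ab + ba) / 2."""
    return scale(0.5, add(matmul(a, b), matmul(b, a)))


def jordan_from_squares(a, b):
    """Right-hand side of (5.101): ((a+b)^2 - a^2 - b^2) / 2."""
    s = add(a, b)
    return scale(0.5, add(add(matmul(s, s), matmul(a, a), -1), matmul(b, b), -1))


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) <= tol for i in range(2) for j in range(2))
```

Transposition is used here only as a convenient example of a linear map that preserves squares; it is not taken from the text.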

We now turn to the equivalence between Ludwig symmetries and Jordan ones.

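Proposition 5.21. *There is an isomorphism of groups between:*

- *the group of affine order isomorphisms* L : E(*H*) → E(*H*);
- *the group of Jordan symmetries* J : *B*(*H*)<sub>sa</sub> → *B*(*H*)<sub>sa</sub>*,*

*where* L = J|<sub>E(*H*)</sub>*.*
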
*Proof.* Since L is an order isomorphism, it satisfies L(0) = 0 (as well as L(1<sub>*H*</sub>) = 1<sub>*H*</sub>), since 0 is the bottom element of E(*H*) as a poset (and 1<sub>*H*</sub> is its top element). As in the proof of Lemma 5.11, one shows that this property plus convexity implies L(*ta*) = *t*L(*a*) and L(*a* + *b*) = L(*a*) + L(*b*) whenever defined. Defining J by

$$\mathbf{J}(0) = 0;\tag{5.102}$$

$$\mathsf{J}(a) = s \cdot \mathsf{L}(a/s) \ (a > 0, s \ge \|a\|);\tag{5.103}$$

$$\mathsf{J}(a) = -\mathsf{J}(-a) \ (a < 0),\tag{5.104}$$

where *a* > 0 means *a* ≥ 0 and *a* ≠ 0, and *a* < 0 means −*a* ≥ 0 and *a* ≠ 0, once again the reasoning near the end of the proof of Lemma 5.11 shows that J is linear; it is a unital order-preserving bijection by construction. Hence J is a Jordan automorphism by Lemma 5.20.2. Of course, instead of (5.104) one could equivalently have defined J on general *a* ∈ *B*(*H*)<sub>sa</sub> by J(*a*) = J(*a*<sub>+</sub>) − J(*a*<sub>−</sub>), using the (by now hopefully familiar) decomposition *a* = *a*<sub>+</sub> − *a*<sub>−</sub> with *a*<sub>±</sub> ≥ 0 and *a*<sub>+</sub>*a*<sub>−</sub> = 0.

Conversely, once again using Lemma 5.20.2, a Jordan automorphism (5.11) preserves order as well as the unit, so that the inequality 0 ≤ *a* ≤ 1<sub>*H*</sub> characterizing *a* ∈ E(*H*) is preserved, i.e., 0 ≤ J(*a*) ≤ 1<sub>*H*</sub>. Thus J preserves E(*H*), where it preserves order. Convexity is obvious, since L = J|<sub>E(*H*)</sub> comes from a linear map. □

The equivalence between Jordan's Theorem and von Neumann's Theorem (provided dim(*H*) ≥ 3) hinges on the following corollary of Gleason's Theorem (cf. §D.1).

Corollary 5.22. *Let* dim(*H*) > 2*. Then an isomorphism* N *of* P(*H*) *as an orthocomplemented lattice has a unique extension to a linear map* α : *B*(*H*)sa → *B*(*H*)sa*, which is (automatically) invertible, unital, and positive.*

*Proof.* According to Lemma D.2, N preserves all suprema in P(*H*). Since we have ∑<sub>*i*</sub> *e*<sub>*i*</sub> = ⋁<sub>*i*</sub> *e*<sub>*i*</sub> for any family of mutually orthogonal projections and since N by definition preserves the orthocomplementation *e*<sup>⊥</sup> = 1 − *e* and hence preserves orthogonality of projections, we may compute

$$\mathbb{N}\left(\sum\_{i} e\_i\right) = \mathbb{N}\left(\bigvee\_{i} e\_i\right) = \bigvee\_{i} \mathbb{N}(e\_i) = \sum\_{i} \mathbb{N}(e\_i). \tag{5.105}$$

Consequently, for any normal state ω on *B*(*H*), the map *e* → ω◦N(*e*) is a probability measure on P(*H*), which by Gleason's Theorem has a unique linear extension to *B*(*H*) and hence *a fortiori* to *B*(*H*)sa. We use this in order to define α, as follows.

First, let *a* ∈ *B*(*H*)sa and suppose *a* = ∑*<sup>j</sup>* λ*<sup>j</sup> fj* for some *finite* family (*fj*) of projections (not necessarily orthogonal), and some λ*<sup>j</sup>* ∈ R. Then ∑*<sup>j</sup>* λ*j*N(*fj*) is independent of the particular decomposition of *a* that has been chosen, so we may put

$$\alpha(a) = \sum\_{j} \lambda\_{j} \mathsf{N}(f\_{j}).\tag{5.106}$$

To see this, put *a* = ∑<sub>*j*</sub> λ′<sub>*j*</sub>*f*′<sub>*j*</sub> and hence α′(*a*) = ∑<sub>*j*</sub> λ′<sub>*j*</sub>N(*f*′<sub>*j*</sub>), and suppose α′(*a*) ≠ α(*a*). By (B.477) there exists a normal state ω such that ω(α′(*a*)) ≠ ω(α(*a*)); indeed, each element of *B*<sub>1</sub>(*H*) is a linear combination of at most four density operators, so that each normal linear functional on *B*(*H*) is a linear combination of at most four normal states. But since ω ∘ N is linear, this implies ω ∘ N(*a*) ≠ ω ∘ N(*a*), which is a contradiction. Hence α′(*a*) = α(*a*) and accordingly, (5.106) is well defined. Because it is independent of the decomposition of *a* into projections, α is linear: if *a* = ∑<sub>*j*</sub> λ<sub>*j*</sub>*f*<sub>*j*</sub> and *a*′ = ∑<sub>*j*</sub> λ′<sub>*j*</sub>*f*′<sub>*j*</sub>, then *a* + *a*′ = ∑<sub>*j*</sub> λ<sub>*j*</sub>*f*<sub>*j*</sub> + ∑<sub>*j*</sub> λ′<sub>*j*</sub>*f*′<sub>*j*</sub>, so that

$$\alpha(a+a') = \alpha\left(\sum\_{j} \lambda\_j f\_j + \sum\_{j'} \lambda'\_{j'} f'\_{j'}\right) = \sum\_{j} \lambda\_j \mathsf{N}(f\_j) + \sum\_{j'} \lambda'\_{j'} \mathsf{N}(f'\_{j'}) = \alpha(a) + \alpha(a').$$

Similarly, for any *t* ∈ R we have

$$\alpha(ta) = \alpha\left(\sum\_{j} t\lambda\_{j} f\_{j}\right) = \sum\_{j} t\lambda\_{j}\mathsf{N}(f\_{j}) = t\sum\_{j} \lambda\_{j}\mathsf{N}(f\_{j}) = t\alpha(a).$$

We may now extend α to all of *B*(*H*)<sub>sa</sub> by continuity. Indeed, according to the spectral theorem in the form (B.326), the set of all operators of the form *a* = ∑<sub>*j*</sub> λ<sub>*j*</sub>*f*<sub>*j*</sub> with all *f*<sub>*j*</sub> mutually orthogonal (so that *a* is given by its spectral resolution) is norm-dense in *B*(*H*)<sub>sa</sub>. Applying (5.106), and noting that ‖*a*‖ = sup<sub>*j*</sub> |λ<sub>*j*</sub>|, we may estimate

$$\|\alpha(a)\| = \|\sum\_{j} \lambda\_{j} \mathsf{N}(f\_{j})\| \le \sup\_{j} \{|\lambda\_{j}| \} \|\sum\_{j} \mathsf{N}(f\_{j})\| \le \|a\|,$$

since the N(*f*<sub>*j*</sub>) are mutually orthogonal and hence sum to some projection, which has norm 1 (unless *a* = 0). For general *a* ∈ *B*(*H*)<sub>sa</sub>, we may therefore define α by α(*a*) = lim<sub>*n*</sub> α(*a*<sub>*n*</sub>), where each *a*<sub>*n*</sub> is of the above (spectral) form and ‖*a*<sub>*n*</sub> − *a*‖ → 0.

To prove that α is positive, we show that α(*a*) ≥ 0 whenever *a* ≥ 0. As in the preceding step, initially suppose that *a* = ∑<sub>*j*</sub> λ<sub>*j*</sub>*f*<sub>*j*</sub> has a finite spectral resolution. Then *a* ≥ 0 iff λ<sub>*j*</sub> ≥ 0 for each *j*, and hence α(*a*) ≥ 0 by (5.106), since by orthogonality of the N(*f*<sub>*j*</sub>) this equation states the spectral resolution of α(*a*). Now if *a*<sub>*n*</sub> ≥ 0 and *a*<sub>*n*</sub> → *a* (in norm), then ⟨ψ, *a*<sub>*n*</sub>ψ⟩ → ⟨ψ, *a*ψ⟩, which must remain positive, so that *a* ≥ 0. Hence positivity of α on all of *B*(*H*)<sub>sa</sub> follows by continuity.

Finally, α inherits invertibility from N, and it is unital by (5.105), taking *e*<sub>*i*</sub> = |υ<sub>*i*</sub>⟩⟨υ<sub>*i*</sub>| for some basis (υ<sub>*i*</sub>) of *H* (or using the fact that it preserves ⊤ = 1<sub>*H*</sub>). □

Subsequently, we use Lemma 5.20 to further extend α by complex linearity to a Jordan isomorphism of *B*(*H*); see Definition 5.1.

Finally, the equivalence between weak Jordan symmetries and Bohr symmetries follows from Hamhalter's Theorem 9.4, whereas Theorem 9.7 strengthens this to an equivalence between Jordan symmetries and Bohr symmetries. The proof of these theorems does not seem to simplify in the special case at hand, i.e. *A* = *B*(*H*).

#### 5.4 Proof of Jordan's Theorem

In view of the equivalence between the six parts of Theorem 5.4, we only need to prove one of them. In the literature, one only finds proofs of Jordan's Theorem and of Wigner's Theorem, and we present each of these (surprisingly but instructively, these proofs look completely different). We start with *Jordan's Theorem*:

Theorem 5.23. *Any Jordan automorphism* J<sub>C</sub> *of B*(*H*) *is given by either*

$$\mathsf{J}\_{\mathbb{C}}(a) = \alpha\_{u}(a) \equiv u a u^\*,\tag{5.107}$$

*where u is unitary (and is determined by* J<sub>C</sub> *up to a phase), or by*

$$\mathsf{J}\_{\mathbb{C}}(a) = \alpha\_{u}'(a) \equiv u a^\* u^\*,\tag{5.108}$$

*where u is anti-unitary (and is determined by* J<sub>C</sub> *up to a phase, too).*

The difficult part of the proof is Theorem C.175, which implies:

Proposition 5.24. *A Jordan automorphism* α *of B*(*H*) *is either an automorphism or an anti-automorphism.*

Recall that an *automorphism* of *B*(*H*) is a linear bijection α : *B*(*H*) → *B*(*H*) that satisfies α(*a*∗) = α(*a*)∗ and α(*ab*) = α(*a*)α(*b*); an *anti-automorphism*, on the other hand, satisfies the first property whilst the latter is replaced by α(*ab*) = α(*b*)α(*a*). Clearly, both automorphisms and anti-automorphisms are Jordan automorphisms. Granting this result, we may deal with the two cases separately.

Proposition 5.25. *Any automorphism* α : *B*(*H*) → *B*(*H*) *takes the form* α = α<sub>*u*</sub>*, see* (5.107)*, where u* : *H* → *H is unitary, uniquely determined by* α *up to a phase.*

The proof uses the following lemmas. The first follows from Theorem C.62.4.

Lemma 5.26. *If* α : *B*(*H*) → *B*(*H*) *is an automorphism and a* ∈ *B*(*H*)*, then*

$$\|\alpha(a)\| = \|a\|. \tag{5.109}$$

Lemma 5.27. *If* α : *B*(*H*) → *B*(*H*) *is an automorphism and e* ∈ *B*(*H*) *is a one-dimensional projection, then so is* α(*e*)*.*

*Proof.* It should be obvious that automorphisms α preserve projections *e* (whose defining properties are *e*<sup>2</sup> = *e*<sup>∗</sup> = *e*). Furthermore, α preserves order, i.e., if *a* ≥ 0 (in that, as always, ⟨ψ, *a*ψ⟩ ≥ 0 for each ψ ∈ *H*, or, equivalently, *a* = *b*<sup>∗</sup>*b*), then α(*a*) ≥ 0 (this is clear from the second way of expressing positivity). Consequently, if *a* ≤ *b* (in that *b* − *a* ≥ 0), then α(*a*) ≤ α(*b*). We notice that if we define *e* ≤ *f* iff *eH* ⊆ *fH*, then *e* ≤ *f* iff *e* ≤ *f* as self-adjoint operators (in that ⟨ψ, *e*ψ⟩ ≤ ⟨ψ, *f*ψ⟩ for each ψ ∈ *H*); see Proposition C.170. With respect to the ordering ≤ the one-dimensional projections *e* are *atomic*, in the sense that 0 ≤ *e* (but *e* ≠ 0) and if 0 ≤ *f* ≤ *e*, then either *f* = 0 or *f* = *e*. Now automorphisms of *B*(*H*) restrict to isomorphisms of the projection lattice P(*H*), which preserve atoms (as these are intrinsically defined by the partial order). □

We are now ready for the (constructive!) proof of Proposition 5.25.

*Proof.* For some fixed unit vector χ ∈ *H*, take the corresponding one-dimensional projection *e*<sup>χ</sup> and define a new unit vector ϕ (up to a phase) by

$$e\_{\phi} = \alpha^{-1}(e\_{\chi}).\tag{5.110}$$

Now any ψ ∈ *H* may be written as ψ = *a*ϕ, for some *a* ∈ *B*(*H*). Attempt to define an operator *u* by *u*ψ = α(*a*)χ, i.e.,

$$u a\phi = \alpha(a)\chi.\tag{5.111}$$

This looks dangerously ill-defined, since many different operators *a* may give rise to the same ψ. Fortunately, we may compute

$$\begin{aligned} \|a\phi\|\_H &= \|ae\_{\phi}\phi\|\_H = \|ae\_{\phi}\|\_{B(H)} = \|\alpha(ae\_{\phi})\|\_{B(H)} \\ &= \|\alpha(a)\alpha(e\_{\phi})\|\_{B(H)} = \|\alpha(a)e\_{\chi}\|\_{B(H)} = \|\alpha(a)\chi\|\_H \\ &= \|ua\phi\|\_H, \end{aligned}$$

so that if *a*ϕ = *b*ϕ, then α(*a*)χ = α(*b*)χ and hence *u* is well defined. By this computation *u* is also isometric and since it is clearly surjective, it is unitary. The property α(*a*) = *uau*∗ is equivalent to *ua* = α(*a*)*u*, which in turn is equivalent to *uab*ϕ = α(*a*)*ub*ϕ for any *b* ∈ *B*(*H*), which by definition of *u* is the same as

$$
\alpha(ab)\chi = \alpha(a)\alpha(b)\chi.\tag{5.112}
$$

But this holds by virtue of α being an automorphism. Finally, all arbitrariness in *u* lies in the lack of uniqueness of ϕ given its definition (5.110). □
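The construction in this proof can be traced numerically for *H* = C<sup>2</sup>. The following Python sketch is only an illustration under stated assumptions: the automorphism is taken to be α(*a*) = *waw*<sup>∗</sup> for a hypothetical unitary `w` (so that the answer is known in advance), χ is the first basis vector, and for each ψ we use the particular choice *a* = |ψ⟩⟨ϕ| in (5.111). All function names are ours.

```python
import cmath
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def apply(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def ketbra(psi, phi):
    """Rank-one operator |psi><phi|, which maps phi to psi when phi is a unit vector."""
    return [[p * q.conjugate() for q in phi] for p in psi]

def norm(v):
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

# Hypothetical automorphism alpha(a) = w a w* of M_2(C), with w a fixed unitary.
t = 0.7
w = [[cmath.exp(0.3j) * math.cos(t), -math.sin(t)],
     [math.sin(t), cmath.exp(-0.3j) * math.cos(t)]]

def alpha(a):
    return matmul(matmul(w, a), dagger(w))

def alpha_inv(a):
    return matmul(matmul(dagger(w), a), w)

chi = [1 + 0j, 0j]
e_phi = alpha_inv(ketbra(chi, chi))            # definition (5.110)

# Recover a unit vector phi in the range of e_phi by normalising its largest column;
# phi is fixed only up to a phase, exactly as in the proof.
cols = [[e_phi[i][j] for i in range(2)] for j in range(2)]
col = max(cols, key=norm)
phi = [x / norm(col) for x in col]

def u(psi):
    """u psi = alpha(a) chi for an operator a with a phi = psi, cf. (5.111)."""
    return apply(alpha(ketbra(psi, phi)), chi)

basis = [[1 + 0j, 0j], [0j, 1 + 0j]]
images = [u(b) for b in basis]                 # columns of the matrix of u
U = [[images[j][i] for j in range(2)] for i in range(2)]

a = [[1 + 2j, 0.5j], [-3 + 0j, 2 - 1j]]        # arbitrary test operator
err_implements = max_diff(alpha(a), matmul(matmul(U, a), dagger(U)))
err_unitary = max_diff(matmul(U, dagger(U)), [[1, 0], [0, 1]])
print(err_implements, err_unitary)
```

Up to rounding, both printed errors vanish, and the reconstructed `U` agrees with `w` up to a phase, in line with the uniqueness statement.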

Proposition 5.28. *Any antiautomorphism* α : *B*(*H*) → *B*(*H*) *takes the form* α = α<sub>*u*</sub>*, cf.* (5.108)*, where u* : *H* → *H is anti-unitary, uniquely determined by* α *up to a phase.*

*Proof.* Pick an arbitrary anti-unitary operator *J* : *H* → *H* and define

$$
\begin{aligned}
\beta: \mathcal{B}(H) &\to \mathcal{B}(H); \\
\beta(a) &= J a^\* J^\*.
\end{aligned}
\tag{5.113}
$$

Then α ◦ β is an automorphism, to which Proposition 5.25 applies, so that

$$
\alpha \circ \beta (a) = \tilde{u} a \tilde{u}^\*,\tag{5.114}
$$

for some unitary *ũ*. Hence

$$\alpha(a) = \alpha(\beta \circ \beta^{-1}(a)) = \alpha \circ \beta(J^\* a^\* J) = \tilde{u} J^\* a^\* J \tilde{u}^\*,$$

so that α(*a*) = *ua*<sup>∗</sup>*u*<sup>∗</sup> with *u* = *ũJ*<sup>∗</sup>.

The precise lack of uniqueness of *u* is inherited from the unitary case. □

#### 5.5 Proof of Wigner's Theorem

We recall *Wigner's Theorem*, i.e. Theorem 5.4.1:

Theorem 5.29. *Each bijection* W : P<sub>1</sub>(*H*) → P<sub>1</sub>(*H*) *that satisfies*

$$\operatorname{Tr}(\mathsf{W}(e)\mathsf{W}(f)) = \operatorname{Tr}(ef), \ (e, f \in \mathcal{P}\_1(H)),\tag{5.115}$$

*is given by* W(*e*) = *ueu*<sup>∗</sup> ≡ α<sub>*u*</sub>(*e*)*, where the operator u is either unitary or anti-unitary, and is uniquely determined by* W *up to a phase.*

The problem is to lift a given map W : P<sub>1</sub>(*H*) → P<sub>1</sub>(*H*) that satisfies (5.115) to either a unitary or an anti-unitary map *u* : *H* → *H* such that

$$\mathsf{W}(e\_{\psi}) = e\_{u\psi} = u e\_{\psi} u^\*.\tag{5.116}$$

Suppose W(*e*<sub>ψ</sub>) = *e*<sub>ψ′</sub>. Since *e*<sub>*z*ψ</sub> = *e*<sub>ψ</sub> for any *z* ∈ T, and likewise for *e*<sub>ψ′</sub>, this means that *u*ψ = *z*ψ′ for some *z* ∈ T; the problem is to choose the *z*'s coherently all over the unit sphere of *H*. There are many proofs in the literature, of which the following one, partly based on an earlier proof by Bargmann (1964), has the advantage of making at least the construction of *u* explicit (at the cost of opaque proofs of some crucial lemmas). We assume dim(*H*) > 2, since *H* = C<sup>2</sup> has already been covered.

Fix unit vectors ψ ∈ *H* and ψ′ ∈ W(*e*<sub>ψ</sub>)*H*; clearly, ψ′ is unique up to multiplication by *z* ∈ T, whose choice turns out to completely determine *u* (i.e., the ambiguity in ψ′ is the only one in the entire construction). For a modest start, we put

$$
u \psi = \psi'.\tag{5.117}$$

Lemma 5.30. *If V* ⊂ *H is a k-dimensional subspace (where k* < ∞*), then there is a unique k-dimensional linear subspace V′* ⊂ *H with the following property:*

> *For all unit vectors* ψ ∈ *H, we have* ψ ∈ *V iff* W(*e*<sub>ψ</sub>)*H* ⊂ *V′.*

*Proof.* Pick an orthonormal basis (υ<sub>1</sub>,...,υ<sub>*k*</sub>) of *V* and find unit vectors υ′<sub>*i*</sub> ∈ *H* such that υ′<sub>*i*</sub> ∈ W(*e*<sub>υ<sub>*i*</sub></sub>)*H*, *i* = 1,..., *k*. Then, using (5.115) we compute

$$|\langle \upsilon\_i', \upsilon\_j' \rangle|^2 = \operatorname{Tr}(e\_{\upsilon\_i'} e\_{\upsilon\_j'}) = \operatorname{Tr}(\mathsf{W}(e\_{\upsilon\_i})\mathsf{W}(e\_{\upsilon\_j})) = \operatorname{Tr}(e\_{\upsilon\_i} e\_{\upsilon\_j}) = |\langle \upsilon\_i, \upsilon\_j \rangle|^2 = \delta\_{ij},$$

so that the vectors (υ′<sub>1</sub>,...,υ′<sub>*k*</sub>) form an orthonormal set and hence form a basis of their linear span *V′*. Now, as mentioned below (B.214), we have ψ ∈ *V* iff ∑<sub>*i*=1</sub><sup>*k*</sup> |⟨υ<sub>*i*</sub>, ψ⟩|<sup>2</sup> = 1 and similarly ψ′ ∈ *V′* iff ∑<sub>*i*=1</sub><sup>*k*</sup> |⟨υ′<sub>*i*</sub>, ψ′⟩|<sup>2</sup> = 1, where ψ′ is a unit vector in W(*e*<sub>ψ</sub>)*H*. Since W preserves transition probabilities, a computation similar to the one just given yields

$$\sum\_{i=1}^{k} |\langle \upsilon\_{i}, \psi \rangle|^{2} = \sum\_{i=1}^{k} |\langle \upsilon\_{i}^{\prime}, \psi^{\prime} \rangle|^{2},\tag{5.118}$$

so that both sides do or do not equal unity, and hence ψ ∈ *V* iff ψ′ ∈ *V′*. □

Wigner's Theorem for *H* = C<sup>2</sup> (i.e. Theorem 5.7) implies:

Lemma 5.31. *If V and V′ are related as in Lemma 5.30, and*

$$
\dim(V) = \dim(V') = 2,\tag{5.119}
$$

*then there is a unitary or anti-unitary operator u<sub>V</sub>* : *V* → *V′ such that*

$$\mathsf{W}(e) = u\_V e u\_V^\*,\tag{5.120}$$

*for any one-dimensional projection e* ∈ P<sub>1</sub>(*V*)*, where* P<sub>1</sub>(*V*) ⊂ P<sub>1</sub>(*H*) *consists of all e* ∈ P<sub>1</sub>(*H*) *with eH* ⊂ *V. Moreover, u<sub>V</sub> is unique up to a phase.*

*Proof.* A choice of basis for both *V* and *V′* gives unitary isomorphisms ι : *V* → C<sup>2</sup> and ι′ : *V′* → C<sup>2</sup>, which jointly induce a map

$$\mathsf{W}' \equiv \iota' \mathsf{W} \iota^{-1} : \mathcal{P}\_1(\mathbb{C}^2) \to \mathcal{P}\_1(\mathbb{C}^2). \tag{5.121}$$

This map satisfies the hypotheses of Wigner's Theorem in *d* = 2, and so it is (anti-)unitarily induced as W′ = α<sub>*v*</sub>, where *v* : C<sup>2</sup> → C<sup>2</sup> is (anti-)unitary. Then the operator *u<sub>V</sub>* = (ι′)<sup>−1</sup>*v*ι does the job; its lack of uniqueness stems entirely from *v*. □

Lemma 5.32. *Given a Wigner symmetry* W*, the ensuing operator u<sub>V</sub> is either unitary or anti-unitary for all two-dimensional subspaces V* ⊂ *H (simultaneously).*

*Proof.* We first design a "unitarity test" for W. Define a function

$$T: \mathcal{P}\_1(H) \times \mathcal{P}\_1(H) \times \mathcal{P}\_1(H) \to \mathbb{C}; \tag{5.122}$$

$$T(e, f, g) = \operatorname{Tr}(efg),\tag{5.123}$$

$$T(e\_{\psi\_1}, e\_{\psi\_2}, e\_{\psi\_3}) = \langle \psi\_1, \psi\_2 \rangle \langle \psi\_2, \psi\_3 \rangle \langle \psi\_3, \psi\_1 \rangle. \tag{5.124}$$

Let *V* ⊂ *H* be two-dimensional and pick an orthonormal basis (υ1,υ2). Define

$$\chi\_1 = \upsilon\_1,\ \chi\_2 = (\upsilon\_1 - \upsilon\_2)/\sqrt{2},\ \chi\_3 = (\upsilon\_1 - i\upsilon\_2)/\sqrt{2}.\tag{5.125}$$

A simple computation then shows that

$$T(e\_{\chi\_1}, e\_{\chi\_2}, e\_{\chi\_3}) = \frac{1}{4}(1+i). \tag{5.126}$$

It follows from (5.124) that for *u* unitary and *v* anti-unitary, we have

$$T(e\_{u\psi\_1}, e\_{u\psi\_2}, e\_{u\psi\_3}) = T(e\_{\psi\_1}, e\_{\psi\_2}, e\_{\psi\_3});\tag{5.127}$$

$$T(e\_{v\psi\_1}, e\_{v\psi\_2}, e\_{v\psi\_3}) = \overline{T(e\_{\psi\_1}, e\_{\psi\_2}, e\_{\psi\_3})}.\tag{5.128}$$

Eq. (5.126) implies that if W is (anti-)unitarily implemented by *u* : *V* → *V′*, we have

$$T(\mathsf{W}(e\_{\chi\_1}), \mathsf{W}(e\_{\chi\_2}), \mathsf{W}(e\_{\chi\_3})) = T(e\_{u\chi\_1}, e\_{u\chi\_2}, e\_{u\chi\_3}) = \frac{1}{4}(1 \pm i),\tag{5.129}$$

with a plus sign if *u* is unitary and a minus sign if *u* is anti-unitary. Now take a second pair (*Ṽ*, *Ṽ′*) as above, and pick a basis (υ̃<sub>1</sub>, υ̃<sub>2</sub>) of *Ṽ*, with associated vectors (χ̃<sub>1</sub>, χ̃<sub>2</sub>, χ̃<sub>3</sub>), as in (5.125). Suppose *u* : *V* → *V′* implementing W is unitary, whereas *ũ* : *Ṽ* → *Ṽ′* implementing W is anti-unitary. It then follows from (5.129) that

$$T(\mathsf{W}(e\_{\chi\_1}), \mathsf{W}(e\_{\chi\_2}), \mathsf{W}(e\_{\chi\_3})) = T(e\_{u\chi\_1}, e\_{u\chi\_2}, e\_{u\chi\_3}) = \frac{1}{4}(1+i);\tag{5.130}$$

$$T(\mathsf{W}(e\_{\tilde{\chi}\_1}), \mathsf{W}(e\_{\tilde{\chi}\_2}), \mathsf{W}(e\_{\tilde{\chi}\_3})) = T(e\_{\tilde{u}\tilde{\chi}\_1}, e\_{\tilde{u}\tilde{\chi}\_2}, e\_{\tilde{u}\tilde{\chi}\_3}) = \frac{1}{4}(1 - i). \tag{5.131}$$

In view of (C.637), the following expression defines a metric *d* on P<sub>1</sub>(*H*):

$$d(e\_{\psi}, e\_{\varphi}) = \lVert \omega\_{\psi} - \omega\_{\varphi} \rVert = \lVert e\_{\psi} - e\_{\varphi} \rVert\_1 = 2\sqrt{1 - |\langle \varphi, \psi \rangle|^2},\tag{5.132}$$

with respect to which both W and *T* are continuous (the latter with respect to the product metric on P<sub>1</sub>(*H*)<sup>3</sup>, of course). Let *t* ↦ (υ<sub>1</sub>(*t*), υ<sub>2</sub>(*t*)) be a continuous path of orthonormal vectors (i.e., in *H* × *H*), with associated vectors (χ<sub>1</sub>(*t*), χ<sub>2</sub>(*t*), χ<sub>3</sub>(*t*)), as in (5.125). Then the function *f*(*t*) = *T*(W(*e*<sub>χ<sub>1</sub>(*t*)</sub>), W(*e*<sub>χ<sub>2</sub>(*t*)</sub>), W(*e*<sub>χ<sub>3</sub>(*t*)</sub>)) is continuous, and by (5.129) it can only take the values ¼(1 ± *i*). Hence *f*(*t*) must be constant. However, taking a path such that (υ<sub>1</sub>(0), υ<sub>2</sub>(0)) = (υ<sub>1</sub>, υ<sub>2</sub>) and (υ<sub>1</sub>(1), υ<sub>2</sub>(1)) = (υ̃<sub>1</sub>, υ̃<sub>2</sub>) gives *f*(0) = ¼(1 + *i*) and *f*(1) = ¼(1 − *i*), which is a contradiction. □
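The "unitarity test" used in this proof can be checked numerically. The Python sketch below evaluates (5.124) on the vectors (5.125) for the standard basis of C<sup>2</sup>, and compares the result under a sample unitary and a sample anti-unitary map. Both maps are hypothetical choices made for illustration, and the inner product is taken to be linear in its second argument, which is the convention under which (5.126) holds.

```python
import math

def inner(a, b):
    """Inner product, antilinear in the first and linear in the second argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def T(p1, p2, p3):
    """Tr(e1 e2 e3) for the projections onto the unit vectors p1, p2, p3, cf. (5.124)."""
    return inner(p1, p2) * inner(p2, p3) * inner(p3, p1)

s = 1 / math.sqrt(2)
v1, v2 = [1 + 0j, 0j], [0j, 1 + 0j]
chi1 = v1
chi2 = [s * (a - b) for a, b in zip(v1, v2)]
chi3 = [s * (a - 1j * b) for a, b in zip(v1, v2)]

def unitary(psi):
    """A sample unitary on C^2 (hypothetical choice)."""
    c, d = math.cos(0.4), math.sin(0.4)
    return [1j * (c * psi[0] - d * psi[1]), d * psi[0] + c * psi[1]]

def antiunitary(psi):
    """The same unitary composed with complex conjugation, hence anti-unitary."""
    return unitary([x.conjugate() for x in psi])

t0 = T(chi1, chi2, chi3)
t_unitary = T(unitary(chi1), unitary(chi2), unitary(chi3))
t_anti = T(antiunitary(chi1), antiunitary(chi2), antiunitary(chi3))
print(t0, t_unitary, t_anti)
```

The first two values agree with ¼(1 + *i*) and the third with ¼(1 − *i*), up to rounding, as in (5.126) to (5.129).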

Lemma 5.33. *Wigner's Theorem holds for three-dimensional Hilbert spaces.*

*Proof.* Let (υ<sub>1</sub>, υ<sub>2</sub>, υ<sub>3</sub>) be some basis of *H* (like the usual basis of *H* = C<sup>3</sup>). We first show that if W is the identity when restricted to both span(υ<sub>1</sub>, υ<sub>2</sub>) and span(υ<sub>1</sub>, υ<sub>3</sub>), then W is the identity on *H* altogether. To this end, take ψ = ∑<sub>*i*</sub> *c<sub>i</sub>*υ<sub>*i*</sub>, initially with *c*<sub>1</sub> ∈ R∖{0}. Take a unit vector ψ′ ∈ W(*e*<sub>ψ</sub>)*H*, with ψ′ = ∑<sub>*i*</sub> *c′<sub>i</sub>*υ<sub>*i*</sub>. By the first assumption on W we have |⟨υ, ψ′⟩| = |⟨υ, ψ⟩| for any unit vector υ ∈ span(υ<sub>1</sub>, υ<sub>2</sub>). Taking

$$\upsilon = \upsilon\_1, \qquad \upsilon = \upsilon\_2, \qquad \upsilon = (\upsilon\_1 + \upsilon\_2)/\sqrt{2}, \qquad \upsilon = (\upsilon\_1 + i\upsilon\_2)/\sqrt{2}, \tag{5.133}$$

gives the equations

$$|c\_1'| = |c\_1|, \quad |c\_2'| = |c\_2|, \quad |c\_1' + c\_2'| = |c\_1 + c\_2|, \quad |c\_1' - \mathrm{i}c\_2'| = |c\_1 - \mathrm{i}c\_2|, \tag{5.134}$$

respectively. By a choice of phase we may and will assume *c′*<sub>1</sub> = *c*<sub>1</sub>, in which case the only solution is *c′*<sub>2</sub> = *c*<sub>2</sub> (geometrically, the solution *c′*<sub>2</sub> lies in the intersection of three different circles in the complex plane, which is either empty or consists of a single point). Similarly, the second assumption on W gives *c′*<sub>3</sub> = *c*<sub>3</sub>, whence ψ′ = ψ. The case *c*<sub>1</sub> = 0 may be settled by a straightforward limit argument, since inner products (and hence their absolute values) are continuous on *H* × *H*.

Given a Wigner symmetry W : P1(*H*) → P1(*H*), we now construct *u* as follows.


is a Wigner symmetry. Clearly, W<sub>1</sub>(*e*<sub>υ<sub>*i*</sub></sub>) = *e*<sub>υ<sub>*i*</sub></sub> (*i* = 1,2,3), so that W<sub>1</sub> maps P<sub>1</sub>(*H*<sup>(12)</sup>) to itself, where *H*<sup>(12)</sup> ≡ span(υ<sub>1</sub>, υ<sub>2</sub>). Hence Lemma 5.31 gives a unitary map *ũ*<sub>1</sub> : *H*<sup>(12)</sup> → *H*<sup>(12)</sup> such that the restriction of W<sub>1</sub> to *H*<sup>(12)</sup> is α<sub>*ũ*<sub>1</sub></sub>.


$$\mathsf{W}\_{3} = \alpha\_{u\_{3}} \circ \mathsf{W}\_{2} = \alpha\_{u\_{3}} \circ \alpha\_{u\_{2}} \circ \alpha\_{u\_{1}} \circ \mathsf{W},\tag{5.135}$$

which by construction is the identity on both P<sub>1</sub>(*H*<sup>(12)</sup>) and P<sub>1</sub>(*H*<sup>(13)</sup>), and so by the first part of the proof it must be the identity on all of P<sub>1</sub>(*H*). Hence

$$\mathsf{W} = \alpha\_{u\_1^{-1}} \circ \alpha\_{u\_2^{-1}} \circ \alpha\_{u\_3^{-1}} = \alpha\_{u} \qquad \qquad (u = u\_1^{-1} u\_2^{-1} u\_3^{-1}). \qquad \Box$$

Lemma 5.34. *As in Lemma 5.30, if* dim(*V*) = dim(*V′*) = 3*, then there is a unitary or anti-unitary operator u<sub>V</sub>* : *V* → *V′ such that* W(*e*) = *u<sub>V</sub>eu*<sup>∗</sup><sub>*V*</sub> *for any e* ∈ P<sub>1</sub>(*V*)*.*

*Proof.* Given Lemma 5.33, the proof is practically the same as for Lemma 5.31. □

We now finish the proof of Wigner's Theorem. We assume that the outcome of Lemma 5.32 is that each *u<sub>V</sub>* is unitary; the anti-unitary case requires obvious modifications of the argument below. The first step is, of course, to define *u*(λψ) = λ*u*ψ, λ ∈ C (so this would have been λ̄*u*ψ in the anti-unitary case). Let ϕ ∈ *H* be linearly independent of ψ and consider the two-dimensional space *V* spanned by ψ and ϕ. Define *u*(ϕ) = *u<sub>V</sub>*ϕ. With (5.117), this defines *u* on all of *H*. To prove that *u* is linear, take ϕ<sub>1</sub> and ϕ<sub>2</sub> linearly independent of each other and of ψ, so that the linear span *V*<sub>3</sub> of ψ, ϕ<sub>1</sub>, and ϕ<sub>2</sub> is three-dimensional. Let *V<sub>i</sub>* be the two-dimensional linear span of ψ and ϕ<sub>*i*</sub>, *i* = 1,2. Then *u*ϕ<sub>*i*</sub> = *u*<sub>*V<sub>i</sub>*</sub>ϕ<sub>*i*</sub>, where the phase of *u*<sub>*V<sub>i</sub>*</sub> is fixed by (5.117). Let *w* : *V*<sub>3</sub> → *V′*<sub>3</sub> be the unitary that implements W according to Lemma 5.34, with phase determined by (5.117). Since *u*<sub>*V*<sub>1</sub></sub>, *u*<sub>*V*<sub>2</sub></sub>, and *w* are unique up to a phase and this phase has been fixed for each in the same way, we must have *u*<sub>*V*<sub>1</sub></sub> = *w*|<sub>*V*<sub>1</sub></sub> and *u*<sub>*V*<sub>2</sub></sub> = *w*|<sub>*V*<sub>2</sub></sub>. Finally, let *V*<sub>12</sub> be spanned by ψ and ϕ<sub>1</sub> + ϕ<sub>2</sub>; by the same token, *u*<sub>*V*<sub>12</sub></sub> = *w*|<sub>*V*<sub>12</sub></sub>. Now *w* is unitary and hence linear, so

$$\begin{aligned} u(\varphi\_1 + \varphi\_2) &= u\_{V\_{12}}(\varphi\_1 + \varphi\_2) = w(\varphi\_1 + \varphi\_2) = w(\varphi\_1) + w(\varphi\_2) \\ &= u\_{V\_1}(\varphi\_1) + u\_{V\_2}(\varphi\_2) = u(\varphi\_1) + u(\varphi\_2), \end{aligned}$$

since this is how *u* was defined. Since each *u<sub>V</sub>* is unitary, so is *u*, and similarly it is easy to verify that *u* implements W, because each *u<sub>V</sub>* does so. □

#### 5.6 Some abstract representation theory

Since all symmetries we have considered (named after Wigner, Kadison, Jordan, Ludwig, von Neumann, and Bohr) are implemented by either unitary or anti-unitary operators, which are determined (by the given symmetry) only up to a phase *z* ∈ T, the quantum-mechanical symmetry group 𝒢<sup>*H*</sup> of a Hilbert space *H* is given by

$$\mathcal{G}^{H} = (U(H) \cup U\_{a}(H)) / \mathbb{T},\tag{5.136}$$

where *U*(*H*) is the group of unitary operators on *H*, and *U<sub>a</sub>*(*H*) is the set of anti-unitary operators on *H*; the latter is not a group (since the product of two anti-unitaries is unitary) but their union is. Furthermore, T is identified with the normal subgroup T ≡ T·1<sub>*H*</sub> = {*z*·1<sub>*H*</sub> | *z* ∈ T} of *U*(*H*) ∪ *U<sub>a</sub>*(*H*) (and also of *U*(*H*)) consisting of multiples of the unit operator by a phase; thus the quotient 𝒢<sup>*H*</sup> is a group.

The fact that 𝒢<sup>*H*</sup> rather than *U*(*H*) is the symmetry group of quantum mechanics has profound consequences (one of which is our very existence), which we will study from §5.10 onwards. However, this material relies on the theory of "ordinary" (i.e., non-projective) unitary representations, which we therefore review first.

Namely, let *G* be a group. In mathematics, the natural kind of action of *G* on a Hilbert space *H* is a *unitary representation*, i.e., a homomorphism

$$
u: G \to U(H),\tag{5.137}$$

so that *u*(*x*)<sup>−1</sup> = *u*(*x*<sup>−1</sup>) = *u*(*x*)<sup>∗</sup> and *u*(*x*)*u*(*y*) = *u*(*xy*), which imply *u*(*e*) = 1<sub>*H*</sub>.

As to the possible continuity properties of unitary representations in case that *G* is a *topological* group (i.e., a group *G* that is also a topological space, such that group multiplication *G* × *G* → *G* and inverse *G* → *G* are continuous), one should equip *U*(*H*) with the *strong* operator topology (as opposed to the norm topology).

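Proposition 5.35. *If u* : *x* ↦ *u*(*x*) *is a unitary representation of some locally compact group G on a Hilbert space H, then the following conditions are equivalent:*

*1. The map G* × *H* → *H,* (*x*, ψ) ↦ *u*(*x*)ψ*, is continuous.*

*2. The map G* → *U*(*H*)*, x* ↦ *u*(*x*)*, is continuous in the strong operator topology on U*(*H*)*.*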

*Proof.* Strong continuity means that if *x*<sub>λ</sub> → *x* in *G*, then for each ψ ∈ *H* we have (*u*(*x*<sub>λ</sub>) − *u*(*x*))ψ → 0. This is clearly implied by the first kind of continuity, giving 1 ⇒ 2, so let us prove the nontrivial converse. Suppose *x*<sub>λ</sub> → *x* and ψ<sub>μ</sub> → ψ; since *G* is locally compact, *x* has a compact neighborhood *K* and we may assume that each *x*<sub>λ</sub> ∈ *K*. If *u* is strongly continuous, then for any ϕ ∈ *H* the set {*u*(*y*)ϕ, *y* ∈ *K*} is compact in *H* and hence bounded. The Banach–Steinhaus Theorem B.78 gives boundedness of the corresponding operator norms, that is, ‖*u*(*y*)‖ < *C<sub>K</sub>* for all *y* ∈ *K* and some *C<sub>K</sub>* > 0. We now estimate

$$\lVert u(x\_{\lambda}) \psi\_{\mu} - u(x) \psi \rVert \leq \lVert u(x\_{\lambda}) \psi\_{\mu} - u(x\_{\lambda}) \psi \rVert + \lVert (u(x\_{\lambda}) - u(x)) \psi \rVert.$$

The first term vanishes as ψ<sub>μ</sub> → ψ since it is bounded by *C<sub>K</sub>*‖ψ<sub>μ</sub> − ψ‖, whereas the second vanishes as *x*<sub>λ</sub> → *x* by the (assumed) strong continuity of *u*. □

Since the first kind of continuity is the usual one for group actions, this justifies the choice of strong continuity as the natural one for unitary representations (to which a pragmatic point may be added: norm continuity is quite rare for unitary representations on infinite-dimensional Hilbert spaces). Things further simplify under mild restrictions on *G* and *H*, which are satisfied in all examples of physical interest.

Proposition 5.36. *If H is separable and G is second countable locally compact (sclc), then each of the two continuity conditions in Proposition 5.35 is in turn equivalent to* weak measurability *of u, in that for each* ϕ,ψ ∈ *H the function*

$$x \mapsto \langle \varphi, u(x)\psi \rangle$$

*from G to* C *is (Borel) measurable.*

*Proof.* This spectacular result is due to von Neumann, who more generally proved that a measurable homomorphism between sclc groups is continuous. This implies the claim: first, if *H* is separable, then the group *U*(*H*) is sclc in its weak operator topology, so that if the map *G* → *U*(*H*), *x* → *u*(*x*) is weakly measurable, then it is continuous in the weak topology on *U*(*H*). Second, for any Hilbert space, weak (operator) continuity of a unitary representation implies strong continuity (so that, given the trivial converse, weak and strong continuity of unitary group representations are equivalent). We only prove this last claim: for *x*, *y* ∈ *G*, we compute

$$\begin{aligned} \lVert (u(y) - u(x)) \psi \rVert^2 &= \lVert u(x) \psi \rVert^2 + \lVert u(y) \psi \rVert^2 - \langle u(x) \psi, u(y) \psi \rangle - \langle u(y) \psi, u(x) \psi \rangle \\ &= 2 \lVert \psi \rVert^2 - \langle \psi, u(x^{-1} y) \psi \rangle - \langle \psi, u(y^{-1} x) \psi \rangle. \end{aligned}$$

Weak continuity obviously implies that the function *x* ↦ ⟨ψ, *u*(*x*)ψ⟩ is continuous at the identity *e* ∈ *G*, so if *y* = *x*<sub>λ</sub> → *x*, then (*u*(*x*<sub>λ</sub>) − *u*(*x*))ψ → 0. □

In view of this, it is hardly a restriction for a unitary representation of a locally compact group on a Hilbert space to be continuous in the sense of Proposition 5.35, so we always assume this in what follows. Furthermore, any group we consider is locally compact, so this will be a standing assumption, too. An important consequence of this assumption is the existence of a translation-invariant measure on *G*.

Theorem 5.37. *Each locally compact group G has a canonical nonzero (outer regular Borel) measure* μ*, called* Haar measure*, which is left-invariant in that*

$$\int\_{G} d\mu(x)\, L\_{y} f(x) = \int\_{G} d\mu(x)\, f(x),\tag{5.138}$$

*for each f* ∈ *C<sub>c</sub>*(*G*) *and y* ∈ *G, where the* left translation *L<sub>y</sub> of f by y is defined by*

$$L\_{y}f(x) = f(y^{-1}x).\tag{5.139}$$

*This measure is unique up to scalar multiplication. Moreover, if G is compact, then:*

*1.* μ *is finite and hence can be normalized to a probability measure, i.e.,*

$$
\mu(G) = 1.\tag{5.140}
$$

*2.* μ *is also right-invariant in that*

$$\int\_{G} d\mu(\mathbf{x}) \, R\_{\mathbf{y}} f(\mathbf{x}) = \int\_{G} d\mu(\mathbf{x}) f(\mathbf{x}),\tag{5.141}$$

*where the* right translation *R<sub>y</sub> of f by y* ∈ *G is defined by*

$$R\_{y}f(x) = f(xy).\tag{5.142}$$

*3.* μ *is invariant under inversion, in that*

$$\int\_{G} d\mu(\mathbf{x}) \, f(\mathbf{x}^{-1}) = \int\_{G} d\mu(\mathbf{x}) \, f(\mathbf{x}).\tag{5.143}$$

Existence is due to Haar and uniqueness was first proved by von Neumann. One often writes *dx* ≡ *d*μ(*x*) for Haar measure. Here are some examples:


• For *G* = T, we have

$$\int\_{\mathbb{T}} d\mu(z)f(z) = \frac{1}{2\pi} \int\_0^{2\pi} d\theta \, f(e^{i\theta}).\tag{5.144}$$

• For *G* = *GL<sub>n</sub>*(R) with *X* = (*x<sub>ij</sub>*), we have

$$d\mu(X) = \prod\_{i,j=1}^{n} dx\_{ij}\, |\det(X)|^{-n},\tag{5.145}$$

which for *G* = *SL<sub>n</sub>*(R) of course simplifies to *d*μ(*X*) = ∏<sub>*i*,*j*</sub> *dx<sub>ij</sub>*.
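For the circle group T, the invariance properties (5.138), (5.140), and (5.143) of the measure (5.144) are easy to test numerically. The Python sketch below approximates the integral by a Riemann sum over equally spaced angles; the test function `f` and the group element `y` are hypothetical choices made purely for illustration.

```python
import cmath
import math

def haar_integral(f, n=2000):
    """Approximate (5.144) by a Riemann sum over n equally spaced angles."""
    return sum(f(cmath.exp(2j * math.pi * k / n)) for k in range(n)) / n

def f(z):
    # Hypothetical test function on the circle (a trigonometric polynomial).
    return 1 + 2 * z + 3 * z ** -2 + (z ** 3).real

y = cmath.exp(1.234j)                          # arbitrary element of T

total_mass = haar_integral(lambda z: 1)        # normalisation (5.140)
rhs = haar_integral(f)
lhs = haar_integral(lambda z: f(z / y))        # integral of L_y f, with L_y f(z) = f(y^{-1} z)
inv = haar_integral(lambda z: f(1 / z))        # integral of f(z^{-1}), cf. (5.143)
print(total_mass, lhs, rhs, inv)
```

Since T is abelian, left and right translations coincide, so that the same computation also illustrates (5.141).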

Definition 5.38. *A unitary representation u of a group G on a Hilbert space H is* irreducible *if the only closed subspaces K of H that are stable under u*(*G*) *(in the sense that if* ψ ∈ *K, then u*(*x*)ψ ∈ *K for all x* ∈ *G) are either K* = *H or K* = {0}*.*

We will often need two important results about irreducibility. The first is *Schur's Lemma*, in which the commutant *S′* of some subset *S* ⊂ *B*(*H*) is defined by

$$S' = \{ a \in B(H) \mid ab = ba \ \forall b \in S \}. \tag{5.146}$$

Lemma 5.39. *A unitary representation u of a group G is irreducible iff*

$$
u(G)' = \mathbb{C} \cdot 1,\tag{5.147}
$$

*i.e., if au*(*x*) = *u*(*x*)*a for each x* ∈ *G implies a* = λ · 1<sub>*H*</sub> *for some* λ ∈ C*.*

This follows from Theorem C.90, of which the above lemma is a special case: take *A* = *u*(*G*)″ ≡ (*u*(*G*)′)′. The second is part of the *Peter–Weyl Theorem*.

Theorem 5.40. *Irreducible representations of compact groups are finite-dimensional.*

*Proof.* We first reduce the situation to the unitary case: if ⟨·,·⟩′ is the given inner product on *H*, we define a new inner product ⟨·,·⟩ by averaging with respect to Haar measure *dx* ≡ *d*μ(*x*), i.e.,

$$
\langle \psi, \varphi \rangle = \int\_G dx \, \langle u(x) \psi, u(x) \varphi \rangle'. \tag{5.148}
$$

Using (5.141), it is easy to verify that this new inner product makes *u* unitary.

So let *u* : *G* → *U*(*H*) be an irreducible unitary representation. For each unit vector ϕ ∈ *H* and *x* ∈ *G*, we define the following projection and its *G*-average:

$$e\_{u(x)\varphi} = |u(x)\varphi\rangle\langle u(x)\varphi|,\tag{5.149}$$

$$W\_{\varphi} = \int\_{G} dx \, e\_{u(x)\varphi}.\tag{5.150}$$

The *Weyl operator* (5.150) is initially defined as a quadratic form by

$$
\langle \psi\_1, W\_{\varphi} \psi\_2 \rangle = \int\_{G} dx \, \langle \psi\_1, e\_{u(x)\varphi} \psi\_2 \rangle. \tag{5.151}
$$

The integral exists because the integrand is continuous and bounded, defining a *bounded* quadratic form by the estimate |⟨ψ<sub>1</sub>, *W*<sub>ϕ</sub>ψ<sub>2</sub>⟩| ≤ ‖ψ<sub>1</sub>‖‖ψ<sub>2</sub>‖, where we assumed (5.140) and used ‖*e*<sub>*u*(*x*)ϕ</sub>‖ = 1, as (5.149) is a nonzero projection. Thus the operator *W*<sub>ϕ</sub> may be reconstructed from its matrix elements (5.151), cf. Proposition B.79. It is easy to verify that [*W*<sub>ϕ</sub>, *u*(*y*)] = 0 for each *y* ∈ *G*, so that Schur's Lemma yields *W*<sub>ϕ</sub> = λ<sub>ϕ</sub> · 1<sub>*H*</sub> for some λ<sub>ϕ</sub> ∈ C. Hence ⟨ψ, *W*<sub>ϕ</sub>ψ⟩ = λ<sub>ϕ</sub>‖ψ‖<sup>2</sup>, in other words,

$$\int\_{G} dx \, |\langle \psi, u(x) \varphi \rangle|^2 = \lambda\_{\varphi} \lVert \psi \rVert^2. \tag{5.152}$$

If we now interchange ϕ and ψ and use (5.143) we find λ<sub>ϕ</sub>‖ψ‖<sup>2</sup> = λ<sub>ψ</sub>‖ϕ‖<sup>2</sup>, so that, taking ψ to be a unit vector too, we obtain λ<sub>ϕ</sub> = λ<sub>ψ</sub> ≡ λ since ψ and ϕ are arbitrary, where in fact λ > 0, as follows by taking ψ = ϕ in (5.152). Finally, take *n* orthonormal vectors (υ<sub>1</sub>,...,υ<sub>*n*</sub>) in *H*, so that also (*u*(*x*)υ<sub>1</sub>,...,*u*(*x*)υ<sub>*n*</sub>) are orthonormal (since *u*(*x*) is unitary), upon which Bessel's inequality (B.212) gives

$$\sum\_{i=1}^{n} |\langle \psi, u(x)\upsilon\_{i} \rangle|^{2} \le \lVert \psi \rVert^{2}.\tag{5.153}$$

Integrating both sides over *G*, taking ‖ψ‖ = 1, and using (5.140) gives

$$\sum\_{i=1}^{n} \int\_{G} dx \, |\langle \psi, u(x) \upsilon\_{i} \rangle|^{2} \leq 1. \tag{5.154}$$

On the other hand, applying (5.152) with ϕ = υ<sub>*i*</sub> and summing over *i* simply yields *n*λ for the left-hand side, whence *n*λ ≤ 1, for any *n* ≤ dim(*H*). Since λ > 0 this forces dim(*H*) < ∞. □
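The key identity (5.152) can be illustrated on a finite group, which is compact in the discrete topology and whose normalized Haar measure is the normalized counting measure. The Python sketch below uses the two-dimensional irreducible representation of the symmetry group of a triangle (three rotations and three reflections); this example, the vectors `psi` and `phi`, and all function names are our own illustrative choices. In this finite-dimensional setting taking the trace of *W*<sub>ϕ</sub> gives λ = 1/dim(*H*), a standard fact that is assumed here rather than taken from the proof above.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(2)) for i in range(2)]

def inner(a, b):
    """Inner product, antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def rot(k):
    a = 2 * math.pi * k / 3
    return [[math.cos(a), -math.sin(a)], [math.sin(a), math.cos(a)]]

refl = [[1.0, 0.0], [0.0, -1.0]]

# The six orthogonal matrices u(x): three rotations and three reflections.
group = [rot(k) for k in range(3)] + [matmul(rot(k), refl) for k in range(3)]

def average(psi, phi):
    """Left-hand side of (5.152), with the normalized counting measure as Haar measure."""
    return sum(abs(inner(psi, apply(g, phi))) ** 2 for g in group) / len(group)

psi = [0.6 + 0j, 0.8j]                         # unit vector
phi = [(1 + 1j) / 2, (1 - 1j) / 2]             # unit vector

lam = average(psi, phi)                        # lambda in (5.152)
n_lam = average(psi, [1 + 0j, 0j]) + average(psi, [0j, 1 + 0j])   # sum over a basis
print(lam, n_lam)
```

Here `lam` equals 1/2 and `n_lam` equals 1 up to rounding, so the bound *n*λ ≤ 1 is saturated at *n* = dim(*H*) = 2.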

#### 5.7 Representations of Lie groups and Lie algebras

We now assume that *G* is a *Lie group*; as in §3.3, for our purposes we may restrict ourselves to *linear* Lie groups, i.e. closed subgroups of *GL<sub>n</sub>*(K) for K = R or C.

Let *u* : *G* → *U*(*H*) be a unitary representation of a Lie group *G* on some Hilbert space *H* (assumed strongly continuous). If *H* is finite-dimensional, the following operation is unproblematic: for *A* ∈ 𝔤 (i.e. the Lie algebra of *G*) we define an operator

$$
u'(A) \;:\; H \to H;\tag{5.155}$$

$$
u'(A) = \frac{d}{dt} u \left( e^{tA} \right)\_{|t=0}. \tag{5.156}
$$

This gives a linear map *u′* : 𝔤 → *B*(*H*), which satisfies

$$[u'(A), u'(B)] = u'([A, B]);\tag{5.157}$$

$$
u'(A)^\* = -u'(A). \tag{5.158}
$$

Note that physicists use Planck's constant ħ > 0 and like to write

$$
\pi(A) = i\hbar u'(A),\tag{5.159}
$$

so that one has the following commutation relations and self-adjointness condition:

$$[\pi(A), \pi(B)] = i\hbar \pi([A, B]);\tag{5.160}$$

$$
\pi(A)^\* = \pi(A). \tag{5.161}
$$

If one knows that *u′* : 𝔤 → *B*(*H*) comes from *u* : *G* → *U*(*H*), one conversely has

$$
u(e^A) = e^{u'(A)} = e^{-\frac{i}{\hbar}\pi(A)}.\tag{5.162}
$$

More generally, we call a map ρ : 𝔤 → *B*(*H*) (where *H* ≅ C<sup>*n*</sup> remains finite-dimensional, so that ρ : 𝔤 → *M<sub>n</sub>*(C)) a *skew-adjoint* representation of 𝔤 on *H* if

$$[\rho(A), \rho(B)] = \rho([A, B]);\tag{5.163}$$

$$
\rho(A)^\* = -\rho(A). \tag{5.164}
$$

The property of irreducibility of such a representation ρ : 𝔤 → *B*(*H*) is defined in the same way as for groups, namely that the only linear subspaces of *H* ≅ C<sup>*n*</sup> that are stable under ρ(𝔤) are {0} and *H*. Equivalently, by Schur's Lemma, ρ is irreducible iff the only operators that commute with all ρ(*A*) are multiples of the unit operator. If ρ = *u′* for some unitary representation *u*(*G*), it is easy to see that *u′* is irreducible iff *u* is irreducible. In view of this, it is a reasonable strategy to try and construct irreducible unitary representations *u*(*G*) by starting, as it were, from *u′*(𝔤). More precisely, if ρ is some (irreducible) skew-adjoint representation of 𝔤, we may ask if there is a (necessarily irreducible) unitary representation *u*(*G*) such that ρ = *u′*. Writing exp(ρ) for *u*, one would therefore hope that


$$
u\left(e^A\right) \equiv \exp(\rho)\left(e^A\right) = e^{\rho(A)},\tag{5.165}
$$

as in (5.162). Note that if *G* is connected, then ρ duly defines *u*(*x*) for each *x* ∈ *G* through (5.165), since by Lie theory every element *x* of a connected Lie group is a finite product *x* = exp(*A*<sub>1</sub>)···exp(*A<sub>n</sub>*) of exponentials of elements (*A*<sub>1</sub>,...,*A<sub>n</sub>*) of 𝔤.

In general, this hope is in vain, since although each operator exp(ρ(*A*)) is unitary, the representation property *u*(*x*)*u*(*y*) = *u*(*xy*) may fail for global reasons. For example, if *G* = *SO*(3), then 𝔤 ≅ R<sup>3</sup>, with basis (*J*<sub>1</sub>, *J*<sub>2</sub>, *J*<sub>3</sub>), as in (3.66). Define an *a priori* linear map ρ : 𝔤 → *M*<sub>2</sub>(C) by linear extension of

$$
\rho(J\_k) = -\frac{1}{2}i\sigma\_k,\tag{5.166}
$$

where (σ1,σ2,σ3) are the Pauli matrices (5.42), so that physicists would write

$$
\pi(J\_k) = \frac{1}{2}\hbar\sigma\_k,\tag{5.167}
$$

cf. (5.159). This is easily checked to give a skew-adjoint representation of 𝔤, but it does not exponentiate to a unitary representation of *SO*(3): as already mentioned after Proposition 5.46, if **u** is a unit vector in R<sup>3</sup>, then a rotation *R*<sub>θ</sub>(**u**) around the **u**-axis by an angle θ ∈ [0,2π] is represented by

$$u(R\_{\theta}(\mathbf{u})) = \cos(\theta/2) \cdot 1\_2 + i \sin(\theta/2)\, \mathbf{u} \cdot \boldsymbol{\sigma}.\tag{5.168}$$

Consequently, *u*(*R*<sub>π</sub>(**u**)) = *i* **u**·σ, so that *u*(*R*<sub>π</sub>(**u**))<sup>2</sup> = −1<sub>2</sub>, although within *SO*(3) one has *R*<sub>π</sub>(**u**)<sup>2</sup> = *e*, the unit of *SO*(3), so that *u*(*R*<sub>π</sub>(**u**))<sup>2</sup> ≠ *u*(*R*<sub>π</sub>(**u**)<sup>2</sup>).
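Both claims are easy to confirm by direct matrix computation. The Python sketch below checks that the matrices (5.166) satisfy the commutation relations (5.163), assuming the basis of (3.66) obeys [*J*<sub>1</sub>, *J*<sub>2</sub>] = *J*<sub>3</sub> and cyclic permutations (an assumption on our part, since (3.66) is not reproduced here), and then evaluates the right-hand side of (5.168) at θ = π and θ = 2π for a sample unit vector. All names in the code are ours.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

identity = [[1, 0], [0, 1]]
sigma = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
rho = [scale(-0.5j, s) for s in sigma]         # rho(J_k) = -(i/2) sigma_k, cf. (5.166)

def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))

# Assumed relations [J_1, J_2] = J_3 and cyclic permutations.
comm_errors = [
    max_diff(commutator(rho[k], rho[(k + 1) % 3]), rho[(k + 2) % 3]) for k in range(3)
]

def u_rot(theta, n):
    """Right-hand side of (5.168) for a unit vector n in R^3."""
    n_sigma = add(add(scale(n[0], sigma[0]), scale(n[1], sigma[1])), scale(n[2], sigma[2]))
    return add(scale(math.cos(theta / 2), identity), scale(1j * math.sin(theta / 2), n_sigma))

n = (1 / 3, 2 / 3, 2 / 3)                      # sample unit vector
half_turn = u_rot(math.pi, n)
square = matmul(half_turn, half_turn)          # equals -1_2 up to rounding
full_turn = u_rot(2 * math.pi, n)              # also -1_2, although R_{2 pi} = e in SO(3)
print(comm_errors, square, full_turn)
```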

However, ρ does exponentiate to a representation of *SU*(2), which happens to be the universal covering group of *SO*(3). This is typical of the general situation, which we state without proofs. We first need a refinement of *Lie's Third Theorem*:

Theorem 5.41. *Let G be a connected Lie group with Lie algebra* 𝔤*. There exists a simply connected Lie group G̃, unique up to isomorphism, such that:*


For example, for *G* = *SO*(3) we have *G̃* = *SU*(2) and *D* = Z<sub>2</sub>, cf. Proposition 5.46.

Theorem 5.42. *Let G*<sub>1</sub> *and G*<sub>2</sub> *be Lie groups, with Lie algebras* 𝔤<sub>1</sub> *and* 𝔤<sub>2</sub>*, respectively, and suppose that G*<sub>1</sub> *is simply connected. Then every Lie algebra homomorphism* ϕ : 𝔤<sub>1</sub> → 𝔤<sub>2</sub> *comes from a unique Lie group homomorphism* Φ : *G*<sub>1</sub> → *G*<sub>2</sub> *through* ϕ = Φ′*, where (realizing G*<sub>1</sub> *and G*<sub>2</sub> *as matrices)*

$$\Phi'(X) = \frac{d}{dt}\Phi\left(e^{tX}\right)\_{|t=0}.\tag{5.169}$$

Let *H* be a finite-dimensional Hilbert space, so that *B*(*H*) ∼= *Mn*(C), where *n* = dim(*H*), and take *U*(*H*) ∼= *Un*(C) to be the group of all unitary matrices on C*n*. The Lie algebra u*n*(C) of *Un*(C) consists of all skew-adjoint *<sup>n</sup>* <sup>×</sup> *<sup>n</sup>* complex matrices. Since irreducibility is preserved under the correspondence *u*(*G*) ↔ *u* (g), we infer:

Corollary 5.43. *Let G be a simply connected Lie group with Lie algebra* g*. Any finite-dimensional skew-adjoint representation* <sup>π</sup> : g <sup>→</sup> u*n*(C) *of* <sup>g</sup> *comes from a unique unitary representation u*(*G*) *through* (5.156)*, in which case we have*

$$e^{\mu'(A)} = \mu\left(e^A\right) \ (A \in \mathfrak{g}).\tag{5.170}$$

*Thus there is a bijective correspondence between finite-dimensional unitary representations of G and finite-dimensional skew-adjoint representations of* g*. In particular, if G is compact, this specializes to a bijective correspondence between unitary irreducible representations of G and skew-adjoint irreducible representations of* g*.*

*If G* ≅ G̃/*D is connected but not simply connected, then a finite-dimensional skew-adjoint representation* ρ : g → *B*(*H*) *exponentiates to a unitary representation u* : *G* → *U*(*H*) *iff the representation* exp(ρ) : G̃ → *U*(*H*) *is trivial on D.*

For example, for *G* = *SO*(3) the last condition is satisfied for the irreducible representations with integer spin *j* ∈ N (as well as for *j* = 0), see §5.8.

A similar construction is possible when *H* is infinite-dimensional, except for the fact that the derivative in (5.156) may not exist. For example, *G* = R has its canonical regular representation on *H* = *L*<sup>2</sup>(R), defined by *u*(*a*)ψ(*x*) = ψ(*x* − *a*), in which case (5.159) gives some multiple of the momentum operator −*i*ħ *d*/*dx*. This operator is unbounded and hence is not defined on all of *H*, see also §5.11 and §5.12. As in Stone's Theorem 5.73, this problem is solved by finding a suitable domain in *H* on which the underlying limit, taken strongly, does exist. This is the *Gårding domain*

$$D_G = \left\{ u^{\int}(f)\psi,\ f \in C_c^\infty(G),\ \psi \in H \right\},\tag{5.171}$$

where for each *f* ∈ *C*<sub>*c*</sub><sup>∞</sup>(*G*) (or even *f* ∈ *L*<sup>1</sup>(*G*)) the operator *u*<sup>∫</sup>(*f*) is defined by

$$
u^{\int}(f) = \int_G dx\, f(x)\, u(x).\tag{5.172}
$$

Like the derivative *u*′, this integral is most easily defined weakly, i.e., the (bounded) operator *u*<sup>∫</sup>(*f*) is initially defined as a bounded quadratic form

$$Q(\varphi, \psi) = \int_{G} dx \, f(x) \langle \varphi, u(x) \psi \rangle,\tag{5.173}$$

from which the operator *u*<sup>∫</sup>(*f*) may be reconstructed as in Proposition B.79. Note that the function *x* ↦ ⟨ϕ, *u*(*x*)ψ⟩ is in *C<sub>b</sub>*(*G*), so that the integral (5.173) exists.

It can be shown that *D<sub>G</sub>* is dense in *H*, as well as *invariant* under *u*′(g), in the sense that if ψ ∈ *D<sub>G</sub>*, then *u*′(*A*)ψ ∈ *D<sub>G</sub>* for any *A* ∈ g. Furthermore, for each ϕ ∈ *D<sub>G</sub>* the function *x* ↦ *u*(*x*)ϕ from *G* to *H* is smooth (if *G* is unimodular this property even characterizes *D<sub>G</sub>*). The commutation relations (5.157) then hold on *D<sub>G</sub>*, but the equalities (5.164) do not: one has to choose between (5.157) and (5.164), since the latter holds for the closure of each π(*A*) (i.e., each *i*ρ(*A*) is essentially self-adjoint on *D<sub>G</sub>*), whose domain however depends on *A*: there is no common domain on which each *i*ρ(*A*) is self-adjoint *and* the commutation relations (5.157) hold.

#### 5.8 Irreducible representations of *SU*(2)

One of the most important groups in quantum physics is *SU*(2), both as an internal symmetry group (e.g. of the Heisenberg model of ferromagnetism, of the weak nuclear interaction, and possibly also of (loop) quantum gravity) and as a spatial symmetry group in disguise (all projective unitary representations of *SO*(3) come from unitary representations of *SU*(2), preserving irreducibility, cf. Corollary 5.61). In this section we review the well-known classification and construction of its unitary irreducible representations. Since *SU*(2) is compact, by Theorem 5.40 all its unitary irreducible representations are finite-dimensional. Since *G* = *SU*(2) is also simply connected, by Corollary 5.43 its irreducible finite-dimensional (unitary) representations *u* bijectively correspond to the irreducible finite-dimensional skew-adjoint representations ρ = *u*′ of its Lie algebra g. Hence our job is to find the latter.

We already encountered the basis (3.66) of the Lie algebra so(3) ≅ R<sup>3</sup> of *SO*(3); the corresponding basis of the Lie algebra su(2) of *SU*(2) is (*S*<sub>1</sub>, *S*<sub>2</sub>, *S*<sub>3</sub>), where

$$S_k = -\frac{1}{2}i\sigma_k,\tag{5.174}$$

and the σ<sub>*k*</sub> are the Pauli matrices given in (5.42); linear extension of the map *J<sub>k</sub>* ↦ *S<sub>k</sub>* defines an isomorphism between so(3) and su(2). These matrices satisfy

$$[S_i, S_j] = \varepsilon_{ijk}S_k,\tag{5.175}$$

where ε<sub>*ijk*</sub> is the totally anti-symmetric symbol with ε<sub>123</sub> = 1 etc., so that (5.175) comes down to [*S*<sub>1</sub>, *S*<sub>2</sub>] = *S*<sub>3</sub>, [*S*<sub>3</sub>, *S*<sub>1</sub>] = *S*<sub>2</sub>, and [*S*<sub>2</sub>, *S*<sub>3</sub>] = *S*<sub>1</sub>. By linearity, finding ρ is the same as finding *n* × *n* matrices

$$L_k = i\rho(S_k) \tag{5.176}$$

that satisfy

$$[L\_i, L\_j] = i\varepsilon\_{ijk}L\_k,\tag{5.177}$$

i.e., [*L*1,*L*2] = *iL*3, etc., and

$$L\_k^\* = L\_k.\tag{5.178}$$
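The relations (5.175), (5.177), and (5.178) can be checked numerically in the defining (spin-1/2) representation, where ρ is simply the inclusion of su(2) in *M*<sub>2</sub>(C). The following Python sketch is only an illustration: it assumes that the Pauli matrices of (5.42) are the standard ones, and all helper names (`commutator`, `epsilon`, `structure_rhs`, and so on) are ad hoc choices made here.

```python
# Pauli matrices in the standard convention (assumed to agree with (5.42)).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(c, a):
    """Scalar multiple c * a."""
    return [[c * x for x in row] for row in a]


def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]


def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    """Entrywise comparison up to rounding."""
    return all(abs(x - y) < tol
               for r1, r2 in zip(a, b) for x, y in zip(r1, r2))


def epsilon(i, j, k):
    """Totally antisymmetric symbol on indices 0, 1, 2 with epsilon(0, 1, 2) = 1."""
    return (i - j) * (j - k) * (k - i) // 2


def structure_rhs(i, j, mats, factor):
    """factor * sum_k epsilon_ijk * mats[k]."""
    out = [[0, 0], [0, 0]]
    for k in range(3):
        term = scale(factor * epsilon(i, j, k), mats[k])
        out = [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(out, term)]
    return out


S = [scale(-0.5j, s) for s in SIGMA]   # (5.174): S_k = -(1/2) i sigma_k
L = [scale(1j, s) for s in S]          # (5.176) with rho the inclusion map

pairs = [(i, j) for i in range(3) for j in range(3)]
su2_relations_hold = all(
    close(commutator(S[i], S[j]), structure_rhs(i, j, S, 1)) for i, j in pairs)
L_relations_hold = all(
    close(commutator(L[i], L[j]), structure_rhs(i, j, L, 1j)) for i, j in pairs)
L_self_adjoint = all(close(Lk, dagger(Lk)) for Lk in L)
S_skew_adjoint = all(close(Sk, scale(-1, dagger(Sk))) for Sk in S)
```

In this representation *L<sub>k</sub>* = σ<sub>*k*</sub>/2, so the four flags merely restate the familiar Pauli algebra in the normalization used here.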

It turns out to be convenient to introduce the *ladder operators*

$$L\_{\pm} = L\_1 \pm iL\_2,\tag{5.179}$$

with ensuing commutation relations

$$[L\_3, L\_\pm] = \pm L\_\pm;\tag{5.180}$$

$$[L\_+, L\_-] = 2L\_3. \tag{5.181}$$

Furthermore, we define the *Casimir operator*

$$C = L\_1^2 + L\_2^2 + L\_3^2,\tag{5.182}$$

which, crucially, commutes with each *Lk*, i.e.,

$$[C, L_k] = 0 \ (k = 1, 2, 3). \tag{5.183}$$

By Schur's lemma, in any irreducible representation we therefore must have

$$C = c \cdot 1\_H,\tag{5.184}$$

where *c* ∈ R (in fact, *c* ≥ 0). We will also use the additional algebraic relations

$$L\_{+}L\_{-}=C-L\_{3}(L\_{3}-1\_{H});\tag{5.185}$$

$$L\_-L\_+ = C - L\_3(L\_3 + 1\_H).\tag{5.186}$$

The simple idea is now to diagonalize *L*<sub>3</sub>, which is possible as *L*<sub>3</sub><sup>∗</sup> = *L*<sub>3</sub>. Hence

$$H = \bigoplus_{\lambda \in \sigma(L_3)} H_{\lambda},\tag{5.187}$$

where σ(*L*<sub>3</sub>) is the spectrum of *L*<sub>3</sub> (which in this finite-dimensional case consists of its eigenvalues), and *H*<sub>λ</sub> is the eigenspace of *L*<sub>3</sub> for eigenvalue λ (i.e., if υ ∈ *H*<sub>λ</sub>, then *L*<sub>3</sub>υ = λυ). The structure of (5.187) in irreducible representations is as follows.

Lemma 5.44. *Let* ρ : su(2) → *B*(*H*) *be a finite-dimensional skew-adjoint irreducible representation, so that* (5.177) *holds. Then the spectrum* σ(*L*<sub>3</sub>) *of the self-adjoint operator L*<sub>3</sub> = *i*ρ(*S*<sub>3</sub>) *is given by*

$$\sigma(L\_3) = \{-j, -j+1, \dots, j-1, j\}.\tag{5.188}$$

*If* (5.187) *is the spectral decomposition of H relative to L*3*, then:*


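*Proof.* For any λ ∈ σ(*L*<sub>3</sub>) and nonzero υ<sub>λ</sub> ∈ *H*<sub>λ</sub>, we have:

- either λ + 1 ∈ σ(*L*<sub>3</sub>) and *L*<sub>+</sub>υ<sub>λ</sub> ∈ *H*<sub>λ+1</sub>, or *L*<sub>+</sub>υ<sub>λ</sub> = 0.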

Indeed, (5.180) gives *L*<sub>3</sub>(*L*<sub>+</sub>υ<sub>λ</sub>) = (λ + 1)*L*<sub>+</sub>υ<sub>λ</sub>, which immediately yields the claim. Similarly, either λ − 1 ∈ σ(*L*<sub>3</sub>) and *L*<sub>−</sub>υ<sub>λ</sub> ∈ *H*<sub>λ−1</sub>, or *L*<sub>−</sub>υ<sub>λ</sub> = 0. Now let λ<sub>0</sub> = min σ(*L*<sub>3</sub>) be the smallest eigenvalue of *L*<sub>3</sub>, and pick some 0 ≠ υ<sub>λ<sub>0</sub></sub> ∈ *H*<sub>λ<sub>0</sub></sub>. Since *H* is finite-dimensional by assumption, there must be some *k* ∈ N<sub>0</sub> = N ∪ {0} such that *L*<sub>+</sub><sup>*k*+1</sup>υ<sub>λ<sub>0</sub></sub> = 0, whereas all vectors *L*<sub>+</sub><sup>*l*</sup>υ<sub>λ<sub>0</sub></sub> for *l* = 0, ..., *k* are nonzero (and lie in *H*<sub>λ<sub>0</sub>+*l*</sub>). With *c* defined as in (5.184), it then follows from (5.185) - (5.186) that

$$c - \lambda_0(\lambda_0 - 1) = 0;\tag{5.189}$$

$$c - (\lambda_0 + k)(\lambda_0 + k + 1) = 0.\tag{5.190}$$

These relations imply λ<sub>0</sub> = −*k*/2, so that by the above bullet points we also have

$$\{-k/2, -k/2+1, \dots, k/2-1, k/2\} \subseteq \sigma(L\_3). \tag{5.191}$$
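The step from (5.189) - (5.190) to λ<sub>0</sub> = −*k*/2 can be spelled out as follows (a short worked computation that uses only the relations already stated). Since λ<sub>0</sub> is the smallest eigenvalue, *L*<sub>−</sub> annihilates υ<sub>λ<sub>0</sub></sub>, so applying (5.185) to this vector gives (5.189); likewise *L*<sub>+</sub> annihilates *L*<sub>+</sub><sup>*k*</sup>υ<sub>λ<sub>0</sub></sub> ∈ *H*<sub>λ<sub>0</sub>+*k*</sub>, so applying (5.186) to that vector gives (5.190). Subtracting (5.190) from (5.189) eliminates *c*:

$$(\lambda_0 + k)(\lambda_0 + k + 1) - \lambda_0(\lambda_0 - 1) = (k+1)(2\lambda_0 + k) = 0.$$

Because *k* + 1 > 0, this forces λ<sub>0</sub> = −*k*/2, and substituting back into (5.189) yields

$$c = \lambda_0(\lambda_0 - 1) = \frac{k}{2}\left(\frac{k}{2} + 1\right).$$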

To prove equality, as in (5.188), consider the vector space

$$H' = \mathbb{C} \cdot \upsilon_{\lambda_0} \oplus \mathbb{C} \cdot L_+ \upsilon_{\lambda_0} \oplus \dots \oplus \mathbb{C} \cdot L_+^{k-1} \upsilon_{\lambda_0} \oplus \mathbb{C} \cdot L_+^k \upsilon_{\lambda_0} \subseteq H; \tag{5.192}$$

this is just the subspace of *H* with basis (υ<sub>λ<sub>0</sub></sub>, *L*<sub>+</sub>υ<sub>λ<sub>0</sub></sub>, ..., *L*<sub>+</sub><sup>*k*−1</sup>υ<sub>λ<sub>0</sub></sub>, *L*<sub>+</sub><sup>*k*</sup>υ<sub>λ<sub>0</sub></sub>). By the previous arguments following from (5.180), we see that the operators *L*<sub>+</sub> and *L*<sub>−</sub> never leave *H*′, and the same is trivially true for *L*<sub>3</sub>. Therefore, if ρ is irreducible, then we must have *H*′ = *H* (and conversely). All claims of the lemma are now trivially verified on *H*′. □

It should be clear from this proof that the actions of *L*<sub>+</sub>, *L*<sub>−</sub>, and *L*<sub>3</sub> (and hence of all elements of su(2)) on *H* = *H*′ are fixed, so that ρ is determined by its dimension

$$\dim(H) = 2j + 1,\tag{5.193}$$

from which it follows that *j* can only take the values 0,1/2,1,3/2,....

It remains to fix an inner product on *H* in which ρ is skew-adjoint, i.e., in which *L*<sub>3</sub><sup>∗</sup> = *L*<sub>3</sub> and *L*<sub>+</sub><sup>∗</sup> = *L*<sub>−</sub> (which implies that *L*<sub>1</sub><sup>∗</sup> = *L*<sub>1</sub> and *L*<sub>2</sub><sup>∗</sup> = *L*<sub>2</sub>, which jointly imply ρ(*X*)<sup>∗</sup> = −ρ(*X*) for any *X* ∈ g). This may be done in principle by starting with any inner product, integrating ρ to a unitary representation of *SU*(2), and using the construction explained at the beginning of the proof of Theorem 5.40. In practice, it is easier to just calculate: take *H* = C<sup>*n*</sup> with *n* = 2*j* + 1, standard inner product, and standard orthonormal basis (*u<sub>l</sub>*), labeled as *l* = 0, 1, ..., 2*j*. Then put

$$L_3 u_l = (l-j)u_l;\tag{5.194}$$

$$L_{+}u_{l} = \sqrt{(l+1)(n-l-1)}\,u_{l+1};\tag{5.195}$$

$$L_{-}u_{l} = \sqrt{l(n-l)}\,u_{l-1}.\tag{5.196}$$

Note that (5.195) is even formally correct for *l* = 2*j*, since in that case *n* − 2*j* − 1 = 0, and similarly, (5.196) formally holds even for *l* = 0. The commutation relations (5.180) - (5.181) as well as the above conditions for skew-adjointness may be explicitly verified, from which it follows that for any prescribed dimension (5.193) we have found a skew-adjoint realization of ρ. Clearly, *u<sub>l</sub>* = υ<sub>*l*−*j*</sub>.
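The explicit verification mentioned here is easy to automate. The sketch below builds the matrices of (5.194) - (5.196) for a given value of 2*j* and checks (5.180), (5.181), the adjointness condition, and the Casimir value *c* = *j*(*j* + 1). It is only an illustration in plain Python; the function names (`spin_matrices`, `check_spin`, and the small matrix helpers) are hypothetical choices made for this example.

```python
import math


def spin_matrices(two_j):
    """Return (L3, Lplus, Lminus) as n x n nested lists, n = 2j + 1.

    Column l of each matrix holds the image of the basis vector u_l,
    following (5.194) - (5.196).
    """
    n = two_j + 1
    j = two_j / 2
    L3 = [[0.0] * n for _ in range(n)]
    Lp = [[0.0] * n for _ in range(n)]
    Lm = [[0.0] * n for _ in range(n)]
    for l in range(n):
        L3[l][l] = l - j
        if l + 1 < n:
            Lp[l + 1][l] = math.sqrt((l + 1) * (n - l - 1))
        if l > 0:
            Lm[l - 1][l] = math.sqrt(l * (n - l))
    return L3, Lp, Lm


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def lincomb(ca, a, cb, b):
    """Entrywise ca * a + cb * b."""
    return [[ca * x + cb * y for x, y in zip(r1, r2)] for r1, r2 in zip(a, b)]


def comm(a, b):
    return lincomb(1, matmul(a, b), -1, matmul(b, a))


def transpose(a):
    return [list(col) for col in zip(*a)]


def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol
               for r1, r2 in zip(a, b) for x, y in zip(r1, r2))


def check_spin(two_j):
    """Check the algebraic relations for the (2j + 1)-dimensional realization."""
    L3, Lp, Lm = spin_matrices(two_j)
    n, j = two_j + 1, two_j / 2
    identity = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    # C = L1^2 + L2^2 + L3^2 = L3^2 + (L+ L- + L- L+) / 2
    casimir = lincomb(1, matmul(L3, L3),
                      0.5, lincomb(1, matmul(Lp, Lm), 1, matmul(Lm, Lp)))
    return {
        "[L3, L+] = L+": close(comm(L3, Lp), Lp),
        "[L3, L-] = -L-": close(comm(L3, Lm), lincomb(-1, Lm, 0, Lm)),
        "[L+, L-] = 2 L3": close(comm(Lp, Lm), lincomb(2, L3, 0, L3)),
        "L+* = L-": close(transpose(Lp), Lm),   # entries are real
        "C = j(j+1)": close(casimir, lincomb(j * (j + 1), identity, 0, identity)),
    }


results = {two_j: check_spin(two_j) for two_j in range(6)}
```

Each entry of `results` is a dictionary of Boolean flags, one per relation, for the spins *j* = 0, 1/2, ..., 5/2.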

In view of Theorem 5.40 and Corollary 5.43 we have therefore proved:

Theorem 5.45. *Up to unitary equivalence, any (unitary) irreducible representation of SU*(2) *is completely determined by its dimension n* = dim(*H*)*, and any dimension n* ∈ N = {1, 2, ...} *occurs. Furthermore, if j is the number in* (5.188)*, we have*

$$n = 2j + 1.\tag{5.197}$$

Physicists typically label these irreducible representations by *j* (called the *spin* of the given representation) rather than by *n*, or even by *c* = *j*(*j* +1), cf. (5.184).

Corollary 5.43 shows that one may pass from ρ(su(2)) to a unitary representation *u*(*SU*(2)), of which one may give a direct realization. For *j* ∈ N<sub>0</sub>/2, define *H<sub>j</sub>* as the complex vector space of all homogeneous polynomials *p* in two variables *z* = (*z*<sub>1</sub>, *z*<sub>2</sub>) of degree 2*j*. A basis of *H<sub>j</sub>* is given by (*z*<sub>1</sub><sup>2*j*</sup>, *z*<sub>1</sub><sup>2*j*−1</sup>*z*<sub>2</sub>, ..., *z*<sub>1</sub>*z*<sub>2</sub><sup>2*j*−1</sup>, *z*<sub>2</sub><sup>2*j*</sup>), which has 2*j* + 1 elements. So dim(*H<sub>j</sub>*) = 2*j* + 1. Then consider the map

$$D\_j: SU(2) \to B(H\_j);\tag{5.198}$$

$$D_j(u)f(z) = f(zu). \tag{5.199}$$

Clearly,

$$D_j(e)f(z) = f(z \cdot 1_2) = f(z), \tag{5.200}$$

so *D<sub>j</sub>*(*e*) = 1, and

$$D_j(u)D_j(v)f(z) = D_j(v)f(zu) = f(zuv) = D_j(uv)f(z),$$

so *D<sub>j</sub>*(*u*)*D<sub>j</sub>*(*v*) = *D<sub>j</sub>*(*uv*). Hence *D<sub>j</sub>* is a representation of *SU*(2).

We now compute *L*<sub>3</sub> = *i*ρ(*S*<sub>3</sub>) on this space, where *S*<sub>3</sub> = −½ *i*σ<sub>3</sub> by (5.174). From (5.156) with *u* replaced by *D<sub>j</sub>*, we have

$$L_3 = -\frac{1}{2} i D_j' \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} = -\frac{1}{2} i \frac{d}{dt} D_j \begin{pmatrix} e^{it} & 0 \\ 0 & e^{-it} \end{pmatrix}_{|t=0},\tag{5.201}$$

so that

$$L_3 f(z) = -\frac{1}{2}i\frac{d}{dt}f(e^{it}z_1, e^{-it}z_2)_{|t=0} = \frac{1}{2}\left(z_1\frac{\partial f(z)}{\partial z_1} - z_2\frac{\partial f(z)}{\partial z_2}\right). \tag{5.202}$$

Similarly, we obtain

$$L\_{+}f(z) = z\_{1}\frac{\partial f(z)}{\partial z\_{2}};\tag{5.203}$$

$$L\_{-}f(z) = z\_{2}\frac{\partial f(z)}{\partial z\_{1}}.\tag{5.204}$$

Hence *f*<sub>2*j*</sub>(*z*) = *z*<sub>1</sub><sup>2*j*</sup> gives *L*<sub>3</sub>*f*<sub>2*j*</sub> = *j f*<sub>2*j*</sub>, and *f*<sub>0</sub>(*z*) = *z*<sub>2</sub><sup>2*j*</sup> gives *L*<sub>3</sub>*f*<sub>0</sub> = −*j f*<sub>0</sub>. In general, *f<sub>l</sub>*(*z*) = *z*<sub>1</sub><sup>*l*</sup>*z*<sub>2</sub><sup>2*j*−*l*</sup> spans the eigenspace *H*<sub>λ</sub> of *L*<sub>3</sub> with eigenvalue λ = −*j* + *l*. Since *l* = 0, 1, ..., 2*j*, this confirms (5.188), as well as the fact that the corresponding eigenspaces are all one-dimensional. The rest is easily checked, too, except for the unitarity of the representation, for which we refer to the proof of Theorem 5.40.
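The action of the differential operators (5.202) - (5.204) on the monomial basis can be checked with exact arithmetic. In the sketch below a polynomial in (*z*<sub>1</sub>, *z*<sub>2</sub>) is stored as a dictionary mapping exponent pairs (*a*, *b*) to coefficients; this encoding and the function names are assumptions made for the illustration only.

```python
from fractions import Fraction


def clean(p):
    """Drop zero coefficients so that equal polynomials compare equal."""
    return {mono: c for mono, c in p.items() if c != 0}


def add(p, q, cq=1):
    """Return p + cq * q for polynomials stored as {(a, b): coeff} ~ z1^a z2^b."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + cq * c
    return clean(out)


def scale(c, p):
    return clean({mono: c * coeff for mono, coeff in p.items()})


def L_plus(p):
    """z1 * (d/dz2) p, as in (5.203)."""
    return clean({(a + 1, b - 1): b * c for (a, b), c in p.items() if b > 0})


def L_minus(p):
    """z2 * (d/dz1) p, as in (5.204)."""
    return clean({(a - 1, b + 1): a * c for (a, b), c in p.items() if a > 0})


def L3(p):
    """(1/2)(z1 d/dz1 - z2 d/dz2) p, as in (5.202)."""
    return clean({(a, b): Fraction(a - b, 2) * c for (a, b), c in p.items()})


def f(l, two_j):
    """Basis monomial f_l(z) = z1^l z2^(2j - l)."""
    return {(l, two_j - l): Fraction(1)}


def check(two_j):
    """Verify the weight structure of H_j on the monomial basis."""
    ok = True
    for l in range(two_j + 1):
        fl = f(l, two_j)
        # L3 f_l = (-j + l) f_l
        ok = ok and L3(fl) == scale(Fraction(2 * l - two_j, 2), fl)
        # [L+, L-] f_l = 2 L3 f_l, cf. (5.181)
        lhs = add(L_plus(L_minus(fl)), L_minus(L_plus(fl)), -1)
        ok = ok and lhs == scale(2, L3(fl))
    # The ladder terminates at both ends.
    ok = ok and L_plus(f(two_j, two_j)) == {}
    ok = ok and L_minus(f(0, two_j)) == {}
    return ok


all_spins_ok = all(check(two_j) for two_j in range(7))
```

Since *L*<sub>+</sub> maps *f<sub>l</sub>* to a multiple of *f*<sub>*l*+1</sub>, the monomials are not normalized as in (5.195); the check concerns only the algebraic relations and the eigenvalues of *L*<sub>3</sub>.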

Finally, we return to *SO*(3). Either explicit exponentiation (5.165), as done for *j* = 1/2 in (5.168), or the above construction of *D<sub>j</sub>*, allows one to verify the crucial condition stated in Corollary 5.43, namely that *D<sub>j</sub>*(δ) = 1<sub>*H<sub>j</sub>*</sub> for δ ∈ *D* = Z<sub>2</sub>, which comes down to *D<sub>j</sub>*(−1<sub>2</sub>) = 1<sub>*H<sub>j</sub>*</sub>. This is easily seen to be the case iff *j* ∈ N<sub>0</sub>.

Corollary 5.46. *Up to unitary equivalence, each unitary irreducible representation of SO*(3) *is completely fixed by its dimension n* = 2*j* + 1*, where j* ∈ N<sub>0</sub> *(so that n* = 1 *for spin-0, n* = 3 *for spin-1, n* = 5 *for spin-2, . . . ), and each such dimension occurs.*
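The condition *D<sub>j</sub>*(−1<sub>2</sub>) = 1 is quick to test in the polynomial realization, since (5.199) gives *D<sub>j</sub>*(−1<sub>2</sub>)*f*(*z*) = *f*(−*z*). The following sketch evaluates the basis monomials at one sample point and at its negative; the sample point and the function name are arbitrary choices for this illustration, and a single nonzero sample suffices only because each monomial is homogeneous of degree 2*j*.

```python
def center_acts_trivially(two_j, z=(2, 3)):
    """Check D_j(-1_2) f_l = f_l on the monomial basis of H_j at a sample point."""
    z1, z2 = z
    for l in range(two_j + 1):
        value = z1 ** l * z2 ** (two_j - l)            # f_l(z)
        flipped = (-z1) ** l * (-z2) ** (two_j - l)    # f_l(-z)
        if flipped != value:
            return False
    return True


# Spins j = 0, 1/2, ..., 3 whose representation D_j descends to SO(3).
so3_spins = [two_j / 2 for two_j in range(7) if center_acts_trivially(two_j)]
so3_dimensions = [int(2 * j + 1) for j in so3_spins]
```

Only the integer spins survive, with the odd dimensions listed in Corollary 5.46.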

#### 5.9 Irreducible representations of compact Lie groups

Because of its importance for the classical-quantum correspondence (cf. §7.1) we first reformulate the main result of the previous section (i.e. the classification of the irreducible representations of *SU*(2)) and on that basis generalize this result to arbitrary compact Lie groups. This gives a classification of great simplicity and beauty.

We already encountered the coadjoint representation (3.100) of a Lie group *G* on g<sup>∗</sup>, given by (*x* · θ)(*A*) = θ(*x*<sup>−1</sup>*Ax*), where *x* ∈ *G*, θ ∈ g<sup>∗</sup>, *A* ∈ g. The orbits under this action are called *coadjoint orbits*. If *G* = *SO*(3), we have g ≅ R<sup>3</sup> under the map

$$\mathbf{x} \cdot \mathbf{J} \equiv \sum_{k=1}^{3} x_k J_k \mapsto (x_1, x_2, x_3) \equiv \mathbf{x}, \tag{5.205}$$

where the matrices *J<sub>k</sub>* are given in (3.66). Hence also g<sup>∗</sup> ≅ R<sup>3</sup> under the map

$$\theta \mapsto \left( (\theta_1, \theta_2, \theta_3) : \mathbf{x} \mapsto \sum_{k=1}^3 \theta_k x_k \right). \tag{5.206}$$

Writing *R* ∈ *SO*(3) for a generic element *x* ∈ *G*, analogously to (5.44), we can compute the adjoint action *R* : *A* ↦ *RAR*<sup>−1</sup>, seen as an action on R<sup>3</sup>, through

$$R(\mathbf{x} \cdot \mathbf{J})R^{-1} = (R\mathbf{x}) \cdot \mathbf{J}.\tag{5.207}$$

Using the fact that the angular momentum matrices transform as vectors, i.e.,

$$RJ\_{l}R^{-1} = \sum\_{j} R\_{jl}J\_{j},\tag{5.208}$$

we find that the adjoint action of *SO*(3) on g, seen as R<sup>3</sup>, is its defining action. In general, if g ≅ R<sup>*n*</sup> and also g<sup>∗</sup> ≅ R<sup>*n*</sup> under the usual pairing of R<sup>*n*</sup> and R<sup>*n*</sup> through the Euclidean inner product, the coadjoint action of *G* on g<sup>∗</sup>, seen as an action on R<sup>*n*</sup>, is given by the inverse transpose of the adjoint action on g ≅ R<sup>*n*</sup>. For *SO*(3) we have (*R*<sup>−1</sup>)<sup>*T*</sup> = *R*, so the coadjoint action of *SO*(3) on R<sup>3</sup> is just its defining action, too, and hence the coadjoint orbits are the 2-spheres *S*<sup>2</sup><sub>*r*</sub> with radius *r* ≥ 0.
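The identity (5.207) and the resulting description of the orbits as spheres can be illustrated numerically. The sketch below assumes the convention (*J<sub>k</sub>*)<sub>*ij*</sub> = −ε<sub>*kij*</sub> for the generators, so that (**x** · **J**)**v** is the cross product of **x** and **v**; whether this matches (3.66) exactly is an assumption, although the check is insensitive to the overall sign. The rotation, the test vector, and all helper names are arbitrary choices made here.

```python
import math


def epsilon(i, j, k):
    """Totally antisymmetric symbol on indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


# Assumed generators of so(3): (J_k)_{ij} = -epsilon_{kij}.
J = [[[-epsilon(k, i, j) for j in range(3)] for i in range(3)] for k in range(3)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def x_dot_J(x):
    """The matrix sum_k x_k J_k of (5.205)."""
    return [[sum(x[k] * J[k][i][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def norm(v):
    return math.sqrt(sum(c * c for c in v))


R = matmul(rot_z(0.7), rot_x(-1.2))   # a sample rotation
x = [0.3, -1.1, 2.0]                   # a sample point of R^3

lhs = matmul(matmul(R, x_dot_J(x)), transpose(R))   # R (x.J) R^{-1}, R^{-1} = R^T
rhs = x_dot_J(matvec(R, x))                          # (Rx).J

adjoint_is_defining = all(abs(lhs[i][j] - rhs[i][j]) < 1e-12
                          for i in range(3) for j in range(3))
radius_preserved = abs(norm(matvec(R, x)) - norm(x)) < 1e-12
```

The second flag reflects that the orbit through **x** stays on the sphere of radius |**x**|.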

Turning to *SU*(2), we now make the identification of g<sup>∗</sup> with R<sup>3</sup> slightly differently, namely by replacing the 3 × 3 real matrices *J<sub>i</sub>* in (5.205) by the 2 × 2 matrices *S<sub>i</sub>* in (5.174), but the computation is similar: using (5.44) - (5.45), we find that the coadjoint action of *u* ∈ *SU*(2) on R<sup>3</sup> is given by the defining action of π̃(*u*) ∈ *SO*(3), cf. (5.46). It follows that the coadjoint orbits for *SU*(2) are the same as for *SO*(3).

Returning to general Lie groups *G* for the moment, assumed connected for simplicity, we take some coadjoint orbit O ⊂ g<sup>∗</sup>, fix a point θ ∈ O (so that O = *G* · θ ≡ *G*θ), and look at the stabilizer *G*<sub>θ</sub> and its Lie algebra g<sub>θ</sub>. Since the derivative Ad′ of the adjoint action Ad of *G* on g (defined as in (5.156)) is given by

$$\text{Ad}'(A) : B \mapsto [A, B],\tag{5.209}$$

it follows that the "infinitesimal stabilizer" g<sub>θ</sub> is given by

$$\mathfrak{g}\_{\theta} = \{ A \in \mathfrak{g} \mid \theta([A, B]) = 0 \,\forall B \in \mathfrak{g} \}. \tag{5.210}$$

Consequently, the restriction of θ : g → R to g<sub>θ</sub> ⊂ g is a Lie algebra homomorphism (where R is obviously endowed with the zero Lie bracket). Consider a *character* χ : *G*<sub>θ</sub> → T, which is the same thing as a one-dimensional unitary representation of *G*<sub>θ</sub>. If we regard T as a closed subgroup of *GL*<sub>1</sub>(C), its Lie algebra t is given by *i*R ⊂ *M*<sub>1</sub>(C) = C. It is conventional (at least among physicists) to take −*i* as the basis element of t, so that t ≅ R under −*it* ↔ *t*, and hence the exponential map exp : t → T (which is the usual one), seen as a map from R to T, is given by *t* ↦ exp(−*it*). Defining the derivative χ′ : g<sub>θ</sub> → C as in (5.156), it follows that actually χ′ : g<sub>θ</sub> → *i*R, so that *i*χ′ maps g<sub>θ</sub> to R and is a Lie algebra homomorphism.

Definition 5.47. *Let G be a connected Lie group. A coadjoint orbit* O ⊂ g<sup>∗</sup> *is called* integral *if for some (and hence all)* θ ∈ O *one has* θ|<sub>g<sub>θ</sub></sub> = *i*χ′ *for some character* χ : *G*<sub>θ</sub> → T*, i.e., if there is a character* χ *such that for each A* ∈ g<sub>θ</sub> *one has*

$$\theta(A) = i \frac{d}{dt} \chi\left(e^{tA}\right)_{|t=0}. \tag{5.211}$$

In the simplest case where *G* = T, the coadjoint action on t<sup>∗</sup> is evidently trivial, so that *G*<sub>θ</sub> = *G* = T for any θ ∈ t<sup>∗</sup> ≅ R. Furthermore, any character on T takes the form χ<sub>*n*</sub>(*z*) = *z<sup>n</sup>*, where *n* ∈ Z, cf. (C.351). As explained above, if t ≅ R and hence also t<sup>∗</sup> ≅ R, the identification of λ ∈ t<sup>∗</sup> with λ ∈ R is made by λ(−*i*) ↔ λ, where −*i* ∈ t. If χ = χ<sub>*n*</sub>, the right-hand side of (5.211) evaluated at *A* = −*i* equals *n*, so that (5.211) holds iff θ = *n* for some *n* ∈ Z. Thus the integral coadjoint orbits in t<sup>∗</sup> are the integers Z ⊂ R. Similarly, if *G* = T<sup>*d*</sup>, the characters are elements of Z<sup>*d*</sup>, as in

$$\chi_{(n_1,\ldots,n_d)}(z_1,\ldots,z_d) = z_1^{n_1}\cdots z_d^{n_d},\tag{5.212}$$

and the integral coadjoint orbits in g<sup>∗</sup> ≅ R<sup>*d*</sup> are the points of the lattice Z<sup>*d*</sup> ⊂ R<sup>*d*</sup>.

For *G* = *SU*(2) we take a coadjoint orbit *S*<sup>2</sup><sub>*r*</sub> ⊂ R<sup>3</sup> and fix θ<sub>*r*</sub> = (0, 0, *r*). If *r* = 0, then *G*<sub>θ</sub> = *G* and (5.211) holds for the trivial character χ ≡ 1, so the orbit {(0, 0, 0)} is integral. Let *r* > 0. Then *G*<sub>θ<sub>*r*</sub></sub> ≡ *G<sub>r</sub>* consists of the pre-image of *SO*(2) in *SU*(2) under the projection π̃ in (5.46), where *SO*(2) ⊂ *SO*(3) is the group of rotations around the *z*-axis. This is the abelian group

$$T = \{ \text{diag}(z, \overline{z}) \mid z \in \mathbb{T} \}. \tag{5.213}$$

This group is isomorphic to T under diag(*z*, z̄) ↦ *z* and hence its characters are given by χ<sub>*n*</sub>(diag(*z*, z̄)) = *z<sup>n</sup>*, where *n* ∈ Z. The identification g<sup>∗</sup> ≅ R<sup>3</sup> is made by identifying θ ∈ g<sup>∗</sup> with (θ<sub>1</sub>, θ<sub>2</sub>, θ<sub>3</sub>), where θ<sub>*i*</sub> = θ(*S<sub>i</sub>*). Putting *A* = *S*<sub>3</sub> in (5.211), see (5.174), therefore gives *r* = *n*/2 for some *n* ∈ N. We conclude that the integral coadjoint orbits for *SU*(2) are given by the two-spheres *S*<sup>2</sup><sub>*r*</sub> ⊂ R<sup>3</sup> with *r* ∈ N<sub>0</sub>/2.

Similarly, for *G* = *SO*(3) the stabilizer of (0, 0, *r*) is *SO*(2) ≅ T itself, and putting *A* = *J*<sub>3</sub> in (5.211) one finds that the integral coadjoint orbits are the spheres *S*<sup>2</sup><sub>*r*</sub> with *r* ∈ N<sub>0</sub>.
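The integrality condition (5.211) can be evaluated numerically for the two cases computed in the text. The sketch below estimates the derivative by a central difference. It uses exp(*tA*) = exp(−*it*) for *G* = T with *A* = −*i*, and exp(*tS*<sub>3</sub>) = diag(*e*<sup>−*it*/2</sup>, *e*<sup>*it*/2</sup>) for *SU*(2), which follows from (5.174) if σ<sub>3</sub> = diag(1, −1) (the standard convention, assumed here). The function names and the step size are illustrative choices.

```python
import cmath


def chi(n, z):
    """Character chi_n(z) = z^n of the circle group."""
    return z ** n


def rhs_5211(n, curve, h=1e-5):
    """Central-difference estimate of i * d/dt chi_n(curve(t)) at t = 0."""
    derivative = (chi(n, curve(h)) - chi(n, curve(-h))) / (2 * h)
    return (1j * derivative).real


def torus_curve(t):
    """G = T with A = -i: exp(tA) = exp(-it)."""
    return cmath.exp(-1j * t)


def su2_curve(t):
    """G = SU(2) with A = S_3: the entry z of exp(t S_3) = diag(z, conj(z))."""
    return cmath.exp(-0.5j * t)


torus_values = [rhs_5211(n, torus_curve) for n in range(-2, 3)]   # approx. n
su2_radii = [rhs_5211(n, su2_curve) for n in range(0, 5)]         # approx. n / 2
```

Up to discretization error, the first list reproduces the integers and the second the half-integers *r* = *n*/2 found above.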

For any (Lie) group *G*, let the *unitary dual G*ˆ be the set whose elements are equivalence classes of unitary irreducible representations of *G*, where we say:

Definition 5.48. *Two unitary representations ui* : *G* → *U*(*Hi*)*, i* = 1,2*, are* equivalent *if there is unitary v* : *H*<sup>1</sup> → *H*<sup>2</sup> *such that u*2(*x*) = *vu*1(*x*)*v*<sup>∗</sup> *for each x* ∈ *G.*

The examples *G* = T<sup>*d*</sup> and *G* = *SU*(2) now suggest the following theorem:

Theorem 5.49. *If G is a compact connected Lie group, then the unitary dual* Ĝ *is parametrized by the set of integral coadjoint orbits in* g<sup>∗</sup>*.*

Furthermore, there is an explicit (geometric) procedure to construct an irreducible representation *u*<sub>O</sub> corresponding to such an orbit, namely by the method of *geometric quantization*. We will not explain this method, which would require some reasonably advanced differential geometry, but instead we outline the connection between coadjoint orbits and the well-known *method of the highest weight*.

Let *G* be a compact connected Lie group and pick a maximal torus *T* ⊂ *G*. Let

$$W\_T = N(T)/T \tag{5.214}$$

be the corresponding *Weyl group*, where *N*(*T*) is the normalizer of *T* in *G* (i.e., *x* ∈ *N*(*T*) iff *xzx*<sup>−1</sup> ∈ *T* for each *z* ∈ *T*). Note that all maximal tori in compact connected Lie groups are conjugate, so that the specific choice of *T* is irrelevant.

For example, for *SU*(2) we take (5.213), in which case *N*(*T*) is generated by *T* and σ<sub>1</sub> ∈ *SU*(2), so that *W* ≅ S<sub>2</sub>, i.e., the permutation group on two variables. In general the Weyl group inherits the adjoint action of *N*(*T*) on *T*, so that *W<sub>T</sub>* acts on *T* and hence also acts on t and t<sup>∗</sup>; for *SU*(2) the action of the nontrivial element of *W<sub>T</sub>* (i.e., the image [σ<sub>1</sub>] of σ<sub>1</sub> ∈ *N*(*T*) in *N*(*T*)/*T*) on *T* is given by

$$[\sigma_1](\text{diag}(z,\overline{z})) = \text{diag}(\overline{z},z),\tag{5.215}$$

so that its action on T ≅ *T* is *z* ↦ z̄, which gives rise to actions *A* ↦ −*A* of *W<sub>T</sub>* on t and hence λ ↦ −λ of *W<sub>T</sub>* on t<sup>∗</sup>. This is a special case of the following bijection:

$$\mathfrak{g}^*/G \cong \mathfrak{t}^*/W_T,\tag{5.216}$$

where the *G*-action on g<sup>∗</sup> is the coadjoint one; globally, one has *G*/Ad(*G*) ≅ *T*/*W<sub>T</sub>*.

Indeed, for *SU*(2) the left-hand side of (5.216) is the set of spheres *S*<sup>2</sup><sub>*r*</sub> in R<sup>3</sup>, *r* ≥ 0, whereas the right-hand side is R/S<sub>2</sub> (where S<sub>2</sub> acts on R by θ ↦ −θ).
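The Weyl reflection (5.215) is a one-line matrix computation, sketched here for a sample element of the torus (5.213). As an added check that is not taken from the text, the sketch also conjugates by *i*σ<sub>1</sub>, which has determinant 1 and induces the same map on *T*; the sample phase and the helper names are arbitrary.

```python
import cmath

SIGMA1 = [[0, 1], [1, 0]]
I_SIGMA1 = [[0, 1j], [1j, 0]]          # i * sigma_1, determinant 1
I_SIGMA1_INV = [[0, -1j], [-1j, 0]]    # inverse of i * sigma_1


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def torus(z):
    """Element diag(z, conj(z)) of the maximal torus (5.213)."""
    return [[z, 0], [0, z.conjugate()]]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol
               for r1, r2 in zip(a, b) for x, y in zip(r1, r2))


z = cmath.exp(0.9j)
by_sigma1 = matmul(matmul(SIGMA1, torus(z)), SIGMA1)            # sigma_1 is its own inverse
by_i_sigma1 = matmul(matmul(I_SIGMA1, torus(z)), I_SIGMA1_INV)

reflection_ok = close(by_sigma1, torus(z.conjugate()))
same_action = close(by_sigma1, by_i_sigma1)
```

Both flags come out true, in line with the action *z* ↦ z̄ on T ≅ *T*.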

In general, a given coadjoint orbit O ⊂ g<sup>∗</sup> defines a Weyl group orbit O<sub>*W*</sub> in t<sup>∗</sup> as follows: O contains a point θ for which *T* ⊆ *G*<sub>θ</sub>, and we take O<sub>*W*</sub> to be the orbit through θ|<sub>t</sub>. Conversely, any *G*-invariant inner product on g induces a decomposition

$$
\mathfrak{g} = \mathfrak{t} \oplus \mathfrak{t}^{\perp},
\tag{5.217}
$$

which yields an extension of λ ∈ t<sup>∗</sup> to θ<sub>λ</sub> ∈ g<sup>∗</sup> that vanishes on t<sup>⊥</sup>. Let Λ ⊂ t<sup>∗</sup> be the set of integral elements in t<sup>∗</sup> (as explained after Definition 5.47). Elements of Λ are called *weights*. Theorem 5.51 below gives a parametrization

$$
\hat{G} \cong \Lambda/W_T,\tag{5.218}
$$

which, restricting (5.216) to the integral part Λ ⊂ t<sup>∗</sup>, implies Theorem 5.49.

Instead of working with the quotient Λ/*W<sub>T</sub>*, one may prefer to work with Λ itself, as follows: we say that λ ∈ t<sup>∗</sup> is *regular* if *w* · λ = λ for *w* ∈ *W<sub>T</sub>* iff *w* = *e*; this is the case iff λ = θ|<sub>t</sub> with *G*<sub>θ</sub> = *T*. For *SU*(2) all weights λ ∈ Z are regular except λ = 0. The set t<sup>∗</sup><sub>*r*</sub> of regular elements of t<sup>∗</sup> falls apart into connected components *C*, called *Weyl chambers*, which are mapped into each other by *W<sub>T</sub>*. For *SU*(2) one has t<sup>∗</sup><sub>*r*</sub> = (−∞, 0) ∪ (0, ∞), so that the Weyl chambers are (−∞, 0) and (0, ∞).

One picks an arbitrary Weyl chamber *C<sub>d</sub>* (for *SU*(2) this is (0, ∞)) and forms

$$
\Lambda_d = \Lambda \cap C_d^-,\tag{5.219}
$$

where *C<sub>d</sub>*<sup>−</sup> is the closure of *C<sub>d</sub>* in t<sup>∗</sup>. Elements of Λ<sub>*d*</sub> are called *dominant weights*. For each element of Λ/*W<sub>T</sub>* there is a unique dominant weight representing it in Λ, so that instead of (5.218) we may also write what Theorem 5.51 actually gives, viz.

$$
\hat{G} \cong \Lambda\_d. \tag{5.220}
$$

To explain this in some detail, we need further preparation. Any (unitary) representation *u* : *G* → *U*(*H*) on some finite-dimensional Hilbert space *H* restricts to *T*, and since *T* is abelian, we may simultaneously diagonalize all operators *u*(*z*), *z* ∈ *T*. The operators *iu*′(*A*), where *A* ∈ t, commute as well, so that we may decompose

$$H = \bigoplus_{\mu \in \Lambda_H} H_{\mu},\tag{5.221}$$

where Λ<sub>*H*</sub> ⊂ Λ contains the weights that occur in *u*|<sub>*T*</sub>, so that for each ψ ∈ *H*<sub>μ</sub>,

$$
u(z)\psi = \chi_{\mu}(z)\psi \ (z \in T); \tag{5.222}
$$

$$
iu'(Z)\psi = \mu(Z)\psi \ (Z \in \mathfrak{t}),\tag{5.223}
$$

where the character χ<sub>μ</sub> : *T* → T corresponding to the weight μ ∈ Λ is defined as in (5.212) with μ = (*n*<sub>1</sub>, ..., *n<sub>d</sub>*) and *z* = (*z*<sub>1</sub>, ..., *z<sub>d</sub>*) ∈ *T* ≅ T<sup>*d*</sup>, where *d* = dim(*T*). For example, we have seen that the irreducible representations *D<sub>j</sub>*(*SU*(2)) on *H<sub>j</sub>* ≅ C<sup>2*j*+1</sup> contain weights in Λ<sub>*j*</sub> = {−*j*, −*j* + 1, ..., *j* − 1, *j*}, where *j* ∈ N<sub>0</sub>/2.

In particular, take *H* = g<sub>C</sub> with some *G*-invariant inner product, cf. (5.148), and take *u* = Ad, given by Ad(*x*)*B* = *xBx*<sup>−1</sup>, so that Ad′(*A*)(*B*) = [*A*, *B*], extended from g to g<sub>C</sub>: we write g<sub>C</sub> = g + *i*g and hence put Ad′(*A*)(*B* + *iC*) = [*A*, *B*] + *i*[*A*, *C*], where *A*, *B*, *C* ∈ g. We assume that the inner product ⟨·, ·⟩ on g<sub>C</sub> is obtained from a real inner product on g by complexification. This inner product on g may be restricted to t ⊂ g and hence induces an inner product on t<sup>∗</sup>, also denoted by ⟨·, ·⟩. For example, if *G* is semi-simple (like *SU*(2)), one may take the inner product on g and hence on g<sub>C</sub> to be the Cartan–Killing form ⟨*A*, *B*⟩ = −½ Tr(Ad′(*A*)Ad′(*B*)), which is nondegenerate because *G* is semi-simple, and positive definite since *G* is compact. For *SU*(2) or *SO*(3) this gives the usual inner product on R<sup>3</sup> and C<sup>3</sup>.

Definition 5.50. *The* roots *of* g *are the* nonzero *weights of the adjoint representation u* = Ad *on H* = g<sub>C</sub>*. That is, writing* Δ ⊂ Λ *for the set of roots, we have* α ∈ Δ *iff* α : t → R *is not identically zero and there is some E*<sub>α</sub> ∈ g<sub>C</sub> *such that for each Z* ∈ t*,*

$$i[Z, E_{\alpha}] = \alpha(Z)E_{\alpha},\tag{5.224}$$

*cf.* (5.223)*. Furthermore, subject to the choice of a preferred Weyl chamber C<sub>d</sub> in* t<sup>∗</sup><sub>*r*</sub>*, we say* α ∈ Δ *is* positive*, denoted by* α ∈ Δ<sup>+</sup>*, if* ⟨α, λ⟩ > 0 *for each* λ ∈ *C<sub>d</sub>.*

Since ⟨α, λ⟩ is real and nonzero for each α ∈ Δ and λ ∈ *C<sub>d</sub>*, one has either α ∈ Δ<sup>+</sup> or −α ∈ Δ<sup>+</sup>, i.e., α ∈ Δ<sup>−</sup> = −Δ<sup>+</sup>. Since t is maximal abelian in g, it can also be shown that each root is nondegenerate. Writing g<sub>α</sub> = C · *E*<sub>α</sub>, this gives a decomposition

$$\mathfrak{g}_{\mathbb{C}} = \mathfrak{t}_{\mathbb{C}} \bigoplus_{\alpha \in \Delta^+} \mathfrak{g}_{\alpha} \bigoplus_{\alpha \in \Delta^-} \mathfrak{g}_{\alpha}. \tag{5.225}$$

For *G* = *SU*(2), the single generator of t is *S*<sub>3</sub>, and taking *E*<sub>±</sub> = *i*(*S*<sub>1</sub> ± *iS*<sub>2</sub>), we see from (5.180) that *i*[*S*<sub>3</sub>, *E*<sub>±</sub>] = ±*E*<sub>±</sub>. Hence the roots are α<sub>±</sub>, given by α<sub>±</sub>(*S*<sub>3</sub>) = ±1, and with (0, ∞) as the Weyl chamber of choice, the root α<sub>+</sub> is the positive one.
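The root computation for *SU*(2) can be confirmed directly with 2 × 2 matrices. The sketch below forms *E*<sub>±</sub> = *i*(*S*<sub>1</sub> ± *iS*<sub>2</sub>) from (5.174), again assuming the standard Pauli matrices for (5.42), and reads off the value α(*S*<sub>3</sub>) from (5.224). The helper `root_value` is a hypothetical name introduced for this illustration.

```python
# Pauli matrices in the standard convention (assumed to agree with (5.42)).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(a, b)]


def commutator(a, b):
    return add(matmul(a, b), scale(-1, matmul(b, a)))


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol
               for r1, r2 in zip(a, b) for x, y in zip(r1, r2))


S = [scale(-0.5j, s) for s in SIGMA]                 # (5.174)
E_plus = scale(1j, add(S[0], scale(1j, S[1])))       # i (S_1 + i S_2)
E_minus = scale(1j, add(S[0], scale(-1j, S[1])))     # i (S_1 - i S_2)


def root_value(E):
    """Return alpha in {+1, -1} with i [S_3, E] = alpha * E, or None."""
    lhs = scale(1j, commutator(S[2], E))
    for alpha in (1, -1):
        if close(lhs, scale(alpha, E)):
            return alpha
    return None


alpha_plus = root_value(E_plus)
alpha_minus = root_value(E_minus)
```

With these conventions *E*<sub>+</sub> and *E*<sub>−</sub> are the elementary raising and lowering matrices, and the two values are +1 and −1.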

We now define a partial ordering ≤ on Λ by putting μ ≤ λ iff λ − μ = ∑*<sup>i</sup> ni*α*<sup>i</sup>* for some *ni* <sup>∈</sup> <sup>N</sup><sup>0</sup> and <sup>α</sup>*<sup>i</sup>* <sup>∈</sup> <sup>Δ</sup>+. This brings us to the *theorem of the highest weight*:

Theorem 5.51. *Let* $G$ *be a connected compact Lie group. There is a parametrization* $\hat{G} \cong \Lambda_d$*, such that any unitary irreducible representation* $u_\lambda$ *of* $G$ *on* $H_\lambda$ *in the class* $\lambda \in \hat{G}$ *defined by a given dominant weight* $\lambda \in \Lambda_d$ *has the following properties:*

*1.* $H_\lambda$ *contains a unit vector* $\upsilon_\lambda$*, unique up to a phase, such that*

$$i u'_\lambda(Z) \upsilon_\lambda = \lambda(Z) \upsilon_\lambda \quad (Z \in \mathfrak{t}); \tag{5.226}$$

$$i u'_\lambda(E_\alpha) \upsilon_\lambda = 0 \quad (\alpha \in \Delta^+). \tag{5.227}$$

*2. Any other weight* $\mu$ *occurring in* $H_\lambda$*, cf.* (5.221)*, satisfies* $\mu \leq \lambda$ *and* $\mu \neq \lambda$*.*

The crucial point is that eqs. (5.226) - (5.227) imply

$$\theta_\lambda(A) = i \langle \upsilon_\lambda, u'_\lambda(A) \upsilon_\lambda \rangle \quad (A \in \mathfrak{g}), \tag{5.228}$$

where $\theta_\lambda \in \mathfrak{g}^*$ was defined after (5.217) by $\lambda \in \Lambda_d \subset \mathfrak{t}^*$. Since each operator $u_\lambda(x)$ is unitary, each vector $u_\lambda(x)\upsilon_\lambda$ is a unit vector, so we may form the $G$-orbit

$$\mathcal{O}'_\lambda = \{ |u_\lambda(x)\upsilon_\lambda\rangle\langle u_\lambda(x)\upsilon_\lambda|,\ x \in G \} \tag{5.229}$$

through $|\upsilon_\lambda\rangle\langle\upsilon_\lambda|$ in the space $\mathcal{P}_1(H_\lambda)$ of all one-dimensional projections on $H_\lambda$. Denoting the coadjoint orbit $G \cdot \theta_\lambda \subset \mathfrak{g}^*$ by $\mathcal{O}_\lambda$, where $\lambda = (\theta_\lambda)_{|\mathfrak{t}}$, the map

$$
x \cdot \theta_\lambda \mapsto |u_\lambda(x)\upsilon_\lambda\rangle\langle u_\lambda(x)\upsilon_\lambda|,\tag{5.230}
$$

is a $G$-equivariant diffeomorphism (in fact, a symplectomorphism) from $\mathcal{O}_\lambda$ to $\mathcal{O}'_\lambda$. This amplifies Theorem 5.49 by making the bijective correspondence between the set $\Lambda_d$ of dominant weights and the set of integral coadjoint orbits explicit.

#### 5.10 Symmetry groups and projective representations

Despite the power and beauty of unitary group representations in *mathematics*, in the context of e.g. Wigner's Theorem we have seen that in *physics* one should look at homomorphisms $x \mapsto W(x)$, where $W(x)$ is a symmetry of $\mathcal{P}_1(H)$. In view of Theorems 5.4, this is equivalent to considering a *single* homomorphism $h : G \to \mathcal{G}^H$, cf. (5.136). To simplify the discussion, we now drop $U_a(H)$ from consideration and just deal with the connected component $\mathcal{G}^H_0 = U(H)/\mathbb{T}$ of the identity. This restriction may be justified by noting that in what follows we will only deal with symmetries given by *connected Lie groups*, which have the property that each element is a product of squares $x = y^2$. In that case, $h(x) = h(y)^2$ is always a square and hence it cannot lie in the component $U_a(H)/\mathbb{T}$ (the anti-unitary case does play a role as soon as *discrete* symmetries are studied, such as time inversion, parity, or charge conjugation). Thus in what follows we will study continuous homomorphisms

$$h: G \to U(H)/\mathbb{T},\tag{5.231}$$

where *U*(*H*)/T has the quotient topology inherited from the strong operator topology on *U*(*H*), as explained above. Since it is inconvenient to deal with such a quotient, we try to lift *h* to some map (5.137) where, in terms of the canonical projection

$$
\pi: U(H) \to U(H)/\mathbb{T},\tag{5.232}
$$

which is evidently a group homomorphism, we have

$$
\pi \circ u = h.\tag{5.233}
$$

This can be done by choosing a cross-section *s* of π, that is, a measurable map

$$s: U(H)/\mathbb{T} \to U(H),\tag{5.234}$$

or (this doesn't matter much) a map *s* : *h*(*G*)/T → *U*(*H*), such that

$$
\pi \circ s = \mathrm{id}.\tag{5.235}
$$

Given *h*, such a cross-section *s* yields a map *u* : *G* → *U*(*H*) through

$$
u = s \circ h;\tag{5.236}
$$

in particular, π(*u*(*x*)) = *h*(*x*). Such a lift often loses the homomorphism property, *though in a controlled way*, as follows. Since different choices of *s* must differ by a phase, and *h* is a homomorphism of groups, there must be a function

$$c: G \times G \to \mathbb{T} \tag{5.237}$$

such that

$$
u(x)u(y) = c(x, y)u(xy) \quad (x, y \in G).\tag{5.238}
$$


Indeed, since π and *h* are homomorphisms, we may compute

$$\begin{aligned} \pi(u(x)u(y)u(xy)^{-1}) &= \pi(s(h(x)))\,\pi(s(h(y)))\,\pi(s(h(xy)))^{-1} \\ &= h(x)h(y)h(xy)^{-1} = h(xy)h(xy)^{-1} = h(e_G) = e_{U(H)/\mathbb{T}}. \end{aligned}$$

Hence $u(x)u(y)u(xy)^{-1} \in \pi^{-1}(e_{U(H)/\mathbb{T}}) = \mathbb{T} \cdot 1_H$, which yields (5.238), or, more directly,

$$c(x, y) \cdot 1_H = u(x)u(y)u(xy)^*. \tag{5.239}$$

Associativity of multiplication in *G* and the homomorphism property of *h* yield

$$c(\mathbf{x}, \mathbf{y})c(\mathbf{x}\mathbf{y}, \mathbf{z}) = c(\mathbf{x}, \mathbf{y}\mathbf{z})c(\mathbf{y}, \mathbf{z}),\tag{5.240}$$

and if we impose the natural requirement $u(e) = 1_H$, we also have

$$c(e, \mathbf{x}) = c(\mathbf{x}, e) = 1.\tag{5.241}$$

Definition 5.52. *A function* $c : G \times G \to \mathbb{T}$ *satisfying* (5.240) *and* (5.241) *is called a* multiplier *or* 2-cocycle *on* $G$ *(in the topological case one requires* $c$ *to be Borel measurable, and for Lie groups it should in addition be smooth* near the identity*). The set of such multipliers, seen as an abelian group under (pointwise) operations in* $\mathbb{T}$*, is denoted by* $Z^2(G, \mathbb{T})$*. If* $c$ *takes the form*

$$c(\mathbf{x}, \mathbf{y}) = \frac{b(\mathbf{x}\mathbf{y})}{b(\mathbf{x})b(\mathbf{y})},\tag{5.242}$$

*where* $b : G \to \mathbb{T}$ *satisfies* $b(e) = 1$ *(and is measurable and smooth near* $e$ *as appropriate), then* $c$ *is called a* 2-coboundary *or an* exact multiplier*. The set of exact multipliers forms a (normal) subgroup* $B^2(G, \mathbb{T})$ *of* $Z^2(G, \mathbb{T})$*, and the quotient*

$$H^2(G, \mathbb{T}) = \frac{Z^2(G, \mathbb{T})}{B^2(G, \mathbb{T})} \tag{5.243}$$

*is called the* second cohomology group *of G with coefficients in* T*.*

The reason 2-coboundaries and the ensuing group $H^2(G, \mathbb{T})$ are interesting for our problem is as follows. Given a map $x \mapsto u(x)$ from $G$ to $U(H)$ with (5.238), suppose we change $u(x)$ to $u'(x) = b(x)u(x)$. The associated multiplier then changes to

$$c'(x, y) = \frac{b(x)b(y)}{b(xy)}c(x, y),\tag{5.244}$$

in that $u'(x)u'(y) = c'(x, y)u'(xy)$. In particular, a multiplier of the form (5.242) may be removed by such a transformation, and is accordingly called *exact*.
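As a finite toy illustration of (5.238) to (5.244), one may take $G = \mathbb{Z}_2 \times \mathbb{Z}_2$ acting on $H = \mathbb{C}^2$ through Pauli matrices. This example is not taken from the text: the lift $u(a,b) = X^a Z^b$ and the phase function `b` below are hypothetical choices made for the sketch. The code reads off the multiplier from (5.239), checks (5.240) and (5.241), and verifies the transformation rule (5.244).

```python
from itertools import product

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def scale(z, a):
    return [[z * a[i][j] for j in range(2)] for i in range(2)]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]      # Pauli sigma_x
Z = [[1, 0], [0, -1]]     # Pauli sigma_z

G = list(product((0, 1), repeat=2))   # Klein four-group Z_2 x Z_2

def gmul(x, y):
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)

# Hypothetical lift u(a, b) = X^a Z^b of a projective representation.
u = {(0, 0): I2, (1, 0): X, (0, 1): Z, (1, 1): mul(X, Z)}

def multiplier(lift):
    """Read off c(x, y) from (5.239): c(x, y) 1_H = u(x) u(y) u(xy)^*."""
    table = {}
    for x, y in product(G, repeat=2):
        m = mul(mul(lift[x], lift[y]), dagger(lift[gmul(x, y)]))
        # m must be a multiple of the identity
        assert abs(m[0][1]) + abs(m[1][0]) + abs(m[0][0] - m[1][1]) < 1e-12
        table[x, y] = m[0][0]
    return table

c = multiplier(u)
e = (0, 0)

# Cocycle identity (5.240) and normalization (5.241).
cocycle_ok = all(
    abs(c[x, y] * c[gmul(x, y), z] - c[x, gmul(y, z)] * c[y, z]) < 1e-12
    for x, y, z in product(G, repeat=3)
)
unit_ok = all(abs(c[e, x] - 1) < 1e-12 and abs(c[x, e] - 1) < 1e-12 for x in G)

# Change of phase u'(x) = b(x) u(x) with an arbitrary b satisfying b(e) = 1.
b = {(0, 0): 1, (1, 0): 1j, (0, 1): -1, (1, 1): -1j}
u_new = {x: scale(b[x], u[x]) for x in G}
c_new = multiplier(u_new)
transform_ok = all(
    abs(c_new[x, y] - b[x] * b[y] / b[gmul(x, y)] * c[x, y]) < 1e-12
    for x, y in product(G, repeat=2)
)

# For abelian G the ratio c(x, y) / c(y, x) is unchanged by (5.244).
x0, y0 = (1, 0), (0, 1)
ratio_old = c[x0, y0] / c[y0, x0]
ratio_new = c_new[x0, y0] / c_new[y0, x0]
print(cocycle_ok, unit_ok, transform_ok, ratio_old, ratio_new)
```

Since a coboundary of the form (5.242) on an abelian group is symmetric in $x$ and $y$, the ratio $c(x_0,y_0)/c(y_0,x_0) = -1$ indicates that this particular multiplier cannot be removed by any change of phase.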

Proposition 5.53. *If H*2(*G*,T) *is trivial, then any multiplier can be removed by modifying the lift u of h, and the ensuing map u* : *G* → *U*(*H*) *is a homomorphism and hence a unitary representation of G on H. In that case, any homomorphism G* → *U*(*H*)/T *comes from a unitary representation u* : *G* → *U*(*H*) *through* (5.233)*.*

This is true by construction. By the same token, if $H^2(G, \mathbb{T})$ is non-trivial, then $G$ will have projective representations that cannot be turned into ordinary ones by a change of phase (for it can be shown that any multiplier $c \in Z^2(G, \mathbb{T})$ is realized by some projective representation). Thus it is important to compute $H^2(G, \mathbb{T})$ for any given (physically relevant) group $G$, and see what can be done if it is non-trivial.

To this end we present the main results of practical use. In order to state one of the main results (Whitehead's Lemma), we need to set up a cohomology theory for $\mathfrak{g}$ (which we only need with trivial coefficients). Let $C^k(\mathfrak{g}, \mathbb{R})$ be the abelian group of all $k$-linear totally antisymmetric maps $\varphi : \mathfrak{g}^k \to \mathbb{R}$, with *coboundary maps*

$$\delta^{(k)} : C^k(\mathfrak{g}, \mathbb{R}) \to C^{k+1}(\mathfrak{g}, \mathbb{R});\tag{5.245}$$

$$\delta^{(k)}\varphi(X_0, X_1, \dots, X_k) = \sum_{i<j} (-1)^{i+j} \varphi([X_i, X_j], X_0, \dots, \hat{X}_i, \dots, \hat{X}_j, \dots, X_k), \tag{5.246}$$

where the hat means that the corresponding entry is omitted. For example, we have

$$\begin{aligned} \delta^{(1)}\varphi(X_0, X_1) &= -\varphi([X_0, X_1]);\\ \delta^{(2)}\varphi(X_0, X_1, X_2) &= -\varphi([X_0, X_1], X_2) + \varphi([X_0, X_2], X_1) - \varphi([X_1, X_2], X_0). \end{aligned}$$

These maps satisfy "$\delta^2 = 0$", or, more precisely,

$$
\delta^{(k+1)} \circ \delta^{(k)} = 0,\tag{5.247}
$$

and hence we may define the following abelian groups:

$$B^k(\mathfrak{g}, \mathbb{R}) = \operatorname{ran}(\delta^{(k-1)});\tag{5.248}$$

$$Z^k(\mathfrak{g}, \mathbb{R}) = \ker(\delta^{(k)});\tag{5.249}$$

$$H^k(\mathfrak{g}, \mathbb{R}) = \frac{Z^k(\mathfrak{g}, \mathbb{R})}{B^k(\mathfrak{g}, \mathbb{R})}. \tag{5.250}$$
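The explicit formulas for $\delta^{(1)}$ and $\delta^{(2)}$ can be tried out on a concrete Lie algebra. The sketch below assumes $\mathfrak{g} = \mathfrak{so}(3)$ with basis $e_0, e_1, e_2$ and brackets $[e_i, e_j] = \sum_k \varepsilon_{ijk} e_k$ (an illustrative choice, with cochains stored by their values on basis elements), and checks that $\delta^{(2)} \circ \delta^{(1)} = 0$ on a sample 1-cochain.

```python
from itertools import product

N = 3  # dimension of so(3)

def eps(i, j, k):
    """Levi-Civita symbol on indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def bracket(i, j):
    """Coefficients of [e_i, e_j] = sum_k eps(i, j, k) e_k."""
    return [eps(i, j, k) for k in range(N)]

def delta1(theta):
    """theta: list of values theta(e_k).
    Returns the matrix phi[i][j] = -theta([e_i, e_j])."""
    return [[-sum(bracket(i, j)[k] * theta[k] for k in range(N))
             for j in range(N)] for i in range(N)]

def delta2(phi):
    """phi: antisymmetric matrix phi[i][j] = phi(e_i, e_j).
    Returns the values of delta^(2) phi on all basis triples."""
    def phi_br(a, b, c):
        # phi([e_a, e_b], e_c), using linearity in the first slot
        return sum(bracket(a, b)[k] * phi[k][c] for k in range(N))
    return {
        (a, b, c): -phi_br(a, b, c) + phi_br(a, c, b) - phi_br(b, c, a)
        for a, b, c in product(range(N), repeat=3)
    }

theta = [2, -5, 7]                 # arbitrary 1-cochain
phi = delta1(theta)                # a 2-coboundary
antisymmetric = all(phi[i][j] == -phi[j][i] for i in range(N) for j in range(N))
closed = all(v == 0 for v in delta2(phi).values())
print(antisymmetric, closed)
```

The vanishing of `delta2(delta1(theta))` is the Jacobi identity in disguise; swapping in the structure constants of another Lie algebra in `bracket` should leave both checks intact.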

Note that $B^k(\mathfrak{g}, \mathbb{R}) \subseteq Z^k(\mathfrak{g}, \mathbb{R})$ because of (5.247). In particular, for $k = 2$ the group $Z^2(\mathfrak{g}, \mathbb{R})$ of all *2-cocycles* on $\mathfrak{g}$ consists of all bilinear maps $\varphi : \mathfrak{g} \times \mathfrak{g} \to \mathbb{R}$ that satisfy

$$
\varphi(X,Y) = -\varphi(Y,X);
\tag{5.251}
$$

$$
\varphi(X, [Y, Z]) + \varphi(Z, [X, Y]) + \varphi(Y, [Z, X]) = 0,\tag{5.252}
$$

and its subgroup $B^2(\mathfrak{g}, \mathbb{R})$ of all *2-coboundaries* comprises all $\varphi$ taking the form

$$\varphi(X,Y) = \theta([X,Y]), \quad \theta \in \mathfrak{g}^*.\tag{5.253}$$

For example, for $\mathfrak{g} = \mathbb{R}$ any antisymmetric bilinear map $\varphi : \mathbb{R}^2 \to \mathbb{R}$ is zero, so that

$$H^2(\mathbb{R}, \mathbb{R}) = 0.\tag{5.254}$$

This has nothing to do with the fact that the Lie bracket on $\mathfrak{g}$ vanishes. Indeed, $\mathfrak{g} = \mathbb{R}^2$ does admit a nontrivial 2-cocycle, unique up to a scalar multiple, given by (half) the symplectic form, i.e.,

$$
\varphi_0((p,q),(p',q')) = \frac{1}{2}(pq'-qp').\tag{5.255}
$$

Since $B^2(\mathbb{R}^2, \mathbb{R}) = 0$, this cannot be removed, hence (5.255) generates $H^2(\mathbb{R}^2, \mathbb{R})$:

$$H^2(\mathbb{R}^2, \mathbb{R}) \cong \mathbb{R}.\tag{5.256}$$

As far as cohomology is concerned, each Lie group and each Lie algebra has its own story, although in some cases a group of stories may be collected into a single narrative. As a case in point, a Lie algebra $\mathfrak{g}$ is called *simple* when it is non-abelian and has no nonzero proper ideals, and *semi-simple* when it has no nonzero commutative ideals. A Lie algebra is semi-simple iff it is a direct sum of simple Lie algebras. If a Lie group $G$ is (semi-) simple, then so is its Lie algebra $\mathfrak{g}$. A basic result, often called *Whitehead's Lemma*, is:

Lemma 5.54. *If* $\mathfrak{g}$ *is semi-simple, then* $H^2(\mathfrak{g}, \mathbb{R}) = 0$*.*

*Proof.* The key point is that $C^k(\mathfrak{g}, \mathbb{R})$ is a $\mathfrak{g}$-module under the action

$$(X\_0 \cdot \varphi)(X\_1, \dots, X\_k) = -\sum\_{i=1}^k \varphi(X\_1, \dots, [X\_0, X\_i], \dots, X\_k). \tag{5.257}$$

For $k = 2$, a simple computation shows that

$$\begin{split} (X_0 \cdot \varphi)(X_1, X_2) &= -\varphi([X_0, X_1], X_2) - \varphi(X_1, [X_0, X_2]) \\ &= \delta^{(2)} \varphi(X_0, X_1, X_2) + \delta^{(1)} \varphi(X_0, -)(X_1, X_2), \end{split} \tag{5.258}$$

where at fixed $X_0$, the map $\varphi(X_0, -)$ is seen as an element of $C^1(\mathfrak{g}, \mathbb{R})$. This shows that $\mathfrak{g}$ maps both $B^2(\mathfrak{g}, \mathbb{R})$ and $Z^2(\mathfrak{g}, \mathbb{R})$ into themselves. Indeed, if $\varphi = \delta^{(1)}\chi$, then the first term in (5.258) vanishes because $\delta^{(2)} \circ \delta^{(1)} = 0$, cf. (5.247), so that the right-hand side of (5.258) takes the form $\delta^{(1)}(\cdots)$ and hence lies in $B^2(\mathfrak{g}, \mathbb{R})$. Similarly, if $\delta^{(2)}\varphi = 0$, then $\delta^{(2)}(X_0 \cdot \varphi) = 0$. We now use the fact that if $\mathfrak{g}$ is semi-simple, then any finite-dimensional module is completely reducible. Consequently, as a $\mathfrak{g}$-module, $Z^2(\mathfrak{g}, \mathbb{R})$ must decompose as $Z^2(\mathfrak{g}, \mathbb{R}) = B^2(\mathfrak{g}, \mathbb{R}) \oplus V$, where $V$ is some $\mathfrak{g}$-module. Hence if $\varphi \in V$, then $X_0 \cdot \varphi \in V$. Since $\varphi \in Z^2(\mathfrak{g}, \mathbb{R})$, the first term in (5.258) vanishes, whilst the second term lies in $B^2(\mathfrak{g}, \mathbb{R})$. Since $V \cap B^2(\mathfrak{g}, \mathbb{R}) = \{0\}$, we therefore have $X_0 \cdot \varphi = 0$, and hence $\delta^{(1)}\varphi(X_0, -)(X_1, X_2) = 0$, which gives $\varphi(X_0, [X_1, X_2]) = 0$ for all $X_0, X_1, X_2 \in \mathfrak{g}$. At this point we use another implication of the semi-simplicity of $\mathfrak{g}$, namely $[\mathfrak{g}, \mathfrak{g}] = \mathfrak{g}$. It follows that $\varphi = 0$, whence $V = \{0\}$, from which $Z^2(\mathfrak{g}, \mathbb{R}) = B^2(\mathfrak{g}, \mathbb{R})$, or, in other words, $H^2(\mathfrak{g}, \mathbb{R}) = 0$. $\square$
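For small Lie algebras, $\dim H^2(\mathfrak{g}, \mathbb{R}) = \dim Z^2 - \dim B^2$ can be computed by plain linear algebra from the structure constants. The following sketch does this with exact rational arithmetic; the function names and the choice of test algebras ($\mathfrak{so}(3)$ with $[e_i,e_j] = \sum_k \varepsilon_{ijk}e_k$, and the abelian algebras $\mathbb{R}$ and $\mathbb{R}^2$) are illustrative assumptions, not part of the text.

```python
from fractions import Fraction
from itertools import combinations

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(v) for v in r] for r in rows]
    ncols = len(m[0]) if m else 0
    rk = 0
    for col in range(ncols):
        piv = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

def dim_H2(n, bracket):
    """dim H^2(g, R) for an n-dimensional Lie algebra.
    bracket(i, j) returns the coefficients of [e_i, e_j] in the basis e_0..e_{n-1}."""
    pairs = list(combinations(range(n), 2))     # basis of C^2: phi(e_i, e_j), i < j
    triples = list(combinations(range(n), 3))   # basis of C^3
    col = {p: idx for idx, p in enumerate(pairs)}

    # delta^(1): theta -> phi with phi(e_i, e_j) = -theta([e_i, e_j]).
    d1 = [[-bracket(i, j)[k] for k in range(n)] for (i, j) in pairs]

    def add_term(row, sign, a, b, c):
        # add sign * phi([e_a, e_b], e_c) to row, using antisymmetry of phi
        for k, coeff in enumerate(bracket(a, b)):
            if k == c or coeff == 0:
                continue
            if k < c:
                row[col[(k, c)]] += sign * coeff
            else:
                row[col[(c, k)]] -= sign * coeff

    # delta^(2) on basis triples, following the explicit formula in the text.
    d2 = []
    for (a, b, c) in triples:
        row = [0] * len(pairs)
        add_term(row, -1, a, b, c)
        add_term(row, +1, a, c, b)
        add_term(row, -1, b, c, a)
        d2.append(row)

    dim_Z2 = len(pairs) - rank(d2)   # kernel of delta^(2)
    dim_B2 = rank(d1)                # range of delta^(1)
    return dim_Z2 - dim_B2

def so3(i, j):
    return [(i - j) * (j - k) * (k - i) // 2 for k in range(3)]

def abelian(n):
    return lambda i, j: [0] * n

print(dim_H2(3, so3), dim_H2(1, abelian(1)), dim_H2(2, abelian(2)))
```

Under these assumptions the three values agree with Lemma 5.54 for $\mathfrak{so}(3)$ and with (5.254) and (5.256) for $\mathbb{R}$ and $\mathbb{R}^2$.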

Theorem 5.55. *Let G be a connected and simply connected Lie group. Then*

$$H^2(G, \mathbb{T}) \cong H^2(\mathfrak{g}, \mathbb{R}). \tag{5.259}$$

*Proof.* This is really a conjunction of two isomorphisms:

$$H^2(G, \mathbb{T}) \cong H^2(G, \mathbb{R});\tag{5.260}$$

$$H^2(G, \mathbb{R}) \cong H^2(\mathfrak{g}, \mathbb{R}),\tag{5.261}$$

where R is the usual additive group, and *Z*2(*G*,R), *B*2(*G*,R), and hence *H*2(*G*,R) are defined analogously to *Z*2(*G*,T) etc. The first isomorphism is simply induced by

$$Z^2(G, \mathbb{R}) \to Z^2(G, \mathbb{T});\tag{5.262}$$

$$
\Gamma(x, y) \mapsto e^{i\Gamma(x, y)} \equiv c(x, y), \tag{5.263}
$$

which preserves exactness and induces an isomorphism in cohomology (but note that (5.262) - (5.263) may not itself define an isomorphism).

The isomorphism (5.261) is induced at the cochain level, too. Given a cocycle $\varphi \in Z^2(\mathfrak{g}, \mathbb{R})$, we construct a new Lie algebra $\mathfrak{g}_\varphi$ (called a *central extension* of $\mathfrak{g}$) by taking $\mathfrak{g}_\varphi = \mathfrak{g} \oplus \mathbb{R}$ as a vector space, equipped though with the unusual bracket

$$[(X, v), (Y, w)] = ([X, Y], \varphi(X, Y));\tag{5.264}$$

the condition $\varphi \in Z^2(\mathfrak{g}, \mathbb{R})$ guarantees that this is a Lie bracket. Furthermore, $\mathfrak{g}_\varphi$ is isomorphic (as a Lie algebra) to a direct sum iff $\varphi \in B^2(\mathfrak{g}, \mathbb{R})$; indeed, if (5.253) holds, then $(X, v) \mapsto (X, v - \theta(X))$ yields the desired isomorphism $\mathfrak{g}_\varphi \to \mathfrak{g} \oplus \mathbb{R}$.

By Lie's Third Theorem, there is a connected and simply connected Lie group $G_\varphi$ (again called a *central extension* of $G$), with Lie algebra $\mathfrak{g}_\varphi$. As a manifold, $G_\varphi = G \times \mathbb{R}$, but the group laws are given, in terms of a function $\Gamma : G \times G \to \mathbb{R}$, by

$$(x, v) \cdot (y, w) = (xy, v + w + \Gamma(x, y));\tag{5.265}$$

$$(x, v)^{-1} = (x^{-1}, -v - \Gamma(x, x^{-1})).\tag{5.266}$$

The group axioms then imply (indeed, they are equivalent to) the condition $\Gamma \in Z^2(G, \mathbb{R})$. Furthermore, two such extensions $G_\varphi$ and $G_{\varphi'}$ are isomorphic iff the corresponding cocycles $\Gamma$ and $\Gamma'$ are related by (5.244) in additive form, and in particular, $\Gamma \in B^2(G, \mathbb{R})$ iff $G_\varphi$ is isomorphic (as a Lie group) to a direct product $G \times \mathbb{R}$, which in turn is the case iff $\varphi \in B^2(\mathfrak{g}, \mathbb{R})$. Conversely, given $\Gamma \in Z^2(G, \mathbb{R})$, we define the central extension $G_\varphi$ by (5.265) - (5.266), to find that the associated Lie algebra $\mathfrak{g}_\varphi$ takes the above form, defining $\varphi \in Z^2(\mathfrak{g}, \mathbb{R})$ through (5.264). Explicitly,

$$\varphi(X,Y) = \frac{d}{ds}\frac{d}{dt}\left[\Gamma\left(e^{tX}, e^{sY}\right)\right]_{|s=t=0} - (X \leftrightarrow Y). \tag{5.267}$$

Lie's Third Theorem thus implies that the map $\varphi \leftrightarrow \Gamma$ (which is not necessarily a bijection) descends to an isomorphism $H^2(\mathfrak{g}, \mathbb{R}) \to H^2(G, \mathbb{R})$ in cohomology. $\square$

Given (5.254), Theorem 5.55 immediately gives

$$H^2(\mathbb{R}, \mathbb{T}) = 0.\tag{5.268}$$

In particular, if R is the relevant symmetry group, which is the case e.g. with time translation, by Proposition 5.53 we may restrict ourselves to unitary representations.

Once again, this has nothing to do with abelianness or topological triviality of $\mathbb{R}$. Indeed, for $G = \mathfrak{g} = \mathbb{R}^2$, the *Heisenberg cocycle* (5.255) comes from the multiplier

$$c_0((p,q),(p',q')) = e^{i(pq'-qp')/2},\tag{5.269}$$

where $\mathbb{R}^2$ is seen as the group of translations in the *phase space* $\mathbb{R}^2$ of a particle moving on $\mathbb{R}$. Accordingly, this multiplier is realized by the following projective representation of $\mathbb{R}^2$ on $L^2(\mathbb{R})$:

$$
u(p,q)\psi(x) = e^{-ipq/2}e^{ixp}\psi(x-q). \tag{5.270}
$$
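That (5.270) indeed has multiplier (5.269) can be checked pointwise on a sample wave function. The sketch below is only a numerical spot check: the test function `psi`, the group elements `a`, `b` and the sample points are arbitrary choices made here, and operators are modelled as maps from functions to functions.

```python
import cmath
import math

def u(p, q):
    """Operator (5.270): (u(p, q) psi)(x) = exp(-ipq/2) exp(ixp) psi(x - q)."""
    def apply(psi):
        return lambda x: (cmath.exp(-1j * p * q / 2)
                          * cmath.exp(1j * x * p) * psi(x - q))
    return apply

def c0(a, b):
    """Multiplier (5.269) evaluated on a = (p, q), b = (p', q')."""
    (p, q), (p2, q2) = a, b
    return cmath.exp(1j * (p * q2 - q * p2) / 2)

def psi(x):
    # arbitrary smooth, rapidly decaying test function
    return math.exp(-x * x / 2) * (1 + 0.3 * x)

a, b = (0.7, -1.2), (-0.4, 2.5)
points = [-2.0, -0.5, 0.0, 1.3, 3.1]

# Check u(a) u(b) = c0(a, b) u(a + b), i.e. (5.238) for this representation.
lhs = u(*a)(u(*b)(psi))
rhs = u(a[0] + b[0], a[1] + b[1])(psi)
max_err = max(abs(lhs(x) - c0(a, b) * rhs(x)) for x in points)

# The two orders of composition differ by the phase c0(a, b) / c0(b, a).
swapped = u(*b)(u(*a)(psi))
max_err_swap = max(abs(lhs(x) - c0(a, b) / c0(b, a) * swapped(x)) for x in points)
print(max_err, max_err_swap)
```

Both errors should vanish up to rounding; the second check shows that translations in $p$ and in $q$ fail to commute by the phase $e^{i(pq'-qp')}$, which is what prevents the multiplier from being removed.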

If $\mathbb{R}^2$ is the *configuration space* of some particle, and the group $\mathbb{R}^2$ produces translations in the latter (i.e., of *position*), then the appropriate unitary representation would rather be on $L^2(\mathbb{R}^2)$ and would have trivial multiplier, viz.

$$
u(q_1, q_2)\psi(x_1, x_2) = \psi(x_1 - q_1, x_2 - q_2). \tag{5.271}
$$

Similarly, $G = \mathbb{R}^2$, now seen as generating translations of *momentum* in the phase space $\mathbb{R}^4$ of the latter example, would appropriately be represented on $L^2(\mathbb{R}^2)$ as

$$
u(q_1, q_2)\psi(x_1, x_2) = e^{i(x_1q_1 + x_2q_2)}\psi(x_1, x_2). \tag{5.272}
$$

Corollary 5.56. *Let G be a connected and simply connected semi-simple Lie group. Then H*2(*G*,T) *is trivial.*

Here we say that a Lie group is *simple* when it has no nontrivial proper *connected* normal subgroups, and *semi-simple* if it has no nontrivial proper connected normal *abelian* subgroups. For example, the "classical Lie groups" of Weyl are semi-simple, including $SO(3)$ and $SU(2)$, which are even simple (note that the latter does have a *discrete* normal subgroup, namely its center $\{\pm 1_2\} \cong \mathbb{Z}_2$). Also, products of simple Lie groups are semi-simple. However, Corollary 5.56 does not apply to $SO(3)$, which is semi-simple but not simply connected. Here the relevant general result is:

Theorem 5.57. *Let G be a connected Lie group with H*2(g,R) = <sup>0</sup>*. Then*

$$H^2(G, \mathbb{T}) \cong \widehat{\pi\_1(G)}.\tag{5.273}$$

We need some background (cf. §C.15). For any abelian (topological) group *A*, the set

$$
\hat{A} = \text{Hom}(A, \mathbb{T}) \tag{5.274}
$$

consists of all (continuous) homomorphisms (also called *characters*) χ : *A* → T; these are just the irreducible (and hence necessarily one-dimensional) unitary representations of *A*. This set is a group under the obvious pointwise operations

$$
\chi_1\chi_2(a) = \chi_1(a)\chi_2(a);\tag{5.275}
$$

$$\chi^{-1}(a) = \chi(a)^{-1}.\tag{5.276}$$

As such, the group $\hat{A}$ is called the *(Pontryagin) dual* of $A$; the *Pontryagin Duality Theorem* states that $\hat{\hat{A}} \cong A$. Using Theorem 5.57 and Theorem 5.41, this gives

$$H^2(SO(3), \mathbb{T}) = \mathbb{Z}\_2. \tag{5.277}$$
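The last step uses that the dual of $\mathbb{Z}_2$ is again $\mathbb{Z}_2$. For a finite cyclic group this can be made concrete: a character of $\mathbb{Z}_n$ is fixed by its value on the generator, which must be an $n$-th root of unity. The sketch below enumerates these characters and checks the group law (5.275); the function names are illustrative.

```python
import cmath

def characters(n):
    """Characters of the cyclic group Z_n = {0, ..., n-1} (addition mod n).

    chi(1) must be an n-th root of unity, so chi_k(a) = exp(2 pi i k a / n)
    for k = 0, ..., n-1 exhausts all characters.
    """
    return [lambda a, k=k: cmath.exp(2j * cmath.pi * k * a / n) for k in range(n)]

def is_character(chi, n, tol=1e-12):
    """Check the homomorphism property chi(a + b) = chi(a) chi(b)."""
    return all(abs(chi((a + b) % n) - chi(a) * chi(b)) < tol
               for a in range(n) for b in range(n))

def product_index(j, k, n, tol=1e-12):
    """Index l with chi_j chi_k = chi_l under the pointwise product (5.275)."""
    chars = characters(n)
    for l, chi in enumerate(chars):
        if all(abs(chars[j](a) * chars[k](a) - chi(a)) < tol for a in range(n)):
            return l
    raise ValueError("product is not in the list")

dual_Z2 = characters(2)
sign = dual_Z2[1]          # the nontrivial character a -> (-1)^a
print(len(dual_Z2), sign(1), product_index(1, 1, 2))
```

For $n = 2$ the dual thus consists of the trivial character and the sign character, and the latter squares to the former, so the dual is a group of order two.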

We now use Theorem 5.41 as a lemma to prove Theorem 5.57:

*Proof.* We first state the map $\widehat{\pi_1(G)} \to H^2(G, \mathbb{T})$ that will turn out to be an isomorphism. Assuming Theorem 5.41, pick a (Borel measurable) cross-section

$$
\tilde{s}: G \to \tilde{G} \tag{5.278}
$$

of the canonical projection

$$
\tilde{\pi} : \tilde{G} \to G = \tilde{G}/D. \tag{5.279}
$$

As always, this means that $\tilde{\pi} \circ \tilde{s} = \mathrm{id}_G$, and $\tilde{s}$ is supposed to be smooth near the identity, and chosen such that $\tilde{s}(e_G) = e_{\tilde{G}}$, where $e_G$ and $e_{\tilde{G}}$ are the unit elements of $G$ and $\tilde{G}$, respectively. Given a character $\chi \in \widehat{\pi_1(G)}$, define $c_\chi : G \times G \to \mathbb{T}$ by

$$c_\chi(x, y) = \chi(\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1}).\tag{5.280}$$

This makes sense: $\tilde{\pi}$ is a homomorphism, so that (cf. the computation below (5.238))

$$\tilde{\pi}(\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1}) = \tilde{\pi}(\tilde{s}(x))\,\tilde{\pi}(\tilde{s}(y))\,\tilde{\pi}(\tilde{s}(xy))^{-1} = xy(xy)^{-1} = e_G,$$

and hence $\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1} \in \ker(\tilde{\pi}) = D$ (where we identify $D$ with $\pi_1(G)$, cf. Theorem 5.41). Furthermore, tedious computations show that (5.240) and (5.241) hold, so that $c_\chi \in Z^2(G, \mathbb{T})$. Different choices of $\tilde{s}$ lead to equivalent 2-cocycles $c_\chi$, and hence by taking the cohomology class $[c_\chi]$ of $c_\chi$ we obtain an injective map

$$
\widehat{\pi_1(G)} \to H^2(G, \mathbb{T});\tag{5.281}
$$

$$
\chi \mapsto [c_\chi].\tag{5.282}
$$

To prove surjectivity of this map, let $c \in Z^2(G, \mathbb{T})$ and define $\tilde{c} : \tilde{G} \times \tilde{G} \to \mathbb{T}$ by

$$
\tilde{c}(\tilde{x}, \tilde{y}) = c(\tilde{\pi}(\tilde{x}), \tilde{\pi}(\tilde{y})).\tag{5.283}
$$

Conversely, we may recover $c$ from $\tilde{c}$ and some cross-section $\tilde{s} : G \to \tilde{G}$ of $\tilde{\pi}$ by

$$c(x, y) = \tilde{c}(\tilde{s}(x), \tilde{s}(y)). \tag{5.284}$$

It follows that $\tilde{c} \in Z^2(\tilde{G}, \mathbb{T})$. Since $H^2(\mathfrak{g}, \mathbb{R}) = 0$ by assumption, Theorem 5.55 implies that $H^2(\tilde{G}, \mathbb{T})$ is trivial, so that

$$
\tilde{c}(\tilde{x}, \tilde{y}) = \frac{\tilde{b}(\tilde{x}\tilde{y})}{\tilde{b}(\tilde{x}) \tilde{b}(\tilde{y})}, \tag{5.285}
$$

for some function $\tilde{b} : \tilde{G} \to \mathbb{T}$ satisfying $\tilde{b}(\tilde{e}) = 1$. From (5.241), i.e., $c(e, x) = 1$, we infer that if $\tilde{x} = \delta \in D$, so that $\tilde{\pi}(\delta) = e$, then $\tilde{c}(\delta, \tilde{y}) = 1$, and hence


$$
\tilde{b}(\delta \tilde{y}) = \tilde{b}(\delta)\tilde{b}(\tilde{y}).\tag{5.286}
$$

Taking $\tilde{x}$ and $\tilde{y}$ both in $D$, we see that $\tilde{b}_{|D}$ is a character, which we call $\chi$. Hence

$$\begin{split} c(x, y) &= \frac{\tilde{b}(\tilde{s}(x)\tilde{s}(y))}{\tilde{b}(\tilde{s}(x))\tilde{b}(\tilde{s}(y))} = \frac{\tilde{b}(\tilde{s}(xy))}{\tilde{b}(\tilde{s}(x))\tilde{b}(\tilde{s}(y))} \cdot \frac{\tilde{b}(\tilde{s}(x)\tilde{s}(y))}{\tilde{b}(\tilde{s}(xy))} \\ &= \frac{\tilde{b}(\tilde{s}(xy))}{\tilde{b}(\tilde{s}(x))\tilde{b}(\tilde{s}(y))} \cdot c_\chi(x, y), \end{split} \tag{5.287}$$

since, using (5.286) with $\delta = \tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1}$ and $\tilde{y} = \tilde{s}(xy)$, we have

$$\begin{split} \frac{\tilde{b}(\tilde{s}(x)\tilde{s}(y))}{\tilde{b}(\tilde{s}(xy))} &= \frac{\tilde{b}(\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1}\tilde{s}(xy))}{\tilde{b}(\tilde{s}(xy))} = \tilde{b}(\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1}) \\ &= \chi(\tilde{s}(x)\tilde{s}(y)\tilde{s}(xy)^{-1}) = c_\chi(x, y). \end{split}$$

Thus $[c] = [c_\chi]$, and hence the map (5.281) - (5.282) is surjective. $\square$

Definition 5.58. *In the situation and notation of Theorem 5.41, a unitary representation* $\tilde{u} : \tilde{G} \to U(H)$ *is called* admissible *if* $\tilde{u}(D) \subset \mathbb{T} \cdot 1_H$*.*

In that case, there is obviously a character $\chi \in \hat{D}$ such that for each $\delta \in D$ we have

$$
\tilde{u}(\delta) = \chi(\delta) \cdot 1_H. \tag{5.288}
$$

Unitary irreducible representations are admissible: since $D$ lies in the center of $\tilde{G}$, Schur's Lemma implies that its image $\tilde{u}(D)$ consists of multiples of the unit.

If $\tilde{u}$ is admissible, we obtain a homomorphism (5.231) by means of

$$h = \pi \circ \tilde{u} \circ \tilde{s},\tag{5.289}$$

where $\tilde{s}$ is any cross-section of $\tilde{\pi}$, cf. (5.278) - (5.279). Note that different choices $\tilde{s}, \tilde{s}'$ are related by $\tilde{s}'(x) = \tilde{s}(x)\delta(x)$, where $\delta : G \to D$ is some function, so that

$$h'(x) = \pi(\tilde{u}(\tilde{s}'(x))) = \pi(\tilde{u}(\tilde{s}(x))\tilde{u}(\delta(x))) = \pi(\tilde{u}(\tilde{s}(x)))\,\pi(\chi(\delta(x)) \cdot 1_H) = h(x).$$


*Proof.* Given such a homomorphism $h$, pick a cross-section $s : U(H)/\mathbb{T} \to U(H)$, as in (5.234), with associated 2-cocycle $c$ on $G$ given by (5.239). By Theorem 5.57 and its proof, we may assume (possibly after redefining $s$) that there exists a character $\chi \in \hat{D}$ and a cross-section (5.278) such that $c = c_\chi$, cf. (5.280). We then define


$$
\tilde{u} : \tilde{G} \to B(H);
\tag{5.290}
$$

$$\tilde{x} \mapsto \chi(\tilde{x} \cdot (\tilde{s} \circ \tilde{\pi}(\tilde{x}))^{-1})\, u(\tilde{\pi}(\tilde{x})).\tag{5.291}$$

Simple computations then show that $\tilde{x} \cdot (\tilde{s} \circ \tilde{\pi}(\tilde{x}))^{-1} \in D$ (which lies in the center of $\tilde{G}$), that (5.288) holds, that each operator $\tilde{u}(\tilde{x})$ is unitary, that the group homomorphism properties $\tilde{u}(\tilde{x})\tilde{u}(\tilde{y}) = \tilde{u}(\tilde{x}\tilde{y})$ and $\tilde{u}(\tilde{e}) = 1_H$ hold, and that (5.289) is valid. As to the last equation, since $\pi$ removes the term with $\chi$ in (5.291), and $u = s \circ h$, we have

$$
\pi \circ \tilde{u} \circ \tilde{s}(x) = \pi \circ s \circ h \circ \tilde{\pi} \circ \tilde{s}(x) = h(x),
$$

since $\pi \circ s = \mathrm{id}$ (on $U(H)/\mathbb{T}$) and $\tilde{\pi} \circ \tilde{s} = \mathrm{id}$ (on $G$).

If $\tilde{u}(\delta) = 1_H$ for each $\delta \in D$, then $c_\chi = 1$ from (5.280), so that $u(x)u(y) = u(xy)$ by (5.238). If $s$ preserves units, or, equivalently, if $u(e) = 1_H$, as we always assume, we see that $u$ is a unitary representation of $G$. In this case, (5.291) simply reads $\tilde{u} = s \circ h \circ \tilde{\pi}$. This immediately yields $\tilde{u} = u \circ \tilde{\pi}$, which in turn gives $u = \tilde{u} \circ \tilde{s}$.

Finally, even if $h$ is continuous, it is *a priori* unclear if $\tilde{u}$ is, since the cross-sections $s$ and $\tilde{s}$ appearing in the above construction typically fail to be continuous. Fortunately, since they are assumed measurable, there is no question about measurability of $\tilde{u}$, and if $H$ is separable, continuity follows from Proposition 5.36. $\square$

Corollary 5.60. *If* $G$ *is a connected Lie group with covering group* $\tilde{G}$*, the formulae*

$$
\tilde{u} = u \circ \tilde{\pi};\tag{5.292}
$$

$$
u = \tilde{u} \circ \tilde{s},\tag{5.293}
$$

*where* $\tilde{s} : G \to \tilde{G}$ *is any cross-section of the covering map* $\tilde{\pi} : \tilde{G} \to G$*, give a bijective correspondence between (continuous) super-admissible unitary representations* $\tilde{u}$ *of* $\tilde{G}$ *and (continuous) unitary representations* $u$ *of* $G$*, preserving irreducibility.*

Corollary 5.61. *Any homomorphism* $h : SO(3) \to U(H)/\mathbb{T}$ *as in* (5.231) *comes from an admissible unitary representation* $\tilde{u}$ *of* $SU(2)$ *by* (5.289)*. Moreover,* $h$ *comes from a unitary representation* $u = \tilde{u} \circ \tilde{s}$ *of* $SO(3)$ *itself iff* $\tilde{u}$ *is trivial on the center* $\mathbb{Z}_2$*.*

*In particular, if* $h$ *is irreducible, it must come from the unitary irreducible representations* $\tilde{u} = D_j$*, where* $j = 0, \tfrac{1}{2}, 1, \ldots$ *is the (half-) integer* spin *label. Then* $D_j(SU(2))$ *is super-admissible iff* $j$ *is integral, in which case it defines a unitary irreducible representation of* $SO(3)$*.*
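The dichotomy between integral and half-integral spin can be made explicit by evaluating $D_j$ on the nontrivial central element $-1_2 \in SU(2)$. The sketch below rests on two assumptions not restated in this passage: that $-1_2 = \exp(2\pi S_3)$ (true for $S_3 = -\tfrac{i}{2}\sigma_3$), and that $i D_j'(S_3)$ has the standard weights $m = -j, -j+1, \ldots, j$. "Super-admissible" is read here as "trivial on the center", in line with Corollary 5.61.

```python
import cmath
from fractions import Fraction

def eigenvalues_on_minus_one(spin):
    """Eigenvalues of D_spin(-1_2), where -1_2 = exp(2 pi S_3) in SU(2).

    ASSUMPTION: i D'(S_3) has weights m = -spin, ..., spin, so that
    D(exp(t S_3)) has eigenvalues exp(-i m t).
    """
    s = Fraction(spin)
    weights = [-s + k for k in range(int(2 * s) + 1)]
    return [cmath.exp(-2j * cmath.pi * float(m)) for m in weights]

def trivial_on_center(spin, tol=1e-9):
    """True iff D_spin(-1_2) is the identity operator."""
    return all(abs(ev - 1) < tol for ev in eigenvalues_on_minus_one(spin))

spins = [Fraction(k, 2) for k in range(5)]        # 0, 1/2, 1, 3/2, 2
results = [trivial_on_center(s) for s in spins]
for s, ok in zip(spins, results):
    print(f"j = {s}: descends to SO(3)? {ok}")
```

Under these assumptions $D_j(-1_2) = (-1)^{2j} \cdot 1$, so exactly the integral spins give representations of $SO(3)$, while half-integral spins give genuinely projective ones.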

Indeed, the assumption $H^2(\mathfrak{g}, \mathbb{R}) = 0$ in Theorem 5.59 is satisfied for $SO(3)$ because of Whitehead's Lemma 5.54. The case where $H^2(\mathfrak{g}, \mathbb{R}) \neq 0$ occurs e.g. for the Galilei group (cf. §7.6). It can be shown that $H^2(\mathfrak{g}, \mathbb{R})$ has finitely many generators, for which one finds pre-images $(\varphi_1, \ldots, \varphi_M)$ in $Z^2(\mathfrak{g}, \mathbb{R})$, with corresponding elements $(\Gamma_1, \ldots, \Gamma_M)$ of $Z^2(\tilde{G}, \mathbb{R})$, cf. the proof of Theorem 5.55. Of these, a subset $(\Gamma_1, \ldots, \Gamma_N)$, $N \leq M$, satisfies the relation $\Gamma_i(\delta, \tilde{x}) = \Gamma_i(\tilde{x}, \delta)$ for any $\delta \in D$ (cf. Theorem 5.41) and $\tilde{x} \in \tilde{G}$. This yields a map $\Gamma : \tilde{G} \times \tilde{G} \to \mathbb{R}^N$ given by $\Gamma(\tilde{x}, \tilde{y}) = (\Gamma_1(\tilde{x}, \tilde{y}), \ldots, \Gamma_N(\tilde{x}, \tilde{y}))$, which in turn equips the set

$$
\check{G} = \tilde{G} \times \mathbb{R}^N,\tag{5.294}
$$

with a group multiplication $(\tilde{x}, v) \cdot (\tilde{y}, w) = (\tilde{x}\tilde{y}, v + w + \Gamma(\tilde{x}, \tilde{y}))$. We then have the following generalization of Theorem 5.59, in which a unitary representation $u$ of $\check{G}$ is called *admissible* if $u(\delta, v) \in \mathbb{T} \cdot 1_H$ for any $\delta \in D$ and $v \in \mathbb{R}^N$.

Theorem 5.62. *Let* $G$ *be a connected Lie group, and* $H$ *a separable Hilbert space. Then any continuous homomorphism* $h : G \to U(H)/\mathbb{T}$ *comes from some admissible continuous unitary representation* $\tilde{u}$ *of* $\check{G}$*.*

As we only apply this to the Galilei group (where $N = 1$), basically only for illustrative purposes, we omit the proof. The correct (and natural) notion of equivalence of projective representations is as follows: we say that two such homomorphisms $h_i : G \to U(H_i)/\mathbb{T}$, $i = 1, 2$, are *equivalent* if there is a unitary $w : H_1 \to H_2$ such that

$$\operatorname{Ad}_w(h_1(x)) = h_2(x), \quad x \in G,\tag{5.295}$$

where $\operatorname{Ad}_w : U(H_1)/\mathbb{T} \to U(H_2)/\mathbb{T}$ is the map $[u] \mapsto [wuw^*]$, which is well defined (here $[u]$ is the equivalence class of $u \in U(H)$ in $U(H)/\mathbb{T}$ under $u \sim zu$, $z \in \mathbb{T}$).

This induces the following notion for $\check{G}$: two admissible unitary representations $\tilde{u}_1, \tilde{u}_2$ of $\check{G}$ on Hilbert spaces $H_1, H_2$ are *equivalent* if there is a unitary $w : H_1 \to H_2$ *and* a map $b : \check{G} \to \mathbb{T}$ such that $w\tilde{u}_1(\check{x})w^* = b(\check{x})\tilde{u}_2(\check{x})$, for any $\check{x} \in \check{G}$. It can be shown that such a map $b$ always comes from a character $\chi : \tilde{G} \to \mathbb{T}$ through $b(\tilde{x}, v) = \chi(\tilde{x})$.

To close this long and difficult section, it should be mentioned, by way of relief, that the above theory vastly simplifies if *H* is finite-dimensional. By Theorem 5.40, this is true, for example, if *G* is compact and *u* is irreducible. Suppose *u* : *G* → *U*(*H*) is merely a projective unitary representation of *G*, so that instead of (5.157) one has

$$[u'(X), u'(Y)] = u'([X, Y]) + i\varphi(X, Y) \cdot 1\_H,\tag{5.296}$$

where ϕ is given by (5.267). Taking the trace yields

$$\varphi(X,Y) = \frac{i}{n} \text{Tr}\left(u'([X,Y])\right),\tag{5.297}$$

where *n* = dim(*H*) < ∞. We may define a linear function θ : g → R by

$$\theta(X) = \frac{i}{n} \text{Tr}\left(u'(X)\right),\tag{5.298}$$

so that ϕ(*X*,*Y*) = θ([*X*,*Y*]), cf. (5.253), and hence we may remove ϕ by redefining

$$
\tilde{u}'(X) = u'(X) + i\theta(X) \cdot 1\_H,\tag{5.299}
$$

which satisfies (5.157) - (5.158). Hence by Corollary 5.43 the map *u*˜′ exponentiates to a unitary representation *u*˜ of the universal covering group *G*˜ of *G*; it should be checked from the values of *u*˜ on *D* if *u*˜ also defines a unitary representation of *G*. This argument shows that *finite-dimensional* projective unitary representations of Lie groups always come from unitary representations of the covering group.

#### 5.11 Position, momentum, and free Hamiltonian

The three basic operators of non-relativistic quantum mechanics are position, denoted *q*, momentum, *p*, and the free Hamiltonian *h*<sub>0</sub>. Assuming for simplicity that the particle moves in one dimension, these are informally given on *H* = *L*<sup>2</sup>(R) by

$$q\,\psi(x) = x\psi(x);\tag{5.300}$$

$$p\,\psi(x) = -i\hbar \frac{d}{dx}\psi(x);\tag{5.301}$$

$$h\_0 \psi(x) = -\frac{\hbar^2}{2m} \frac{d^2}{dx^2} \psi(x),\tag{5.302}$$

where *m* is the mass of the particle under consideration. We put ℏ = 1 and *m* = 1/2.

The issue is that these operators are unbounded; see §B.13. In general, quantum-mechanical observables are supposed to be represented by self-adjoint operators, and examples like (5.300) - (5.302) show that these may not be bounded. The Hellinger–Toeplitz Theorem B.68 then shows that it makes no sense to try and extend the above expressions to all of *L*<sup>2</sup>(R), so we have to live with the fact that some crucial operators *a* : *D*(*a*) → *H* are merely defined on a dense subspace *D*(*a*) ⊂ *H*.

Each such operator has an *adjoint* *a*<sup>∗</sup> : *D*(*a*<sup>∗</sup>) → *H*, whose domain *D*(*a*<sup>∗</sup>) ⊂ *H* consists of all ψ ∈ *H* for which the functional ϕ ↦ ⟨ψ, *a*ϕ⟩ is bounded on *D*(*a*), and hence (since *D*(*a*) is dense in *H*) can be extended to all of *H* by continuity through the unique "Riesz–Fréchet vector" χ for which ⟨ψ, *a*ϕ⟩ = ⟨χ, ϕ⟩. Writing χ = *a*<sup>∗</sup>ψ, for each ψ ∈ *D*(*a*<sup>∗</sup>) and ϕ ∈ *D*(*a*) we therefore have

$$
\langle a^\* \psi, \varphi \rangle = \langle \psi, a\varphi \rangle. \tag{5.303}
$$

Assuming that *D*(*a*) is dense in *H*, we say that *a* is *self-adjoint*, written *a*<sup>∗</sup> = *a*, if

$$
\langle a\varphi, \psi \rangle = \langle \varphi, a\psi \rangle,\tag{5.304}
$$

for each ψ, ϕ ∈ *D*(*a*) *and* *D*(*a*<sup>∗</sup>) = *D*(*a*). A self-adjoint operator *a* is automatically *closed*, in that its graph *G*(*a*) = {(ψ, *a*ψ) | ψ ∈ *D*(*a*)} is a closed subspace of the Hilbert space *H* ⊕ *H* (indeed, the adjoint of any densely defined operator is closed, see Proposition B.72). In practice, self-adjoint operators often arise as closures of *essentially self-adjoint* operators *a*, which by definition satisfy *a*<sup>∗∗</sup> = *a*<sup>∗</sup>. Equivalently, such an operator is *closable*, in that the closure of its graph is the graph of some (uniquely defined) operator, called the *closure* *a*<sup>−</sup> of *a*, and furthermore this closure is self-adjoint, so that *a*<sup>−</sup> = *a*<sup>∗</sup>. If *a* is closable, the domain *D*(*a*<sup>−</sup>) of its closure consists of all ψ ∈ *H* for which there exists a sequence (ψ<sub>*n*</sub>) in *D*(*a*) such that ψ<sub>*n*</sub> → ψ *and* *a*ψ<sub>*n*</sub> converges, on which we define *a*<sup>−</sup> by *a*<sup>−</sup>ψ = lim<sub>*n*</sub> *a*ψ<sub>*n*</sub>.

The simplest case is the position operator.

Theorem 5.63. *The operator q is self-adjoint on the domain*

$$D(q) = \{ \psi \in L^2(\mathbb{R}) \mid \int\_{\mathbb{R}} dx \, x^2 |\psi(x)|^2 < \infty \}. \tag{5.305}$$

See Proposition B.73 for the proof. To give a convenient domain of essential self-adjointness (also for the other two operators), we need a little distribution theory.
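
Before turning to distributions, the domain condition (5.305) can be illustrated numerically. The following is only a sketch under stated assumptions: the test function ψ(*x*) = 1/(1 + |*x*|) and the helper names `integrate`, `norm_sq` and `q_norm_sq` are hypothetical choices made for this illustration. The truncated integral of |ψ|<sup>2</sup> settles down as the cutoff *R* grows, whereas that of *x*<sup>2</sup>|ψ|<sup>2</sup> keeps growing roughly linearly in *R*, which suggests ψ ∈ *L*<sup>2</sup>(R) but ψ ∉ *D*(*q*).

```python
def psi(x):
    """Hypothetical test function: square-integrable, but x * psi(x) is not."""
    return 1.0 / (1.0 + abs(x))

def integrate(f, a, b, n=200_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def norm_sq(R):
    """Integral of |psi(x)|^2 over [-R, R]."""
    return integrate(lambda x: psi(x) ** 2, -R, R)

def q_norm_sq(R):
    """Integral of x^2 |psi(x)|^2 over [-R, R], cf. the condition in (5.305)."""
    return integrate(lambda x: (x * psi(x)) ** 2, -R, R)

# The first column of numbers stabilises, the second keeps growing with R.
for R in (10.0, 100.0, 1000.0):
    print(R, norm_sq(R), q_norm_sq(R))
```

A finite cutoff can of course only suggest divergence; here the integrals can also be done by hand, and the second one grows like 2*R*.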

Definition 5.64. *The* Schwartz space S (R) *(whose elements are* functions of rapid decrease*) consists of all smooth functions f* : R → C *for which each expression*

$$\|f\|\_{n,m} = \sup\{|x^n f^{(m)}(x)|, x \in \mathbb{R}\},\tag{5.306}$$

*where f*<sup>(*m*)</sup> *is the m'th derivative of f, is finite. The topology of* S (R) *is given by saying that a sequence (or net) f*<sub>λ</sub> *converges to f iff* ‖*f*<sub>λ</sub> − *f*‖<sub>*n*,*m*</sub> → 0 *for all n*, *m* ∈ N*.*

Each ‖·‖<sub>*n*,*m*</sub> happens to be a norm, but positive definiteness is nowhere used in the theory below (which therefore works for families of *seminorms*, which satisfy the axioms of a norm except perhaps for positive definiteness). Since there are countably many such (semi)norms defining the topology, we may equivalently say that S (R) is a metric space defined by

$$d(f, g) = \sum\_{n,m=0}^{\infty} 2^{-n-m} \frac{\|f - g\|\_{n,m}}{1 + \|f - g\|\_{n,m}}.\tag{5.307}$$

Indeed, S (R) is complete in this metric. A typical element is *f*(*x*) = exp(−*x*<sup>2</sup>).
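
For this typical element the seminorms (5.306) can be estimated numerically. The sketch below is illustrative only: it assumes the standard identity *f*<sup>(*m*)</sup>(*x*) = (−1)<sup>*m*</sup>*H*<sub>*m*</sub>(*x*) exp(−*x*<sup>2</sup>) with *H*<sub>*m*</sub> the physicists' Hermite polynomials, replaces the supremum over R by a maximum over a finite grid, and uses hypothetical helper names (`hermite`, `gaussian_derivative`, `seminorm`).

```python
import math

def hermite(m, x):
    """Physicists' Hermite polynomial H_m(x) via H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x
    if m == 0:
        return h_prev
    for k in range(1, m):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def gaussian_derivative(m, x):
    """m-th derivative of exp(-x^2), using f^(m)(x) = (-1)^m H_m(x) exp(-x^2)."""
    return (-1) ** m * hermite(m, x) * math.exp(-x * x)

def seminorm(n, m, half_width=10.0, steps=200_001):
    """Grid approximation of sup_x |x^n f^(m)(x)| for f(x) = exp(-x^2)."""
    h = 2.0 * half_width / (steps - 1)
    best = 0.0
    for i in range(steps):
        x = -half_width + i * h
        best = max(best, abs(x ** n * gaussian_derivative(m, x)))
    return best

# All of these are finite, as membership of the Schwartz space requires.
for n, m in [(0, 0), (1, 0), (2, 0), (0, 1), (0, 2)]:
    print((n, m), seminorm(n, m))
```

The grid maximum is only a lower bound for the true supremum, but for a function decaying this fast the difference is negligible at the chosen resolution.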

Definition 5.65. *A* tempered distribution *is a continuous linear map* ϕ : S (R) → C*. The space of all such maps, equipped with the topology of pointwise convergence (i.e.,* ϕ<sub>λ</sub> → ϕ *iff* ϕ<sub>λ</sub>(*f*) → ϕ(*f*) *for each f* ∈ S (R)*) is denoted by* S′(R)*.*

It can be shown that (because of the metrizability of S (R)) continuity is the same as sequential continuity, i.e., some linear map ϕ : S (R) → C belongs to S′(R) iff lim<sub>*N*</sub> ϕ(*f*<sub>*N*</sub>) = ϕ(*f*) for each convergent *sequence* *f*<sub>*N*</sub> → *f* in S (R). Like S (R), the tempered distributions S′(R) form a (locally convex) *topological vector space*, that is, a vector space with a topology in which addition and scalar multiplication are continuous. The topology of S′(R) is given by a family of seminorms, namely ‖ϕ‖<sub>*f*</sub> = |ϕ(*f*)|, *f* ∈ S (R), and hence a simple way to prove that ϕ ∈ S′(R) is to find some (*n*, *m*) for which |ϕ(*f*)| ≤ *C*‖*f*‖<sub>*n*,*m*</sub> for each *f* ∈ S (R), since in that case *f*<sub>*N*</sub> → *f*, which means that ‖*f*<sub>*N*</sub> − *f*‖<sub>*n*,*m*</sub> → 0 for *all* *n*, *m* ∈ N, certainly implies that ϕ(*f*<sub>*N*</sub>) → ϕ(*f*), so that ϕ is continuous. For example, the evaluation maps δ<sub>*x*</sub> defined by δ<sub>*x*</sub>(*f*) = *f*(*x*) are continuous (take *n* = *m* = 0). Similarly, each finite measure on R defines a tempered distribution. Taking the (0, *m*) seminorm shows that the maps *f* ↦ *f*<sup>(*m*)</sup>(*x*) for fixed *m* ∈ N and *x* ∈ R are tempered distributions.

A less obvious example (defining a so-called *Gelfand triple*) is as follows:

Proposition 5.66. *We have continuous dense inclusions*

$$\mathcal{S}(\mathbb{R}) \subset L^2(\mathbb{R}) \subset \mathcal{S}'(\mathbb{R}),\tag{5.308}$$

*where the second inclusion identifies* ϕ ∈ *L*<sup>2</sup>(R) *with the map*

$$f \mapsto \langle \overline{\varphi}, f \rangle = \int\_{\mathbb{R}} dx \, \varphi(x) f(x). \tag{5.309}$$

*Proof.* As vector spaces, the first inclusion is obvious. For *f* ∈ S (R) we estimate

$$\|f\|\_{2}^{2} = \int\_{\mathbb{R}} dx \, |f(x)| \cdot |f(x)| \le \|f\|\_{1} \|f\|\_{\infty};\tag{5.310}$$

$$\|f\|\_{1} = \int\_{\mathbb{R}} dx \, \frac{(1+x^{2})|f(x)|}{1+x^{2}} \le \int\_{\mathbb{R}} dy \, \frac{1}{1+y^{2}} \, \|(1+m\_{x^{2}})f\|\_{\infty}$$

$$\leq \pi(\|f\|\_{0,0} + \|f\|\_{2,0}),\tag{5.311}$$

so that, noting that ‖·‖<sub>0,0</sub> = ‖·‖<sub>∞</sub>, we have

$$\|f\|\_{2}^{2} \le \pi(\|f\|\_{\infty} + \|f\|\_{2,0})\|f\|\_{\infty}.\tag{5.312}$$

Hence *f*<sub>λ</sub> → *f* in S (R), which incorporates the conditions ‖*f*<sub>λ</sub> − *f*‖<sub>0,0</sub> → 0 and ‖*f*<sub>λ</sub> − *f*‖<sub>2,0</sub> → 0, implies ‖*f*<sub>λ</sub> − *f*‖<sub>2</sub> → 0. This shows that the first inclusion in (5.308) is continuous. Density may be proved in two steps. First, take some fixed positive function *h* ∈ *C*<sub>*c*</sub><sup>∞</sup>(−1, 1) with the property ∫ *dx* *h*(*x*) = 1, and define *h*<sub>*n*</sub>(*x*) = *nh*(*nx*), so that informally *h*<sub>*n*</sub> ∈ *C*<sub>*c*</sub><sup>∞</sup>(R) converges to a δ-function as *n* → ∞. For each ψ ∈ *L*<sup>2</sup>(R), we consider the convolution *h*<sub>*n*</sub> ∗ ψ, where for suitable *f*, *g*,

$$f \* g(x) \equiv \int\_{\mathbb{R}} dy \, f(x - y) g(y).\tag{5.313}$$

Then *h*<sub>*n*</sub> ∗ ψ ∈ *C*<sup>∞</sup>(R) ∩ *L*<sup>2</sup>(R) and, from elementary analysis, ‖*h*<sub>*n*</sub> ∗ ψ − ψ‖ → 0.

Second, for ψ ∈ *C*<sub>*c*</sub>(R), the functions *h*<sub>*n*</sub> ∗ ψ lie in *C*<sub>*c*</sub><sup>∞</sup>(R) and hence in S (R). Since *C*<sub>*c*</sub>(R) is dense in *L*<sup>2</sup>(R) by Theorem B.30, for ψ ∈ *L*<sup>2</sup>(R) and ε > 0 we can find ϕ ∈ *C*<sub>*c*</sub>(R) such that ‖ψ − ϕ‖ < ε/2, and (as just shown) find *n* such that ϕ<sub>*n*</sub> = *h*<sub>*n*</sub> ∗ ϕ satisfies ‖ϕ − ϕ<sub>*n*</sub>‖ < ε/2, whence ‖ψ − ϕ<sub>*n*</sub>‖ < ε. This proves that S (R) is dense in *L*<sup>2</sup>(R).

The second inclusion is continuous by Cauchy–Schwarz, which gives

$$|\varphi(f)| \le \|\varphi\|\_2 \|f\|\_2,$$

to be combined with (5.312). It should be noted that also the second inclusion in (5.308) is indeed an injection, i.e., that ϕ(*f*) = 0 for each *f* ∈ S (R) implies ϕ = 0 in *L*<sup>2</sup>(R); this is true because S (R) is dense in *L*<sup>2</sup>(R), plus the standard fact that, in any Hilbert space *H*, if ⟨ϕ, *f*⟩ = 0 for all *f* in some dense subspace of *H*, then ϕ = 0. Finally, the fact that *L*<sup>2</sup>(R) is dense in the seemingly huge space S′(R) follows from the even more remarkable fact that S (R) is dense in S′(R). On top of the functions *h*<sub>*n*</sub> just defined, also employ a function χ ∈ *C*<sub>*c*</sub><sup>∞</sup>(R) such that χ(*x*) = 1 on (−1, 1), and define χ<sub>*n*</sub>(*x*) = χ(*x*/*n*), so that informally lim<sub>*n*→∞</sub> χ<sub>*n*</sub>(*x*) = 1 (as opposed to the *h*<sub>*n*</sub>, which converge to a δ-function as *n* → ∞). If for any *g* ∈ S (R) and any ϕ ∈ S′(R) we define *g*ϕ as the distribution that maps *f* ∈ S (R) to ϕ(*fg*), and similarly define *g* ∗ ϕ as the distribution that maps *f* to ϕ(*g* ∗ *f*), we may define a sequence of distributions ϕ<sub>*n*</sub> = *h*<sub>*n*</sub> ∗ (χ<sub>*n*</sub>ϕ). From the point of view of (5.308), these correspond to functions ϕ<sub>*n*</sub> ∈ S (R) in the sense that ϕ<sub>*n*</sub>(*f*) = ∫ *dx* ϕ<sub>*n*</sub>(*x*)*f*(*x*), where *f* ∈ S (R). Using similar analysis as above, it then follows that for any *f* ∈ S (R) we have ϕ<sub>*n*</sub>(*f*) → ϕ(*f*), so that ϕ<sub>*n*</sub> → ϕ in S′(R). □
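
The chain of estimates (5.310) - (5.312) used in this proof can be checked numerically on a concrete Schwartz function. The sketch below assumes the Gaussian *f*(*x*) = exp(−*x*<sup>2</sup>) as test function and uses a plain trapezoid rule on a finite grid; the names `grid` and `trapezoid` are hypothetical helpers introduced for this illustration only.

```python
import math

def f(x):
    """Test function (an assumption of this sketch): the Gaussian exp(-x^2)."""
    return math.exp(-x * x)

def grid(half_width=10.0, steps=200_001):
    """Equidistant grid on [-half_width, half_width]; returns spacing and points."""
    h = 2.0 * half_width / (steps - 1)
    return h, [-half_width + i * h for i in range(steps)]

def trapezoid(values, h):
    """Trapezoid rule for samples taken on an equidistant grid of spacing h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

h, xs = grid()
fx = [f(x) for x in xs]

norm2_sq = trapezoid([v * v for v in fx], h)           # ||f||_2^2
norm1 = trapezoid([abs(v) for v in fx], h)             # ||f||_1
sup_00 = max(abs(v) for v in fx)                       # ||f||_{0,0} = ||f||_infinity
sup_20 = max(abs(x * x * v) for x, v in zip(xs, fx))   # ||f||_{2,0}
bound_311 = math.pi * (sup_00 + sup_20)                # right-hand side of (5.311)

# Expected ordering: ||f||_2^2 <= ||f||_1 ||f||_inf <= pi (||f||_inf + ||f||_{2,0}) ||f||_inf
print(norm2_sq, norm1 * sup_00, bound_311 * sup_00)
```

The bound (5.312) is far from sharp for this *f*; all the proof needs is that the right-hand side is controlled by finitely many of the seminorms (5.306).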

For our purposes, the point of all this is that we can define generalized derivatives of (tempered) distributions, and hence, because of (5.308), of functions in *L*<sup>2</sup>(R).

Definition 5.67. *For* ϕ ∈ S′(R) *and m* ∈ N*, the m'th* generalized derivative ϕ<sup>(*m*)</sup> *is defined by*

$$
\varphi^{(m)}(f) = (-1)^m \varphi(f^{(m)}).\tag{5.314}
$$

The idea is that under (5.308) this is an identity if ϕ ∈ S (R) (partial integration). Like the constructions at the end of the proof of Proposition 5.66, this is a special case of a more general construction: whenever we have a continuous linear map *T* : S (R) → S (R), we obtain a dual continuous linear map *T*′ : S′(R) → S′(R) defined by *T*′ϕ = ϕ ∘ *T*, i.e.,

$$(T'\varphi)(f) = \varphi(T(f)).\tag{5.315}$$

Sometimes a slight change in the definition (as in (5.314), or as in the Fourier transform below) is appropriate so that the restriction of *T*′ to S (R) coincides with *T*.
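
As an illustration of (5.314), consider the distribution θ(*f*) = ∫<sub>0</sub><sup>∞</sup> *dx* *f*(*x*) given by the Heaviside step function (an example chosen here, not taken from the text above). Its generalized derivative is θ′(*f*) = −θ(*f*′) = *f*(0), i.e. the evaluation map δ<sub>0</sub>. The sketch below checks this numerically for one test function; the shifted Gaussian, the truncation of the integral at a finite upper limit, and all function names are assumptions of the example.

```python
import math

def f(x):
    """A Schwartz test function (hypothetical choice): a shifted Gaussian."""
    return math.exp(-(x - 0.3) ** 2)

def f_prime(x):
    """Its ordinary derivative, computed by hand."""
    return -2.0 * (x - 0.3) * f(x)

def heaviside_distribution(g, upper=20.0, steps=200_000):
    """theta(g): integral of g over [0, infinity), truncated at `upper` (trapezoid rule)."""
    h = upper / steps
    total = 0.5 * (g(0.0) + g(upper))
    total += sum(g(i * h) for i in range(1, steps))
    return h * total

def generalized_derivative(phi, g_prime):
    """First generalized derivative as in (5.314) with m = 1: phi'(g) = -phi(g')."""
    return -phi(g_prime)

lhs = generalized_derivative(heaviside_distribution, f_prime)  # theta'(f)
rhs = f(0.0)                                                   # delta_0(f)
print(lhs, rhs)
```

The two printed numbers should agree up to quadrature error, in line with the partial-integration argument behind Definition 5.67.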

Theorem 5.68. *The momentum operator p* = −*id*/*dx is self-adjoint on the domain*

$$D(p) = \{ \psi \in L^2(\mathbb{R}) \mid \psi' \in L^2(\mathbb{R}) \},\tag{5.316}$$

*where the derivative* ψ′ *is taken in the distributional sense (i.e., letting* ψ ∈ S′(R)*).*

*Proof.* We first show that *p* is symmetric, or *p* ⊆ *p*∗. This comes down to

$$
\langle \psi', \varphi \rangle = -\langle \psi, \varphi' \rangle,\tag{5.317}
$$

for each ψ, ϕ ∈ *D*(*p*), where both derivatives are "generalized". The most elegant proof (though perhaps not the shortest) uses the Sobolev space *H*<sup>1</sup>(R), which equals *D*(*p*) as a vector space, now equipped, however, with the new inner product

$$
\langle \psi, \varphi \rangle\_{(1)} = \langle \psi, \varphi \rangle + \langle \psi', \varphi' \rangle,\tag{5.318}
$$

with both inner products on the right-hand side in *L*<sup>2</sup>(R); the associated norm is

$$\|\psi\|\_{(1)}^2 = \|\psi\|^2 + \|\psi'\|^2. \tag{5.319}$$

Similar to the Gelfand triple (5.308), we have dense continuous inclusions

$$\mathcal{S}(\mathbb{R}) \subset H^1(\mathbb{R}) \subset \mathcal{S}'(\mathbb{R}),\tag{5.320}$$

with analogous proof. All we need for Theorem 5.68 is the first inclusion of the triple (5.320): for ψ ∈ *H*<sup>1</sup>(R) we now have *h*<sub>*n*</sub> ∗ ψ ∈ *C*<sup>∞</sup>(R) ∩ *H*<sup>1</sup>(R) as well as *h*<sub>*n*</sub> ∗ ψ → ψ in *H*<sup>1</sup>(R), both of which follow from the *L*<sup>2</sup>-case plus the identity

$$(h\_n \* \psi)' = h\_n \* \psi'.\tag{5.321}$$

Using the same cutoff function χ as in the *L*<sup>2</sup> case, we have χ<sub>*n*</sub>ψ → ψ and χ′<sub>*n*</sub>ψ → 0 in *L*<sup>2</sup>(R), so that (χ<sub>*n*</sub>ψ)′ → ψ′ in *L*<sup>2</sup>(R) and hence χ<sub>*n*</sub>ψ → ψ also in *H*<sup>1</sup>(R). Furthermore, the functions ψ<sub>*n*</sub> = *h*<sub>*n*</sub> ∗ (χ<sub>*n*</sub>ψ) lie in *C*<sub>*c*</sub><sup>∞</sup>(R) and hence in S (R); using the above facts we obtain ψ<sub>*n*</sub> → ψ in *H*<sup>1</sup>(R). In sum, for each ψ ∈ *H*<sup>1</sup>(R) we can find a sequence (ψ<sub>*n*</sub>) in S (R) such that ψ<sub>*n*</sub> → ψ *and* ψ′<sub>*n*</sub> → ψ′ in *L*<sup>2</sup>(R). Hence

$$\langle \psi, \varphi' \rangle = \lim\_{n} \langle \psi\_n, \varphi' \rangle = -\lim\_{n} \langle \psi\_n', \varphi \rangle = -\langle \psi', \varphi \rangle. \tag{5.322}$$

For the converse, let ψ ∈ *D*(*p*∗), so that by definition for each ϕ ∈ *D*(*p*) we have

$$
\langle p^\*\psi, \varphi \rangle = \langle \psi, p\varphi \rangle = -i\langle \psi, \varphi' \rangle. \tag{5.323}
$$

Since S (R) ⊂ *D*(*p*), this is true in particular for each ϕ ∈ S (R), in which case the right-hand side equals −*i*ψ′(ϕ), where the derivative is distributional. But this equals ⟨*p*<sup>∗</sup>ψ, ϕ⟩ and so the distribution −*i*ψ′ is given by taking the inner product with *p*<sup>∗</sup>ψ ∈ *L*<sup>2</sup>(R). Hence −*i*ψ′ = *p*<sup>∗</sup>ψ ∈ *L*<sup>2</sup>(R), and in particular ψ′ ∈ *L*<sup>2</sup>(R), so that ψ ∈ *D*(*p*). This proves that *D*(*p*<sup>∗</sup>) ⊆ *D*(*p*), and since from the first step we have the opposite inclusion, we find *D*(*p*<sup>∗</sup>) = *D*(*p*) and *p*<sup>∗</sup> = *p*. □

For the free Hamiltonian *h*<sub>0</sub> = −Δ with Δ = *d*<sup>2</sup>/*dx*<sup>2</sup>, we similarly have:

Theorem 5.69. *The free Hamiltonian h*<sub>0</sub> = −Δ *is self-adjoint on the domain*

$$D(\Delta) = \{ \psi \in L^2(\mathbb{R}) \mid \psi'' \in L^2(\mathbb{R}) \},\tag{5.324}$$

*where the double derivative* ψ″ *is taken in the distributional sense.*

Although this may be proved in an analogous way, such proofs become increasingly burdensome as the number of derivatives grows. It is easier to use the Fourier transform (which also provides an alternative way of proving Theorem 5.68).

Theorem 5.70. *The formulae*

$$
\hat{f}(k) = \int\_{-\infty}^{\infty} \frac{dx}{\sqrt{2\pi}} e^{-ikx} f(x);\tag{5.325}
$$

$$\check{f}(x) = \int\_{-\infty}^{\infty} \frac{dk}{\sqrt{2\pi}} e^{ikx} f(k),\tag{5.326}$$

*are rigorously defined on* S (R)*, L*<sup>2</sup>(R)*, and* S′(R)*, and provide continuous isomorphisms of each of these spaces. Furthermore,* (5.326) *is inverse to* (5.325)*, i.e.*

$$
\check{\hat{f}} = \hat{\check{f}} = f,\tag{5.327}
$$

*so that we may (and often do) write f*ˆ = F(*f*) *and f*ˇ = F<sup>−1</sup>(*f*)*, or f* = F<sup>−1</sup>(*f*ˆ)*. In all three cases we have the identities (in a distributional sense if appropriate)*

$$\mathcal{F}(x^n f^{(m)})(k) = (i\,d/dk)^n (ik)^m \mathcal{F}(f)(k). \tag{5.328}$$

*Finally, as a map* F : *L*<sup>2</sup>(R) → *L*<sup>2</sup>(R) *the Fourier transform is unitary, so that*

$$
\langle \hat{\psi}, \hat{\varphi} \rangle = \langle \psi, \varphi \rangle. \tag{5.329}
$$
$$

See §C.15 for further discussion. For example, we have

$$D(p) = \{ \psi \in L^2(\mathbb{R}) \mid k \cdot \hat{\psi}(k) \in L^2(\mathbb{R}) \};\tag{5.330}$$

$$D(\Delta) = \{ \psi \in L^2(\mathbb{R}) \mid k^2 \cdot \hat{\psi}(k) \in L^2(\mathbb{R}) \}. \tag{5.331}$$

Thus we may now reformulate Theorems 5.68 and 5.69 as follows:

Theorem 5.71. *The momentum operator p is self-adjoint on the domain* (5.330)*. The free Hamiltonian h*<sub>0</sub> = −Δ *is self-adjoint on the domain* (5.331)*.*

*Proof.* Denoting multiplication by *x*<sup>*n*</sup> by the symbol *k*<sup>*n*</sup>, we have

$$p = \mathcal{F}^{-1}k\mathcal{F};\tag{5.332}$$

$$
\Delta = -\mathcal{F}^{-1}k^2\mathcal{F}.\tag{5.333}
$$

Hence the theorem follows from Proposition B.73 and unitarity of the Fourier transform F (plus the little observation that if *a* = *a*<sup>∗</sup> on *D*(*a*) ⊂ *H* and *u* : *H* → *K* is unitary, then *b* = *uau*<sup>∗</sup> is self-adjoint on *D*(*b*) = *uD*(*a*) ⊂ *K*). □
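
The relation (5.332) says that the Fourier transform turns *p* = −*id*/*dx* into multiplication by *k*. A minimal numerical sketch of this, assuming the test function *f*(*x*) = exp(−*x*<sup>2</sup>/2) (for which *pf* can be written down by hand) and a trapezoid approximation of (5.325) on a truncated interval, could look as follows; the names `p_f` and `fourier` are hypothetical.

```python
import cmath
import math

def f(x):
    """Test function (an assumption of this sketch): the Gaussian exp(-x^2 / 2)."""
    return math.exp(-0.5 * x * x)

def p_f(x):
    """(p f)(x) = -i f'(x); here f'(x) = -x f(x), so (p f)(x) = i x f(x)."""
    return 1j * x * f(x)

def fourier(g, k, half_width=12.0, steps=4001):
    """Trapezoid approximation of (5.325): integral of exp(-ikx) g(x) dx / sqrt(2 pi)."""
    h = 2.0 * half_width / (steps - 1)
    total = 0.0 + 0.0j
    for i in range(steps):
        x = -half_width + i * h
        weight = 0.5 if i in (0, steps - 1) else 1.0
        total += weight * cmath.exp(-1j * k * x) * g(x)
    return h * total / math.sqrt(2.0 * math.pi)

# F(p f)(k) should coincide with k * F(f)(k), cf. (5.328) with n = 0, m = 1.
for k in (-2.0, 0.5, 1.3):
    print(k, fourier(p_f, k), k * fourier(f, k))
```

For a Gaussian the trapezoid rule is extremely accurate, so the two columns should agree to many digits; for less regular functions in *D*(*p*) a finer grid or a different quadrature would be needed.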

Much is known about regularity properties of functions in such domains, e.g.,

$$D(p) \subset C\_0(\mathbb{R});\tag{5.334}$$

$$D(\Delta) \subset C\_0^{(1)}(\mathbb{R}).\tag{5.335}$$

These are the most elementary cases of the famous *Sobolev Embedding Theorem*.

If ψ ∈ *D*(*p*), then *k* ↦ (1 + *k*<sup>2</sup>)<sup>1/2</sup>ψˆ(*k*) is in *L*<sup>2</sup>(R), so applying Hölder's inequality (B.15) with *p* = *q* = 2 to *f*(*k*) = (1 + *k*<sup>2</sup>)<sup>1/2</sup>ψˆ(*k*) and *g*(*k*) = (1 + *k*<sup>2</sup>)<sup>−1/2</sup>, which is in *L*<sup>2</sup>(R), too, gives ψˆ ∈ *L*<sup>1</sup>(R). The Riemann–Lebesgue Lemma (see §C.15) then yields ψ ∈ *C*<sub>0</sub>(R). To prove (5.335), one uses (1 + *k*<sup>2</sup>) rather than its square root.

Finally, we give a common domain of essential self-adjointness for *q*, *p*, and *h*0.

Proposition 5.72. *The operators q, p, and h*<sub>0</sub> *are essentially self-adjoint on* S (R)*.*

*Proof.* We see from (5.332) that the cases of *p* and *q* are similar, so we only explain the case of *q*. Denoting the operator of multiplication by *x* on the domain S (R) by *q*<sub>0</sub>, as in the proof of Proposition B.73 it is easy to see that *D*(*q*<sub>0</sub><sup>∗</sup>) = *D*(*q*). Fourier-transforming, the fact that S (R) is dense in *H*<sup>1</sup>(R) (cf. the proof of Theorem 5.68) shows that *D*(*q*<sub>0</sub><sup>−</sup>) = *D*(*q*), so that *D*(*q*<sub>0</sub><sup>∗</sup>) = *D*(*q*<sub>0</sub><sup>−</sup>). The actions of *q*<sub>0</sub><sup>∗</sup> and *q*<sub>0</sub><sup>−</sup> obviously being given by multiplication by *x* in both cases, we have *q*<sub>0</sub><sup>∗</sup> = *q*<sub>0</sub><sup>−</sup>.

The proof for *h*<sub>0</sub> is similar; in the second step we now use the fact that S (R) is dense in *H*<sup>2</sup>(R), defined as *D*(Δ), as in (5.324), but now seen as a Hilbert space in the inner product ⟨ψ, ϕ⟩<sub>(2)</sub> = ⟨ψ, ϕ⟩ + ⟨ψ″, ϕ″⟩, with corresponding norm given by ‖ψ‖<sup>2</sup><sub>(2)</sub> = ‖ψ‖<sup>2</sup> + ‖ψ″‖<sup>2</sup>. This is proved just as in the case of a single derivative. □

We also say that S (R) is a *core* for the operators in question. For example, the canonical commutation relations [*q*, *p*] = *i*ℏ · 1<sub>*H*</sub> rigorously hold on this domain.

#### 5.12 Stone's Theorem

We now come to a central result on symmetries in quantum mechanics "explaining" the Hamiltonian. Recall that a continuous unitary representation of R (as an additive group) on a Hilbert space *H* is a map *t* ↦ *u*<sub>*t*</sub>, where *t* ∈ R and each *u*<sub>*t*</sub> ∈ *B*(*H*) is unitary, such that the associated map R × *H* → *H*, (*t*, ψ) ↦ *u*<sub>*t*</sub>ψ, is continuous, and

$$
u\_s u\_t = u\_{s+t}, \; s, t \in \mathbb{R}; \tag{5.336}
$$

$$
u\_0 = 1\_H;\tag{5.337}
$$

$$\lim\_{t \to 0} u\_t \psi = \psi \; (t \in \mathbb{R}, \, \psi \in H).\tag{5.338}$$

These conditions imply

$$\lim\_{t \to s} u\_t \psi = u\_s \psi \; (s, t \in \mathbb{R}, \, \psi \in H). \tag{5.339}$$

Note that according to Proposition 5.36 continuity may be replaced by weak measurability. Probably the simplest nontrivial example is given by *H* = *L*2(R) and

$$
u\_t \psi(x) = \psi(x - t).\tag{5.340}
$$

To prove (5.338), we use a routine ε/3 argument. We first prove (5.338) for ψ ∈ *C*<sub>*c*</sub>(R), where it is elementary in the sup-norm, i.e., lim<sub>*t*→0</sub> ‖*u*<sub>*t*</sub>ψ − ψ‖<sub>∞</sub> = 0 by continuity and hence (given compact support) uniform continuity of ψ. But then the (ugly) estimate ‖ψ‖<sub>2</sub><sup>2</sup> ≤ |*K*| ‖ψ‖<sub>∞</sub><sup>2</sup>, where *K* ⊂ R is any compact set containing the support of ψ, also yields lim<sub>*t*→0</sub> ‖*u*<sub>*t*</sub>ψ − ψ‖<sub>2</sub> = 0. Hence for ε > 0 we may find δ > 0 such that ‖*u*<sub>*t*</sub>ψ − ψ‖<sub>2</sub> < ε/3 whenever |*t*| < δ. For general ψ′ ∈ *H*, we find ψ ∈ *C*<sub>*c*</sub>(R) such that ‖ψ − ψ′‖ < ε/3, and, using unitarity of *u*<sub>*t*</sub>, estimate

$$\begin{aligned} \|u\_t\psi' - \psi'\| &\leq \|u\_t\psi' - u\_t\psi\| + \|u\_t\psi - \psi\| + \|\psi - \psi'\| \\ &\leq \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon. \end{aligned}$$

In the context of quantum mechanics, physicists formally write

$$u\_t = e^{-ita},\tag{5.341}$$

where *a* is usually thought of as the Hamiltonian of the system, although in the previous example it is rather the momentum operator. In any case, we avoid the notation *h* instead of *a* here, partly in order to rightly suggest far greater generality of the construction and partly to avoid confusion with the notation in §B.21; if *h* *is* the Hamiltonian, one would have *a* = *h*/ℏ in (5.341). Mathematically speaking, if *a* is self-adjoint, eq. (5.341) is rigorously defined by Theorem B.158, where

$$e\_t(x) = \exp(-i t x).\tag{5.342}$$

Conversely, given a continuous unitary representation *t* ↦ *u*<sub>*t*</sub> of R on *H*, one may attempt to define an operator *a* by specifying its domain and action by

$$D(a) = \left\{ \psi \in H \mid \lim\_{s \to 0} \frac{u\_s - 1}{s} \psi \text{ exists} \right\};\tag{5.343}$$

$$a\psi = i \lim\_{s \to 0} \frac{u\_s - 1}{s} \psi \; (\psi \in D(a)).\tag{5.344}$$

*Stone's Theorem* makes this rigorous, and even turns the passage from the generator *a* to the unitary group *t* ↦ *u*<sub>*t*</sub> (and back) into a bijective correspondence.

Theorem 5.73.

1. *If a is a self-adjoint operator on H, then the map t* ↦ *u*<sub>*t*</sub> *defined by* (5.341) *is a continuous unitary representation of* R *on H.*
2. *Conversely, given a continuous unitary representation t* ↦ *u*<sub>*t*</sub> *of* R *on H, the operator a defined by* (5.343) - (5.344) *is self-adjoint (in particular, D*(*a*) *is dense in H).*
3. *These two constructions are mutually inverse.*


*Proof.* We use the setting of §B.21, so that *b* is the bounded transform of *a*.

1. Eqs. (5.336) - (5.337) are immediate from Theorem B.158, which also yields unitarity of each operator *u*<sub>*t*</sub>. To prove (5.338) we first take ϕ ∈ *C*<sub>*c*</sub><sup>∗</sup>(*b*)*H*, which means that ϕ is a finite linear combination of vectors of the type ϕ = *h*(*a*)ψ, where *h* ∈ *C*<sub>*c*</sub>(σ(*a*)) and ψ ∈ *H*. Using (5.342) and (B.573), we have

$$\|u\_t\varphi - \varphi\| \le \|e\_t h - h\|\_{\infty} \|\psi\| \le \|h\|\_{\infty} \|e\_t - 1\_K\|\_{\infty}^{(K)} \|\psi\|,\tag{5.345}$$

where *K* is the (compact) support of *h* in σ(*b*). Since *e*<sub>*t*</sub> → 1 uniformly on any compact set as *t* → 0, this gives lim<sub>*t*→0</sub> ‖*u*<sub>*t*</sub>ϕ − ϕ‖ = 0. Taking finite linear combinations of such vectors ϕ gives the same result for any ϕ ∈ *C*<sub>*c*</sub><sup>∗</sup>(*b*)*H* (with an extra step this could have been done on *C*<sub>0</sub><sup>∗</sup>(*b*)*H*, too). Thus for ε > 0 we can find δ > 0 so that ‖*u*<sub>*t*</sub>ϕ − ϕ‖ < ε/3 whenever |*t*| < δ. For general ψ′ ∈ *H*, we find ϕ ∈ *C*<sub>0</sub><sup>∗</sup>(*b*)*H* such that ‖ϕ − ψ′‖ < ε/3, and estimate

$$\begin{aligned} \|u\_t \psi' - \psi'\| &\le \|u\_t \psi' - u\_t \varphi\| + \|u\_t \varphi - \varphi\| + \|\varphi - \psi'\|\\ &\le \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon, \end{aligned}$$

since ‖*u*<sub>*t*</sub>ψ′ − *u*<sub>*t*</sub>ϕ‖ = ‖ψ′ − ϕ‖ by unitarity of *u*<sub>*t*</sub>. This is equivalent to (5.338).

2. For any ψ ∈ *H* and *n* ∈ N, define ψ<sub>*n*</sub> ∈ *H* by

$$
\psi\_n = n \int\_0^\infty ds \, e^{-ns} u\_s \psi,\tag{5.346}
$$

either as a Riemann-type integral (whose approximants converge in norm) or as a functional ϕ ↦ *n* ∫<sub>0</sub><sup>∞</sup> *ds* *e*<sup>−*ns*</sup>⟨*u*<sub>*s*</sub>ψ, ϕ⟩, which is obviously continuous and hence is represented by a unique vector ψ<sub>*n*</sub> ∈ *H*. Then simple computations show that

$$\lim\_{s \to 0} \frac{u\_s - 1}{s} \psi\_n = n(\psi\_n - \psi),$$

so that ψ<sub>*n*</sub> ∈ *D*(*a*). The proof that ψ<sub>*n*</sub> → ψ starts with the elementary estimate

$$\|\psi\_n - \psi\| \le n \int\_0^\infty ds \, e^{-ns} \|u\_s \psi - \psi\|,$$

in which we split up the ∫<sub>0</sub><sup>∞</sup> as ∫<sub>0</sub><sup>δ</sup> ··· + ∫<sub>δ</sub><sup>∞</sup> ···, where δ > 0. Using strong continuity of the map *t* ↦ *u*<sub>*t*</sub>, i.e., (5.338), for any *n* the first integral vanishes as δ → 0. In the second integral we estimate ‖*u*<sub>*s*</sub>ψ − ψ‖ ≤ 2‖ψ‖ and take the limit *n* → ∞. Thus ψ<sub>*n*</sub> → ψ, so that *D*(*a*) is dense in *H*.

To prove self-adjointness of *a*, we need a tiny variation on Theorem B.93:

Lemma 5.74. *Let a be symmetric. Then a is self-adjoint (i.e. a*<sup>∗</sup> = *a) iff*

$$\text{ran}(a+i) = \text{ran}(a-i) = H.\tag{5.347}$$

*Proof.* We only need the implication from (5.347) to *a*<sup>∗</sup> = *a* (but the converse immediately follows from Theorem B.93). So assume (5.347). For given ψ ∈ *D*(*a*<sup>∗</sup>) there must then be a ϕ ∈ *D*(*a*) such that (*a*<sup>∗</sup> − *i*)ψ = (*a* − *i*)ϕ. Since *a* is symmetric, we have *D*(*a*) ⊂ *D*(*a*<sup>∗</sup>), so ψ − ϕ ∈ *D*(*a*<sup>∗</sup>), and (*a*<sup>∗</sup> − *i*)(ψ − ϕ) = 0. But ker(*a*<sup>∗</sup> − *i*) = ran(*a* + *i*)<sup>⊥</sup>, so ker(*a*<sup>∗</sup> − *i*) = 0. Hence ψ = ϕ, and in particular ψ ∈ *D*(*a*) and hence *D*(*a*<sup>∗</sup>) ⊂ *D*(*a*). Since we already know the opposite inclusion, we have *D*(*a*<sup>∗</sup>) = *D*(*a*). Given symmetry, this implies *a*<sup>∗</sup> = *a*. □

Continuing the proof of Theorem 5.73.2, symmetry of *a* easily follows from its definition, combined with the property *u*<sub>*t*</sub><sup>∗</sup> = *u*<sub>*t*</sub><sup>−1</sup> = *u*<sub>−*t*</sub>. Indeed, for ψ, ϕ ∈ *D*(*a*), the weak limit *s* → 0 below exists by definition of *D*(*a*), cf. (5.343), whence:

$$\langle \varphi, a\psi \rangle = i \lim\_{s \to 0} \langle \varphi, \frac{u\_s - 1}{s} \psi \rangle = -i \lim\_{s \to 0} \langle \frac{u\_{-s} - 1}{-s} \varphi, \psi \rangle = \langle a\varphi, \psi \rangle.$$

To prove that ran(*a* − *i*) = *H*, we compute (*a* − *i*)ψ<sub>1</sub> = −*i*ψ, with ψ<sub>1</sub> defined by (5.346) with *n* = 1. The property ran(*a* + *i*) = *H* is proved in a similar way: now define ψ˜<sub>1</sub> = ∫<sub>−∞</sub><sup>0</sup> *ds* *e*<sup>*s*</sup>*u*<sub>*s*</sub>ψ and obtain (*a* + *i*)ψ˜<sub>1</sub> = *i*ψ. Thus Lemma 5.74 applies.

3. Bijectivity has two directions: *a* ↦ *u*<sub>*t*</sub> ↦ *a* and *u*<sub>*t*</sub> ↦ *a* ↦ *u*<sub>*t*</sub>.

• Given *a* and hence (5.341) defining *u*<sub>*t*</sub>, we change notation from *a* to *a*′ in (5.343) - (5.344) and need to show that *a*′ = *a*. Denoting the restriction of *a* to the domain *C*<sub>*c*</sub><sup>∗</sup>(*b*)*H* by *a*<sub>0</sub>, we first show that *a*<sub>0</sub> ⊆ *a*′. The technique to prove this is similar to the argument around (5.345). We initially assume that ϕ ∈ *D*(*a*<sub>0</sub>) = *C*<sub>*c*</sub><sup>∗</sup>(*b*)*H* takes the form ϕ = *h*(*a*)ψ for some *h* ∈ *C*<sub>*c*</sub>(σ(*a*)) and ψ ∈ *H*. Just a trifle more complicated than (5.345), using (5.342), (B.573), and unitarity of *u*<sub>*t*</sub>, we estimate:

$$\begin{aligned} \left\| \frac{u\_{t+s}\varphi - u\_t\varphi}{s} + ia\_0 u\_t\varphi \right\| &\leq \left\| \frac{e\_s h - h}{s} + i \operatorname{id}\_{\sigma(a)} h \right\|\_{\infty} \|\psi\| \\ &\leq \left\| \frac{e\_s - 1\_K}{s} + i \operatorname{id}\_K \right\|\_{\infty}^{(K)} \|h\|\_{\infty} \|\psi\|, \end{aligned}$$

so that by definition of the (strong) derivative we obtain

$$\frac{du\_t}{dt}\varphi = \lim\_{s \to 0} \frac{u\_{t+s}\varphi - u\_t\varphi}{s} = -iau\_t\varphi,\tag{5.348}$$

initially for any ϕ of the said form *h*(*a*)ψ, and hence, taking finite sums, for any ϕ ∈ *D*(*a*<sub>0</sub>). The existence of this limit shows that, on the assumption ψ ∈ *D*(*a*<sub>0</sub>), we have ψ ∈ *D*(*a*′), and we also see that *a*′ = *a* on *D*(*a*<sub>0</sub>), or, in other words, that *a*<sub>0</sub> ⊆ *a*′. Since *a*′ is self-adjoint (by part 2 of the theorem) and hence closed, we have *a*<sub>0</sub><sup>−</sup> ⊆ *a*′. Since *a*<sub>0</sub> is essentially self-adjoint by Theorem B.159, this gives *a* ⊆ *a*′. Taking adjoints reverses the inclusion, and since both operators are self-adjoint this gives *a* = *a*′.

• Given *u*<sub>*t*</sub> and hence (5.343) - (5.344) defining *a*, we change notation from *u*<sub>*t*</sub> to *u*′<sub>*t*</sub> in (5.341) and need to show that *u*′<sub>*t*</sub> = *u*<sub>*t*</sub>. Indeed, let

$$
\psi\_t = u\_t \psi,\tag{5.349}
$$

and similarly ψ′<sub>*t*</sub> = *u*′<sub>*t*</sub>ψ. If ψ ∈ *D*(*a*), then by definition of *a* we have

$$i\frac{d\psi\_t}{dt} = i\lim\_{s\to 0} \frac{u\_{t+s} - u\_t}{s}\psi = i\lim\_{s\to 0} \frac{u\_s - 1\_H}{s}u\_t\psi = a\psi\_t,\tag{5.350}$$

which also shows that ψ<sub>*t*</sub> ∈ *D*(*a*). Similarly, *i* *d*ψ′<sub>*t*</sub>/*dt* = *a*ψ′<sub>*t*</sub>, so that ψ<sub>*t*</sub> and ψ′<sub>*t*</sub> satisfy the same differential equation with the same initial condition

$$
\psi(0) = \psi'(0) = \psi.
$$

Now consider ψˆ<sub>*t*</sub> = ψ<sub>*t*</sub> − ψ′<sub>*t*</sub>, which once again satisfies the same equation (i.e., *i* *d*ψˆ<sub>*t*</sub>/*dt* = *a*ψˆ<sub>*t*</sub>), but this time with initial condition ψˆ<sub>0</sub> = ψ(0) − ψ′(0) = ψ − ψ = 0. The key point is that *any* solution ψˆ<sub>*t*</sub> of this equation has the property ‖ψˆ<sub>*t*</sub>‖ = ‖ψˆ<sub>0</sub>‖ for any *t* ∈ R, since by symmetry of *a*,

$$\frac{d}{dt} \|\hat{\psi}\_t\|^2 = \frac{d}{dt} \langle \hat{\psi}\_t, \hat{\psi}\_t \rangle = -i(\langle \hat{\psi}\_t, a\hat{\psi}\_t \rangle - \langle a\hat{\psi}\_t, \hat{\psi}\_t \rangle) = 0.$$

For our *specific* ψˆ<sub>*t*</sub> we have ψˆ<sub>0</sub> = 0 and hence ψ<sub>*t*</sub> = ψ′<sub>*t*</sub>, that is, *u*′<sub>*t*</sub> = *u*<sub>*t*</sub>. □

Corollary 5.75. *With t ↦ u<sub>t</sub> and a defined and related as in Theorem 5.73, if ψ ∈ D(a), then for each t ∈ R the vector ψ<sub>t</sub> defined by (5.349) lies in D(a) and satisfies*

$$a\psi_t = i\frac{d\psi_t}{dt},\tag{5.351}$$

*whence t ↦ ψ<sub>t</sub> is the unique solution of (5.351) with initial value ψ(0) = ψ.*

This follows from the proof of part 3 of Theorem 5.73. With *a* = *h*/ħ (as above), this is just the famous *time-dependent Schrödinger equation*

$$
h\psi_t = i\hbar \frac{d\psi_t}{dt}.\tag{5.352}
$$
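In finite dimension, where *D*(*a*) is the whole space, the statement of Corollary 5.75 can be checked numerically. The following Python sketch assumes NumPy; the matrix `a`, the vector `psi`, and the helper `unitary_group` are illustrative choices that do not appear in the text. It builds exp(−*ita*) from the spectral decomposition of a Hermitian matrix and compares a central-difference approximation of *i* *d*ψ<sub>*t*</sub>/*dt* with *a*ψ<sub>*t*</sub>.

```python
import numpy as np

def unitary_group(a, t):
    """Return exp(-i t a) for a Hermitian matrix a via its eigendecomposition."""
    eigenvalues, eigenvectors = np.linalg.eigh(a)
    # Scale column j by exp(-i t lambda_j), then transform back.
    return (eigenvectors * np.exp(-1j * t * eigenvalues)) @ eigenvectors.conj().T

# Illustrative Hermitian generator and unit vector (arbitrary choices).
a = np.array([[1.0, 2.0 - 1.0j], [2.0 + 1.0j, -0.5]])
psi = np.array([1.0, 1.0j]) / np.sqrt(2.0)

t, s = 0.7, 1e-6
psi_t = unitary_group(a, t) @ psi

# Central-difference approximation of d(psi_t)/dt.
derivative = (unitary_group(a, t + s) @ psi - unitary_group(a, t - s) @ psi) / (2 * s)

# Residual of a psi_t = i d(psi_t)/dt, cf. (5.351).
schrodinger_residual = np.linalg.norm(1j * derivative - a @ psi_t)
# The norm is conserved along the flow.
norm_drift = abs(np.linalg.norm(psi_t) - np.linalg.norm(psi))
# Group law u_s u_t = u_{s+t}.
group_law_residual = np.linalg.norm(
    unitary_group(a, 0.3) @ unitary_group(a, 0.4) - unitary_group(a, 0.7)
)

print(schrodinger_residual, norm_drift, group_law_residual)
```

All three printed quantities should be small; the first is limited by the finite-difference step rather than by the theory.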


#### Notes

## §5.1. Six basic mathematical structures of quantum mechanics

Wigner's Theorem was first stated by von Neumann and Wigner (1928), but the first proof appeared in Wigner (1931). See Bonolis (2004) and Scholz (2006) for some history. Instead of working with P1(*H*) with the bilinear trace form expressing the transition probabilities, one may also formulate and prove Wigner's Theorem in terms of the projective Hilbert space P*H* equipped with the Fubini–Study metric, in which case the relevant symmetries may be defined geometrically as isometries. See Freed (2012) for this proof, as well as Brody & Hughston (2001) for the underlying geometry. Kadison's Theorem may be traced back from Kadison (1965). See also Moretti (2013). Ludwig symmetries go back to Ludwig (1983); see also Kraus (1983). Our approach to von Neumann symmetries was inspired by Hamhalter (2004), and has a large pedigree in quantum logic. Bohr symmetries were introduced in Landsman & Lindenhovius (2016), where Theorem 5.4.6 was also proved.

## §5.2. The case *H* = C<sup>2</sup>

This material is partly based on Simon (1976). The covering map (5.46) has a nice geometric description: if Σ = C ∪ {∞} is the Riemann sphere, we have the well-known stereographic projection

$$S^2 \xrightarrow{\cong} \Sigma; \tag{5.353}$$

$$(x, y, z) \mapsto \frac{x + iy}{1 - z}.\tag{5.354}$$

If *u* ∈ *SU*(2) is given by (5.43), then the associated Möbius transformation

$$z \mapsto \frac{\alpha z + \beta}{-\overline{\beta}z + \overline{\alpha}}$$

is a bijection of Σ, whose associated transformation of *S*<sup>2</sup> is the rotation *R* = π˜(*u*).
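The claim that such a Möbius transformation acts on the sphere as a rotation can be tested numerically. The Python sketch below assumes NumPy and assumes that the parameters of (5.43), which is not reproduced in these notes, satisfy |α|<sup>2</sup> + |β|<sup>2</sup> = 1; the function names are hypothetical. It transports random points of *S*<sup>2</sup> through the stereographic projection (5.354), applies the Möbius map, and checks that inner products and orientation are preserved.

```python
import numpy as np

def stereographic(point):
    """Map (x, y, z) on S^2 (north pole excluded) to (x + i y) / (1 - z)."""
    x, y, z = point
    return (x + 1j * y) / (1.0 - z)

def inverse_stereographic(zeta):
    """Inverse of the stereographic projection on the finite complex plane."""
    r2 = abs(zeta) ** 2
    return np.array([2 * zeta.real, 2 * zeta.imag, r2 - 1.0]) / (r2 + 1.0)

def moebius(alpha, beta, zeta):
    """Moebius map z -> (alpha z + beta) / (-conj(beta) z + conj(alpha))."""
    return (alpha * zeta + beta) / (-np.conj(beta) * zeta + np.conj(alpha))

rng = np.random.default_rng(0)

# Random (alpha, beta) with |alpha|^2 + |beta|^2 = 1 (assumed form of SU(2)).
raw = rng.normal(size=4)
raw /= np.linalg.norm(raw)
alpha, beta = raw[0] + 1j * raw[1], raw[2] + 1j * raw[3]

# Random points on S^2 and their images under the induced map of the sphere.
points = rng.normal(size=(5, 3))
points /= np.linalg.norm(points, axis=1, keepdims=True)
images = np.array(
    [inverse_stereographic(moebius(alpha, beta, stereographic(p))) for p in points]
)

# A rotation preserves all inner products ...
gram_error = np.max(np.abs(points @ points.T - images @ images.T))
# ... and the orientation (signed volume) of any triple of points.
orientation_error = abs(np.linalg.det(points[:3]) - np.linalg.det(images[:3]))

print(gram_error, orientation_error)
```

The check does not identify which rotation is obtained, so it does not test the specific covering map (5.46).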

#### §5.3. Equivalence between the six symmetry theorems

Most proofs may be also found in Cassinelli et al (2004) or Moretti (2013).

#### §5.4. Proof of Jordan's Theorem

Our proof of Jordan's Theorem is taken from Bratteli & Robinson (1987); see also Thomsen (1982) for a simplification of the purely algebraic step (which we delegated to Theorem C.175), originally proved by Jacobson & Rickart (1950).

#### §5.5. Proof of Wigner's Theorem

There are many proofs of Wigner's Theorem, none of them really satisfactory (in this respect the situation is similar to Gleason's Theorem). Our proof follows Simon (1976), who in turn relies on Bargmann (1964) and Hunziker (1972). The proof in Cassinelli et al (2004) seems cleaner, but their proof of the additivity of their operator *T*<sup>ω</sup> is not easy to follow. For a geometric approach see Freed (2012).

If dim(*H*) ≥ 3, the conclusion of Wigner's Theorem follows if W merely preserves orthogonality (Uhlhorn, 1963). See also Cassinelli et al (2004). This, in turn, has been generalized in various directions, e.g. to indefinite inner product spaces (Molnár, 2002) as well as to certain Banach spaces, where one says that *x* is orthogonal to *y* if for all λ ∈ C one has ‖*x* + λ*y*‖ ≥ ‖*x*‖ (Blanco & Turnšek, 2006).

#### §5.6. Some abstract representation theory

Among numerous books on representation theory, our personal favourite is Barut & Rączka (1977); Gaal (1973) and Kirillov (1976) are also classics, at least for the abstract theory. An interesting recent paper on the unitary group on infinite-dimensional Hilbert space is Schottenloher (2013).

## §5.7. Representations of Lie groups and Lie algebras

This section was inspired by Hall (2013) and Knapp (1988). For Lie's Third Theorem, see, for example, Duistermaat & Kolk (2000), §1.14. To obtain Theorem 5.41, consider the canonical projection π̃ : *G̃* → *G* and define *D* = π̃<sup>−1</sup>({*e*}). This is a discrete normal subgroup of *G̃*, and it is an easy fact that a discrete normal subgroup of any connected topological group must lie in its center. Note that a discrete subgroup of the center of *G̃* is automatically normal.

The exponentiation problem for skew-adjoint representations of g is considerably more complicated than in finite dimension. Let *H* be an infinite-dimensional Hilbert space with dense subspace *D* and let ρ : g → *L*(*D*,*H*) be a linear map, where *L*(*D*,*H*) is the space of linear maps from *D* to *H*. We say that ρ is a *skew-adjoint representation* of g if *(i):* *D* is invariant under ρ(g), *(ii):* the commutation relations (5.157) hold on *D*, and *(iii):* each *i*ρ(*A*) is essentially self-adjoint on *D*. For example, we have seen that if *u* : *G* → *U*(*H*) is a unitary representation, then the construction ρ(*A*) = *u*′(*A*), defined on the Gårding domain *D* = *D<sub>G</sub>*, fits the bill. Conversely, additional conditions are needed for ρ to exponentiate to a unitary representation. The best-known of those is *Nelson's criterion*: if, given a skew-adjoint representation ρ : g → *L*(*D*,*H*), the *Nelson operator* or *Laplacian* Δ = ∑<sub>*k*=1</sub><sup>dim(g)</sup> ρ(*T<sub>k</sub>*)<sup>2</sup> is essentially self-adjoint on *D*, then ρ exponentiates to a unitary representation of *G̃* (with additional remarks similar to those in Corollary 5.43).

#### §5.8. Irreducible representations of *SU*(2)

#### §5.9. Irreducible representations of compact Lie groups

See e.g. Knapp (1988), Simon (1996), and Deitmar (2005), and innumerable other books. This material ultimately goes back to (É.) Cartan and Weyl.

#### §5.10. Symmetry groups and projective representations

See Varadarajan (1985), Tuynman & Wiegerinck (1987), Landsman (1998a), Cassinelli et al (2004), and Hall (2013). For different proofs of Theorem 5.59 (Bargmann, 1954) see Simms (1971) and Cassinelli et al (2004). Leaving out the anti-unitary symmetries is a pity; see e.g. Freed & Moore and Roberts (2016).

#### §5.11. Position, momentum, and free Hamiltonian

#### §5.12. Stone's Theorem

See Reed & Simon (1972), Schmüdgen (2012), Moretti (2013), Hall (2013), and many other books. Our proof of part 1 of Theorem 5.73 is original.

## Part II Between *C*0(*X*) and *B*(*H*)

## Chapter 6 Classical models of quantum mechanics

This chapter gives an introduction to a chain of results attempting to exclude deeper layers underneath quantum mechanics that restore some form of classical physics:

'[Such results] more or less illustrate the ways along which some opponents might hope to escape Bohr's reasonings and von Neumann's proof and the places where they are dangerously near breaking their necks.' (Groenewold, 1946, p. 454)

In so far as they are mathematically precise, such no-go results have their roots in von Neumann's 1932 book, which gave rise to two traditions that were often in polemical opposition to each other. Mathematically minded authors typically admired von Neumann's exclusion of hidden variables, yet tried to strengthen his theorem by weakening its assumptions; this sparked, for example, *Gleason's Theorem* (1957) as well as the *Kochen–Specker Theorem* (1967). Certain physicists (led by Bell), on the other hand, tried to circumvent (and later even ridicule) von Neumann's work. A high point of this tradition was *Bell's Theorem* from 1964, which was informed not only by von Neumann, but even more so by the famous Einstein–Podolsky–Rosen (EPR) paper from 1935, as well as by Bohm's deterministic pilot wave reformulation of quantum mechanics (1952). However, at the end of the day these traditions turned out to be not really divergent after all: Bell not only independently (and earlier) obtained a version of the Kochen–Specker Theorem, but, more importantly, his results from 1964 turn out to be very closely related to the culmination of the first tradition in the form of the so-called *Free Will Theorem* (FWT), which was published by Conway and Kochen during 2006–2008. Indeed, although its validity is uncontroversial, this theorem has been criticized on the following grounds:


One aim of this chapter is to clarify these matters, with the following conclusions:

	- a. Bell's arguments rely on probability theory (whereas the FWT does not).
	- b. The (optical) corner of quantum mechanics used in Bell's Theorem may be replaced by the corresponding experimental results, whereas the FWT uses uncontroversial yet untested predictions about massive spin-1 particles.
	- c. The FWT must assume perfect (EPR) correlations, which are difficult to realize and hence are avoided by later versions of Bell's Theorem (i.e. through the CHSH inequalities rather than the original Bell inequalities).
	- d. Like EPR, Bell and his followers focused on locality right from the beginning, and hence in Bell (1964) the inference is from locality to determinism. Conway and Kochen, on the other hand, resolve the contradiction their FWT established by inferring randomness of outcomes from freedom of settings.

We start with a very simple treatment of both von Neumann's argument against linear hidden variables and Kochen & Specker's refinement of it, in which von Neumann's controversial linearity assumption is decisively weakened so as to only apply to *commuting* operators; the Kochen–Specker Theorem excludes what are called *non-contextual quasi-linear hidden variables*. We then present what we see as a more transparent version of the FWT, whose key ingredient of replacing the noncontextuality assumption in the Kochen–Specker Theorem by a locality condition is preserved, but where this time the setting is completely deterministic. Freedom of choice then arises as a very natural independence assumption, and any threat of circularity is avoided: the conclusion is simply a contradiction between determinism, freedom of choice (i.e. of apparatus settings), locality, and quantum mechanics. Moreover, as we argue in §6.3, the philosophically precise concept of free will used in the assumptions of the FWT is what Lewis coined 'local miracle compatibilism'.

Following an interlude on the GHZ Theorem, which seamlessly fits into the given framework, we then turn to Bell's Theorems, which we compare with the FWT.

Finally, we give our own rigorous version of an argument first proposed by Colbeck and Renner to the effect that, under suitable freeness of choice and no-signaling conditions (similar to those in Bell's Theorem and the FWT), as long as they are compatible with quantum mechanics, hidden variables are at best irrelevant. In fact, this can only be proved under much stronger assumptions, obscuring the claim.

#### 6.1 From von Neumann to Kochen–Specker

Von Neumann's Theorem 6.2 below was the first technical result excluding some class of hidden variables underneath quantum mechanics, namely (in current parlance) *linear non-contextual hidden variables*. This terminology requires some explanation. First, theorems of this kind apparently accept the mathematical structure of the observables prescribed by the usual formalism of quantum theory, i.e., observables are identified with elements of the self-adjoint part

$$H\_n(\mathbb{C}) \equiv M\_n(\mathbb{C})\_\text{sa} = \{ a \in M\_n(\mathbb{C}) \mid a^\* = a \} \tag{6.1}$$

of the algebra *Mn*(C) of *n*×*n* matrices (this simple case suffices to make all points of conceptual interest). Short of introducing "hidden" *observables*, hidden variable theories propose the existence of hidden *states*, which either replace or supplement the usual quantum states (which in the case at hand would be density operators). Mimicking classical (statistical) physics, such states are interpreted as probability measures on some phase space *X*, whose points *x* ∈ *X* assign sharp values to quantum-mechanical observables. Naively, this is done through associated functions

$$V\_{\mathbf{x}} : H\_n(\mathbb{C}) \to \mathbb{R},\tag{6.2}$$

but in fact this choice already commits us to the first of two possibilities, which we pragmatically present as theories predicting measurement outcomes:


Definition 6.1. *A* non-contextual hidden variable *is a map V* : *Hn*(C) → R *that for each a* ∈ *Hn*(C)*, and in terms of the n*×*n unit matrix* 1*n, satisfies*

$$V(a^2) = V(a)^2;\tag{6.3}$$

$$V(1\_n) = 1.\tag{6.4}$$

*That is, V is* dispersion-free *as well as* normalized*, respectively.*

Theorem 6.2. *For n* ≥ 2*, non-zero linear dispersion-free maps V* : *Hn*(C) → R *do not exist. In particular, linear non-contextual hidden variables do not exist.*

*Proof.* Such maps extend to complex-linear dispersion-free maps *V* : *Mn*(C) → C by complex linearity, so that the theorem is equivalent to Proposition 2.10. □

As von Neumann perfectly well understood himself, his seemingly natural linearity assumption (given the mathematical structure of quantum mechanics unearthed by none other than he!) is unwarranted physically (and even mathematically, since eigenvalues and eigenstates, which should be the hallmark of dispersion-free states, are by no means linear in the underlying operator). This suggests the following:
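The failure of linearity at the level of eigenvalues is easy to see in the smallest case *n* = 2. The Python sketch below assumes NumPy and uses two Pauli matrices as an illustrative choice (they are not singled out in the text at this point). It lists the sums of eigenvalues that a linear map with values in the spectrum could assign to their sum and compares them with the actual eigenvalues of the sum.

```python
import numpy as np

# Two non-commuting elements of H_2(C) (illustrative choice).
sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])
sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])

spec_x = np.linalg.eigvalsh(sigma_x)              # eigenvalues -1, +1
spec_z = np.linalg.eigvalsh(sigma_z)              # eigenvalues -1, +1
spec_sum = np.linalg.eigvalsh(sigma_x + sigma_z)  # eigenvalues -sqrt(2), +sqrt(2)

# Values V(sigma_x) + V(sigma_z) available to a linear V that assigns
# an eigenvalue to each of the two matrices separately.
candidate_sums = {float(round(p + q, 12)) for p in spec_x for q in spec_z}

# Is any of these candidates an eigenvalue of the sum?
linear_assignment_possible = any(
    np.isclose(c, e) for c in candidate_sums for e in spec_sum
)

print(sorted(candidate_sums), spec_sum, linear_assignment_possible)
```

Since the two matrices do not commute, this obstruction says nothing against quasi-linear maps in the sense introduced next.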

Definition 6.3. *A map V* : *Hn*(C) → R *is called* quasi-linear *if for all s*,*t* ∈ R *and all a*,*b* ∈ *Hn*(C) that commute *(i.e., ab* = *ba) one has*

$$V(sa+tb) = sV(a) + tV(b). \tag{6.5}$$

As in the linear case, such a map uniquely extends to a map *V* : *Mn*(C) → C that is precisely a quasi-state in the sense of Definition 2.26. The following lemma will be useful, also showing that the above objections to linearity have been met.

Lemma 6.4. *Let V* : *Hn*(C) → R *be a quasi-linear non-contextual hidden variable.*


More generally, it follows from Theorem C.24 that if *H* is a Hilbert space and *V* : *B*(*H*)sa → R is a quasi-linear non-contextual hidden variable (or, equivalently, its complexification *V*<sup>C</sup> : *B*(*H*) → C is a dispersion-free quasi-state), then *V*(*a*) ∈ σ(*a*) (provided *a*∗ = *a*). This implies the above lemma, but we also provide a direct proof.

*Proof.* For any *b* ∈ *Hn*(C) with *ab* = *ba*, eq. (6.3) and quasi-linearity imply that

$$V(ab) = V(a)V(b);\tag{6.6}$$

just evaluate *V*((*a* ± *b*)<sup>2</sup>) = (*V*(*a*) ± *V*(*b*))<sup>2</sup>. Taking *b* = *a*<sup>2</sup> etc. and also invoking (6.4) then yields *V*(*p*(*a*)) = *p*(*V*(*a*)) for any polynomial *p*. If λ<sub>*i*</sub> are the eigenvalues of *a*, its characteristic polynomial *p*(*a*) = ∏<sub>*i*=1</sub><sup>*n*</sup>(*a* − λ<sub>*i*</sub>) satisfies *p*(*a*) = 0, so that *V*(*p*(*a*)) = 0 and hence *p*(*V*(*a*)) = 0, or, writing λ = *V*(*a*), ∏<sub>*i*=1</sub><sup>*n*</sup>(λ − λ<sub>*i*</sub>) = 0. This implies that λ = λ<sub>*i*</sub> for some *i*. The second claim is proved in a similar way. □
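For the record, the step leading to (6.6) may be written out as follows; this is a sketch using only the + sign, together with the observation that for commuting *a* and *b* the product *ab* is self-adjoint and the elements *a*<sup>2</sup>, *ab*, and *b*<sup>2</sup> commute with each other, so that quasi-linearity (6.5) applies to their sum:

$$
\begin{aligned}
V((a+b)^2) &= V(a^2 + 2ab + b^2) = V(a)^2 + 2V(ab) + V(b)^2; \\
V((a+b)^2) &= V(a+b)^2 = (V(a) + V(b))^2 = V(a)^2 + 2V(a)V(b) + V(b)^2.
\end{aligned}
$$

The first line uses quasi-linearity and (6.3) for *a* and *b* separately, the second uses (6.3) for *a* + *b* followed by quasi-linearity. Subtracting the two lines gives *V*(*ab*) = *V*(*a*)*V*(*b*).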

#### Theorem 6.5. *For n* ≥ 3*, quasi-linear non-contextual hidden variables do not exist.*

This is the *Kochen–Specker Theorem*. It follows from Gleason's Theorem 2.28 and von Neumann's Theorem 6.2, since according to Corollary 2.29 to the former, quasistates on *Mn*(C) are actually states (in other words, quasi-linear non-contextual hidden variables are linear). However, Kochen and Specker also gave a direct proof of their theorem, subsequently somewhat simplified along the following lines.

*Proof.* We prove the claim for *n* = 3, which (by restricting *V* to any self-adjoint subalgebra of *Mn*(C) isomorphic to *H*3(C)) implies the result for all *n* > 3 also. To prove Theorem 6.5 for *n* = 3, we interpret *H*3(C) as the algebra of observables of a spin-1 particle and introduce the well-known angular momentum matrices

$$J_1 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad J_2 = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix}, \quad J_3 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. \tag{6.7}$$

In what follows, we will heavily use the squares

$$J_1^2 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad J_2^2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad J_3^2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix},\tag{6.8}$$

each of which has eigenvalues 0 and 1. The *J<sub>i</sub>*<sup>2</sup> commute by inspection, and satisfy

$$J\_1^2 + J\_2^2 + J\_3^2 = 2 \cdot 1\_3. \tag{6.9}$$

The (matrix-valued) angular momentum vector is given by

$$\mathbf{J} = J\_1 \mathbf{e}\_1 + J\_2 \mathbf{e}\_2 + J\_3 \mathbf{e}\_3,\tag{6.10}$$

where (e<sub>1</sub>, e<sub>2</sub>, e<sub>3</sub>) is the standard basis of R<sup>3</sup> (seen as a vector space with the usual inner product ⟨·,·⟩), i.e., e<sub>1</sub> = (1,0,0), etc., and the angular momentum *J*<sub>u</sub> along an arbitrary unit vector u = ∑<sub>*i*</sub> *u<sub>i</sub>*e<sub>*i*</sub> in R<sup>3</sup> is given by

$$J\_{\mathbf{u}} = \langle \mathbf{J}, \mathbf{u} \rangle = \sum\_{i=1}^{3} J\_{i} u\_{i}. \tag{6.11}$$

This brings us to the crucial point: a map *V* : *H*<sub>3</sub>(C) → R induces a map *Ṽ* : *S*<sup>2</sup> → R on the set *S*<sup>2</sup> of all unit vectors u in R<sup>3</sup>, via

$$
\tilde{V}(\mathbf{u}) = V(J_\mathbf{u}^2). \tag{6.12}
$$

As usual, a *basis* of R3, denoted by *a* = (u1,u2,u3), is always assumed *orthonormal*.

Lemma 6.6. *Let V : H<sub>3</sub>(C) → R be a non-contextual quasi-linear hidden variable, with associated map Ṽ : S<sup>2</sup> → {0,1} given by (6.12). Then:*

*1. Ṽ(−u) = Ṽ(u) for each u ∈ S<sup>2</sup> (so that Ṽ is defined on the real projective plane);*

*2. If a = (u<sub>1</sub>, u<sub>2</sub>, u<sub>3</sub>) is a basis, then the triple Ṽ(a) ≡ (Ṽ(u<sub>1</sub>), Ṽ(u<sub>2</sub>), Ṽ(u<sub>3</sub>)) must contain a single 0 and two 1's, i.e., Ṽ(a) must be one of the triples*

$$
\begin{aligned}
\lambda^{(1)} &= (0,1,1); \\
\lambda^{(2)} &= (1,0,1); \\
\lambda^{(3)} &= (1,1,0).
\end{aligned}
\tag{6.13}
$$

In Gleason-like language, *Ṽ* is a 2-valued frame function of weight *w*(*Ṽ*) = 2.

*Proof.* If *a* = (u<sub>1</sub>, u<sub>2</sub>, u<sub>3</sub>) is a basis, then *J*<sub>u<sub>*i*</sub></sub> = *uJ<sub>i</sub>u*<sup>∗</sup> for *i* = 1,2,3, where *u* is the 3×3 matrix with entries *u<sub>ij</sub>* = ⟨u<sub>*i*</sub>, e<sub>*j*</sub>⟩. Since *u* is unitary, the matrices *J*<sub>u<sub>*i*</sub></sub> and their squares have the same eigenvalues and satisfy the same relations as the *J<sub>i</sub>* and their squares. Thus the eigenvalues of *J*<sub>u<sub>*i*</sub></sub><sup>2</sup> are 0 and 1, for fixed *a* the squares *J*<sub>u<sub>*i*</sub></sub><sup>2</sup> mutually commute, and they satisfy the sum rule (6.9), i.e., *J*<sub>u<sub>1</sub></sub><sup>2</sup> + *J*<sub>u<sub>2</sub></sub><sup>2</sup> + *J*<sub>u<sub>3</sub></sub><sup>2</sup> = 2·1<sub>3</sub>, so *Ṽ*(u<sub>1</sub>) + *Ṽ*(u<sub>2</sub>) + *Ṽ*(u<sub>3</sub>) = 2. The claim then follows from Definition 6.3 and Lemma 6.4. □
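The three facts used in this proof (spectrum {0, 1}, mutual commutativity, and the sum rule) can be confirmed numerically for a randomly chosen basis. The Python sketch below assumes NumPy; the helper `J_along` and the way the basis is generated are illustrative choices, not part of the text.

```python
import numpy as np

# Angular momentum matrices (6.7) of a spin-1 particle.
J = np.array([
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[0, 0, 1j], [0, 0, 0], [-1j, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
])

def J_along(u):
    """Angular momentum J_u = sum_i u_i J_i along the vector u, cf. (6.11)."""
    return np.tensordot(u, J, axes=1)

# A random orthonormal basis of R^3 from a QR decomposition (rows of `basis`).
rng = np.random.default_rng(1)
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
basis = q.T

squares = [J_along(u) @ J_along(u) for u in basis]

# Largest commutator norm among the three squares (should vanish).
max_commutator = max(
    np.linalg.norm(s @ t - t @ s) for s in squares for t in squares
)
# Deviation from the sum rule (6.9).
sum_rule_error = np.linalg.norm(sum(squares) - 2 * np.eye(3))
# Eigenvalues of each square, in ascending order and rounded.
spectra = [np.round(np.linalg.eigvalsh(s), 10) for s in squares]

print(max_commutator, sum_rule_error, spectra)
```

Each entry of `spectra` should consist of one eigenvalue 0 and two eigenvalues 1, in line with the admissible triples (6.13).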

Now define a *coloring* of R<sup>3</sup> as any map *Ṽ* : *S*<sup>2</sup> → {0,1} satisfying the two properties in Lemma 6.6. The proof of Theorem 6.5 then reduces to the following lemma.

Lemma 6.7. *There exists no coloring of* R3*.*

*Proof.* Take the following unit vectors (some identical), grouped into 11 bases (for simplicity we use unnormalized vectors, e.g., (1,0,1) stands for (1/√2, 0, 1/√2)):


We will show that one cannot even color this particular finite set of vectors (let alone all unit vectors in R<sup>3</sup>). We denote a vector u<sub>*i*</sub> in a basis *a*<sub>μ</sub> by

$$\mathbf{u}_{i}^{(\mu)}, \quad i = 1, 2, 3, \quad \mu = 1, \dots, 11,$$

and write e.g. *Ṽ*(*a*<sub>μ</sub>) = (0,1,1) for the three conditions

$$\tilde{V}(\mathbf{u}_1^{(\mu)}) = 0, \quad \tilde{V}(\mathbf{u}_2^{(\mu)}) = 1, \quad \tilde{V}(\mathbf{u}_3^{(\mu)}) = 1.$$

The main point is that if some coloring *Ṽ* maps a specific vector u to 0, then all vectors orthogonal to u must go to 1. In particular, two orthogonal vectors can never both be sent to 0. To find a contradiction (to the assumption that *Ṽ* exists), we try to assign values *Ṽ*(u<sub>*i*</sub><sup>(μ)</sup>) one after the other, starting in row 1. Here some specific choices will be made, but by symmetry other choices lead to similar contradictions.


$$
\tilde{V}(a\_{11}) = (1, 1, 1). \tag{6.14}
$$

But (1,1,1) is not an admissible value of *Ṽ*! So *Ṽ* and hence *V* cannot exist. □

Corollary 6.8. *There is no function Ṽ with the two properties stated in Lemma 6.6.*

The Kochen–Specker Theorem is often stated in the following way.

Definition 6.9. *For any finite-dimensional Hilbert space H, a* coloring *of the set* P1(*H*) *of one-dimensional projections on H is a function*

$$W: \mathcal{P}\_1(H) \to \{0, 1\}$$

*such that for any resolution of the identity* (*ei*) *with ei* ∈ P1(*H*)*, i.e.,*

$$e\_i e\_j = \delta\_{ij} e\_i;\tag{6.15}$$

$$\sum\_{i} e\_i = 1\_H,\tag{6.16}$$

*one has*

$$\sum\_{i} W(e\_i) = 1,\tag{6.17}$$

*so that there is exactly one member ei of the family such that W*(*ei*) = 1*.*

Note that if *e* ∈ P1(*H*) then *e* = *e*<sub>ψ</sub> = |ψ⟩⟨ψ| for some unit vector ψ ∈ *H*, so that each basis (υ<sub>*i*</sub>) of *H* defines such a family by *e<sub>i</sub>* = |υ<sub>*i*</sub>⟩⟨υ<sub>*i*</sub>|, and *vice versa*, up to phase factors. The setting of Gleason's Theorem is similar, with the crucial difference that the function on P1(*H*) in question then takes values in [0,1] instead of {0,1} and hence can be shown to exist, even amply so (as there are many states).

#### Theorem 6.10. *If* dim(*H*) > 2*, there exists no coloring of* P1(*H*)*.*

*Proof.* For *H* = C<sup>3</sup>, the existence of *W* would yield the existence of *Ṽ* through

$$
\tilde{V}(\mathbf{u}) = 1 - W(e\_\mathbf{u}),
\tag{6.18}
$$

where u ∈ R<sup>3</sup> is regarded as a vector in C<sup>3</sup>. Property 1 in Lemma 6.6 is obviously satisfied. To prove property 2, we note that for any unit vector u ∈ R<sup>3</sup> ⊂ C<sup>3</sup>, we have

$$J\_\mathbf{u}^2 \mathbf{u} = \mathbf{0},\tag{6.19}$$

since an explicit computation based on (6.11) shows that, with u = (*u*1,*u*2,*u*3),

$$J^2\_\mathbf{u} = \begin{pmatrix} u\_2^2 + u\_3^2 & -u\_1 u\_2 & -u\_1 u\_3 \\ -u\_1 u\_2 & u\_1^2 + u\_3^2 & -u\_2 u\_3 \\ -u\_1 u\_3 & -u\_2 u\_3 & u\_1^2 + u\_2^2 \end{pmatrix}. \tag{6.20}$$

It follows from rotation invariance that the eigenvalues of *J*<sub>u</sub><sup>2</sup> are the same as those of each *J<sub>i</sub>*<sup>2</sup>, cf. (6.8), i.e., λ = 0 with multiplicity one and λ = 1 with multiplicity two. Hence (6.19) gives the projection *e*<sub>0</sub> onto the eigenspace of *J*<sub>u</sub><sup>2</sup> for λ = 0 as

$$e\_0 = |\mathbf{u}\rangle\langle\mathbf{u}| \equiv e\_{\mathbf{u}}.\tag{6.21}$$

Property 2 in Lemma 6.6 then follows from the assumption that *W* is a coloring. Since *Ṽ* cannot exist by Lemma 6.7, neither can *W*. This proves the claim for C<sup>3</sup>.
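The explicit computation behind (6.19) to (6.21) can be reproduced for a sample unit vector. The Python sketch below assumes NumPy; the vector `u` and the function names are arbitrary illustrative choices. It compares the square of the matrix (6.11) with the right-hand side of (6.20), checks that u lies in its kernel, and confirms that the projection onto that kernel is |u⟩⟨u|.

```python
import numpy as np

# Angular momentum matrices (6.7).
J1 = np.array([[0, 0, 0], [0, 0, -1j], [0, 1j, 0]])
J2 = np.array([[0, 0, 1j], [0, 0, 0], [-1j, 0, 0]])
J3 = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])

def J_squared(u):
    """Square of J_u = u_1 J_1 + u_2 J_2 + u_3 J_3, cf. (6.11)."""
    Ju = u[0] * J1 + u[1] * J2 + u[2] * J3
    return Ju @ Ju

def explicit_formula(u):
    """Right-hand side of (6.20)."""
    u1, u2, u3 = u
    return np.array([
        [u2**2 + u3**2, -u1 * u2, -u1 * u3],
        [-u1 * u2, u1**2 + u3**2, -u2 * u3],
        [-u1 * u3, -u2 * u3, u1**2 + u2**2],
    ])

u = np.array([2.0, -1.0, 2.0]) / 3.0  # an arbitrary unit vector in R^3
Ju2 = J_squared(u)

formula_error = np.linalg.norm(Ju2 - explicit_formula(u))               # (6.20)
kernel_error = np.linalg.norm(Ju2 @ u)                                  # (6.19)
projection_error = np.linalg.norm((np.eye(3) - Ju2) - np.outer(u, u))   # (6.21)

print(formula_error, kernel_error, projection_error)
```

The last check uses that a matrix with eigenvalues 0 (once) and 1 (twice) equals the unit matrix minus the projection onto its kernel.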

We finish by induction. Suppose C<sup>*n*</sup> contains some set {u<sub>*k*</sub>}<sub>*k*∈*K*</sub> of unit vectors that cannot be colored, assuming that u<sub>0</sub> = (1,0,...,0) lies in this set. We embed each u<sub>*k*</sub> into C<sup>*n*+1</sup> by adding a zero *at the end*, calling the image u′<sub>*k*</sub>. Adding v = (0,...,0,1), the only possible coloring of the set {u′<sub>*k*</sub>, v}<sub>*k*∈*K*</sub> in C<sup>*n*+1</sup> is given by *W*(u′<sub>*k*</sub>) = 0 for each *k* ∈ *K* and *W*(v) = 1. Indeed, if *W*(u′<sub>*k*<sub>0</sub></sub>) = 1 for some *k*<sub>0</sub>, then, since v is orthogonal to each u′<sub>*k*</sub>, we must have *W*(v) = 0, which means that the original set {u<sub>*k*</sub>}<sub>*k*∈*K*</sub> should be colorable in C<sup>*n*</sup>, but this is impossible by assumption.

We now embed each u<sub>*k*</sub> into C<sup>*n*+1</sup> by adding a zero *at the beginning*, denoting its image by u″<sub>*k*</sub>, and add u′<sub>0</sub> = (1,0,...,0,0). By the same token, the only coloring of the set {u″<sub>*k*</sub>, u′<sub>0</sub>}<sub>*k*∈*K*</sub> is given by *W*(u″<sub>*k*</sub>) = 0 for each *k* ∈ *K* and *W*(u′<sub>0</sub>) = 1. But this leaves the set {u′<sub>*k*</sub>, u″<sub>*k*</sub>, v}<sub>*k*∈*K*</sub> in C<sup>*n*+1</sup> uncolorable, since colorability of {u′<sub>*k*</sub>, v}<sub>*k*∈*K*</sub> gave *W*(u′<sub>0</sub>) = 0, whereas colorability of {u″<sub>*k*</sub>, u′<sub>0</sub>}<sub>*k*∈*K*</sub> gave *W*(u′<sub>0</sub>) = 1. □

The set thus obtained is larger than necessary. For example, already for *H* = C<sup>4</sup> the following bases cannot be colored (again writing down unnormalized vectors):


The proof is the following observation: if we present the coloring condition as

$$W(0,0,0,1) + W(0,0,1,0) + W(1,1,0,0) + W(1,-1,0,0) = 1; \tag{$a_1$}$$

$$\cdots$$

$$W(1,1,1,-1) + W(-1,1,1,1) + W(1,0,0,1) + W(0,1,-1,0) = 1, \tag{$a_9$}$$

then since there are nine such equations the sum of the right-hand sides is odd, whereas the sum of the left-hand sides is even, since each vector appears twice.
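The parity argument can be cross-checked by brute force. Since the table of nine bases is not reproduced above, the Python sketch below rests on an assumption: it uses the 18-vector set of Cabello, Estebaranz and García-Alcaine, whose first and last rows coincide with (*a*<sub>1</sub>) and (*a*<sub>9</sub>) as displayed; the intermediate rows may be ordered differently in the original table. The code verifies orthogonality within each basis, counts how often each vector occurs, and searches all 2<sup>18</sup> assignments for a coloring.

```python
from collections import Counter

# ASSUMPTION: nine orthogonal bases of C^4 (unnormalized), taken from the
# Cabello-Estebaranz-Garcia-Alcaine 18-vector set; rows 1 and 9 agree with
# the displayed conditions (a_1) and (a_9).
BASES = [
    [(0, 0, 0, 1), (0, 0, 1, 0), (1, 1, 0, 0), (1, -1, 0, 0)],
    [(0, 0, 0, 1), (0, 1, 0, 0), (1, 0, 1, 0), (1, 0, -1, 0)],
    [(1, -1, 1, -1), (1, -1, -1, 1), (1, 1, 0, 0), (0, 0, 1, 1)],
    [(1, -1, 1, -1), (1, 1, 1, 1), (1, 0, -1, 0), (0, 1, 0, -1)],
    [(0, 0, 1, 0), (0, 1, 0, 0), (1, 0, 0, 1), (1, 0, 0, -1)],
    [(1, -1, -1, 1), (1, 1, 1, 1), (1, 0, 0, -1), (0, 1, -1, 0)],
    [(1, 1, -1, 1), (1, 1, 1, -1), (1, -1, 0, 0), (0, 0, 1, 1)],
    [(1, 1, -1, 1), (-1, 1, 1, 1), (1, 0, 1, 0), (0, 1, 0, -1)],
    [(1, 1, 1, -1), (-1, 1, 1, 1), (1, 0, 0, 1), (0, 1, -1, 0)],
]

def dot(v, w):
    """Euclidean inner product of two integer vectors."""
    return sum(x * y for x, y in zip(v, w))

# Every row must consist of four mutually orthogonal vectors.
all_orthogonal = all(
    dot(v, w) == 0
    for basis in BASES
    for i, v in enumerate(basis)
    for w in basis[i + 1:]
)

# Each distinct vector should occur in exactly two bases.
occurrences = Counter(v for basis in BASES for v in basis)
vectors = sorted(occurrences)

# Encode each basis as a bitmask over the distinct vectors.
index = {v: k for k, v in enumerate(vectors)}
masks = [sum(1 << index[v] for v in basis) for basis in BASES]

def is_coloring(assignment):
    """True if exactly one vector of every basis carries the value 1."""
    return all(bin(assignment & m).count("1") == 1 for m in masks)

# Exhaustive search over all 0/1 assignments to the distinct vectors.
colorings = [w for w in range(1 << len(vectors)) if is_coloring(w)]

print(all_orthogonal, len(vectors), set(occurrences.values()), len(colorings))
```

The exhaustive search is redundant given the parity argument, but it guards against transcription errors in the list of vectors.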

To bridge the gap between the Kochen–Specker Theorem and the Free Will Theorem, as well as the one between mathematics and physics, we now rephrase the former as a "mini FWT". We build an experiment consisting of a box containing a spin-1 particle and a device capable of measuring all of the three observables

$$(J\_{\mathbf{u}\_1}^2, J\_{\mathbf{u}\_2}^2, J\_{\mathbf{u}\_3}^2)$$

for an arbitrary basis *a* of R<sup>3</sup>; since the operators in question commute, this simultaneous measurement is allowed by quantum theory. The choice of *a* is called the *setting* of the experiment, traditionally denoted by *A* (in honor of Alice, who is supposed to perform the experiment), with possible values *A* = *a*. In "phenomenological" notation, the observable measured in an experiment like this is called *F*, which in the case at hand has three components *F* = (*F*<sub>1</sub>, *F*<sub>2</sub>, *F*<sub>3</sub>): given the setting *a*, the observable *F<sub>i</sub>* corresponds to *J*<sub>u<sub>*i*</sub></sub><sup>2</sup>. The notation *F* = λ for λ = (λ<sub>1</sub>, λ<sub>2</sub>, λ<sub>3</sub>), i.e., *F<sub>i</sub>* = λ<sub>*i*</sub>, then expresses the fact that the outcome of a measurement of *F* is λ.

According to both quantum mechanics and our quasi-linear non-contextual hidden variable theory, either λ*<sup>i</sup>* = 0 or λ*<sup>i</sup>* = 1, and λ must lie in the value space

$$\Lambda = \{ (0, 1, 1), (1, 0, 1), (1, 1, 0) \};\tag{6.22}$$

cf. Lemma 6.6 for the hidden variable theory, while in quantum mechanics (6.22) follows from the fact that λ must lie in the joint spectrum of the three operators *J*<sub>u<sub>*i*</sub></sub><sup>2</sup>. This, in turn, means that there must be a joint eigenvector ψ such that *J*<sub>u<sub>*i*</sub></sub><sup>2</sup>ψ = λ<sub>*i*</sub>ψ for each *i* = 1,2,3. There are three such joint eigenvectors, namely u<sub>1</sub>, u<sub>2</sub>, and u<sub>3</sub> (initially defined as vectors in R<sup>3</sup> but now seen as vectors in C<sup>3</sup>), with joint eigenvalues (0,1,1), (1,0,1), and (1,1,0), respectively.

Otherwise, quantum mechanics and our quasi-linear non-contextual hidden variable theory provide a different picture of the experiment. According to the former theory, a given spin-1 particle may be prepared in a (pure) quantum state ψ, which is a unit vector in C3. Quantum theory then merely predicts probabilities

$$P_{\psi}(F=\lambda|A=a) \equiv p_{J_{\mathbf{u}_1}^2, J_{\mathbf{u}_2}^2, J_{\mathbf{u}_3}^2}(\lambda_1, \lambda_2, \lambda_3),\tag{6.23}$$

for the possible outcomes λ, which according to the Born rule (2.21) are given by

$$P_{\psi}(F = \lambda^{(i)} | A = a) = |\langle \mathbf{u}_{i}, \psi \rangle|^{2}. \tag{6.24}$$

So if ψ = u<sub>*i*</sub>, then the outcome will be λ = λ<sup>(*i*)</sup> with probability one, but in a superposition ψ = ∑<sub>*i*</sub> *c<sub>i</sub>*u<sub>*i*</sub> (with ∑<sub>*i*</sub> |*c<sub>i</sub>*|<sup>2</sup> = 1), quantum theory predicts a random sequence of outcomes λ<sup>(*i*)</sup>, each with probability |*c<sub>i</sub>*|<sup>2</sup>.
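As a concrete instance of (6.24), the following Python sketch (NumPy assumed; the setting `basis`, the coefficients, and the function name are arbitrary illustrative choices) computes the Born probabilities of the three outcomes for a superposition, and then sums them over the outcomes sharing the same first component, which is the marginal for *F*<sub>1</sub> alone.

```python
import numpy as np

# Outcome triples of (6.13): lambda^(i) has its single 0 in position i.
OUTCOMES = [(0, 1, 1), (1, 0, 1), (1, 1, 0)]

def born_probabilities(basis, psi):
    """Probabilities (6.24) of the outcomes lambda^(i), given the setting `basis`."""
    return np.array([abs(np.vdot(u, psi)) ** 2 for u in basis])

# Setting a = (u_1, u_2, u_3): the standard basis rotated about e_3.
c, s = np.cos(0.3), np.sin(0.3)
basis = np.array([[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]])

# A superposition psi = sum_i c_i u_i with complex coefficients c_i.
coefficients = np.array([0.6, 0.0, 0.8j])
psi = coefficients @ basis

probabilities = born_probabilities(basis, psi)

# Marginal for F_1: sum over all outcomes with the same value of lambda_1.
marginal_F1 = {
    value: sum(p for lam, p in zip(OUTCOMES, probabilities) if lam[0] == value)
    for value in (0, 1)
}

print(probabilities, marginal_F1)
```

With these coefficients the probabilities are |*c<sub>i</sub>*|<sup>2</sup> = 0.36, 0, and 0.64, so that *F*<sub>1</sub> = 0 has probability 0.36 and *F*<sub>1</sub> = 1 has probability 0.64.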

Let us note that quantum mechanics is non-contextual in the following (probabilistic) sense. Alice could decide to perform just one measurement instead of three, say *F*<sub>1</sub>, with setting *a*<sub>1</sub> = u<sub>1</sub>, or perhaps she may not know if the other two are performed. Fortunately, this does not matter, since for any unit vector ψ ∈ C<sup>3</sup>,

$$P_{\psi}(F_1 = \lambda_1 | A_1 = \mathbf{u}_1) = \sum_{\lambda_2, \lambda_3} P_{\psi}(F = \lambda | A = a),\tag{6.25}$$

so that according to quantum mechanics, it does not matter for the Born probabilities of the first measurement if the other two are performed or not.

The question now arises if some quasi-linear non-contextual hidden variable theory could improve on this, in that the *probabilities* quantum theory assigns to various outcomes are replaced by *predictions*. In the spirit of determinism (whilst avoiding the appearance of circularity), such a theory should also predict the settings of the experiment. Accordingly, the assumptions leading to our "mini FWT" are:

Definition 6.11. *In the context of the experiment on spin-1 particles just discussed:*

• Determinism firstly *means that there is a state space X with associated functions*

$$A: X \to X_A;\tag{6.26}$$

$$F: X \to \Lambda,\tag{6.27}$$

*where X<sub>A</sub> is the set of all bases in R<sup>3</sup> (i.e. a ∈ X<sub>A</sub>), and Λ is some set of possible outcomes; these functions completely describe the experiment in the sense that each state x ∈ X determines both its settings a = A(x) and its outcome λ = F(x). Here A = (A<sub>1</sub>, A<sub>2</sub>, A<sub>3</sub>), where the functions A<sub>i</sub> : X → S<sup>2</sup> (seen as the space of unit vectors in R<sup>3</sup>) combine to define a basis, and F = (F<sub>1</sub>, F<sub>2</sub>, F<sub>3</sub>), where F<sub>i</sub> : X → R.*


Secondly*, there exists some set XZ and an additional function*

$$Z: X \to X_Z,\tag{6.28}$$

*such that*

$$F = \hat{F}(A, Z). \tag{6.29}$$

*More precisely, for each x ∈ X one has*

$$F(x) = \hat{F}(A(x), Z(x)) \tag{6.30}$$

*for a certain function F̂ : X<sub>A</sub> × X<sub>Z</sub> → Λ. Also this function is, of course, a triple F̂ = (F̂<sub>1</sub>, F̂<sub>2</sub>, F̂<sub>3</sub>), where F̂<sub>i</sub> : X<sub>A</sub> × X<sub>Z</sub> → 2. In terms of (6.28), then:*


$$A \times Z: X \to X\_A \times X\_Z$$

$$x \mapsto (A(\mathbf{x}), Z(\mathbf{x})) \tag{6.31}$$

*is surjective; in other words, for each* (*a*,*z*) ∈ *XA* ×*XZ there is an x* ∈ *X for which A*(*x*) = *a and Z*(*x*) = *z (making a and z free variables).*

• Non-contextuality *(cf. Lemma 6.6) finally stipulates that F̂ take the form*

$$\hat{F}((\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3), z) = (\tilde{F}(\mathbf{u}_1, z), \tilde{F}(\mathbf{u}_2, z), \tilde{F}(\mathbf{u}_3, z)), \tag{6.32}$$

*for a single function F̃ : S<sup>2</sup> × X<sub>Z</sub> → 2 that also satisfies*

$$
\tilde{F}(-\mathbf{u}, z) = \tilde{F}(\mathbf{u}, z). \tag{6.33}
$$

"Nature" may be taken to be either an experimental result or an uncontroversial prediction of (some corner of) quantum mechanics. The function *Z* (including its domain *X<sub>Z</sub>*) describes anything relevant to the experiment (such as the behaviour of the particle) *except* the variables determining the settings (which do form part of *X*). The goal of the freedom assumption is to remove any potential dependencies between the variables (*a*,*z*), and hence between the physical system Alice performs her measurements *on*, and the devices she performs her measurements *with*.

Corollary 6.12. *Determinism, Nature, Freedom, and Non-contextuality are contradictory.*

*Proof.* For each *z* ∈ *X<sub>Z</sub>*, define a function *Ṽ<sub>z</sub>* : *S*<sup>2</sup> → 2 by *Ṽ<sub>z</sub>*(u) = *F̃*(u, *z*). The assumptions combine to give *Ṽ<sub>z</sub>* the same properties as *Ṽ* in Lemma 6.6 (where *z* "goes along for a free ride"). According to Corollary 6.8 (which applies because by *Freedom* one can freely vary *a* for any given *z*), the function *Ṽ<sub>z</sub>* cannot exist. □

This "mini FWT" is a good exercise for the Free Will Theorem in the next section. For example, let us note, as a warning, that if Determinism is seen as the culprit (and hence falls), then the other assumptions in the (mini) FWT are no longer defined. This blocks a direct inference from Freedom to Indeterminism à la Conway & Kochen.

#### 6.2 The Free Will Theorem

The Free Will Theorem is similar in spirit to Corollary 6.12, with the difference that the experiment now has two wings and the *non-contextuality* assumption is replaced by a certain *locality* condition. This condition relates to the setting introduced by Einstein, Podolsky, and Rosen in 1935 and further studied by Bohm, Bell, and others, in which (in current jargon) two physicists, called Alice and Bob, are far apart whilst performing simultaneous experiments on some correlated two-particle state (technically speaking, their measurements need to be *spacelike separated*). In the situation considered by EPR each particle had a spatial degree of freedom and hence required the infinite-dimensional Hilbert space $L^2(\mathbb{R}^3)$ for its description, but, as recognized by Bohm, the thrust of the argument comes out more clearly if each particle merely has an internal degree of freedom (and is "frozen" otherwise).

Bell (1964) considered a pair of spin-$\tfrac{1}{2}$ particles (cf. §6.5), each of which has Hilbert space $\mathbb{C}^2$ (although the famous experiments of Aspect testing the violation of Bell's inequalities used photons, which have the "same" Hilbert space), but because of its reliance on the Kochen–Specker Theorem (which fails for $\mathbb{C}^2$) the Free Will Theorem requires one dimension more, i.e., $H = \mathbb{C}^3$. As before, we see this as the state space of a massive spin-1 particle. The price of this extra dimension is that the pertinent experiment whose outcome provides the *Nature* input for the Free Will Theorem has not actually been performed, but, as in the Bell case, the predictions of quantum mechanics are uncontroversial and will serve as input instead.

These predictions are as follows. Alice and Bob measure on the correlated state

$$
\Psi\_0 = (\mathbf{e}\_1 \otimes \mathbf{e}\_1 + \mathbf{e}\_2 \otimes \mathbf{e}\_2 + \mathbf{e}\_3 \otimes \mathbf{e}\_3) / \sqrt{3},\tag{6.34}
$$

where we recall that $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$ is the standard basis of $\mathbb{R}^3$, now seen as a basis of $\mathbb{C}^3$. This state is rotation-invariant, which means that nonzero angular momentum in one particle must be compensated for in the other, creating the desired correlations.

As before, we denote Alice's setting by *A* = *a*, which remains the choice of some basis of $\mathbb{R}^3$, but this time also Bob picks some basis *b*, so that we write *B* = *b* for his choice. Similar to Alice's outcome *F* = λ we denote Bob's by *G* = γ, and quantum mechanics provides all (Born) probabilities

$$P_{\Psi_0}(F=\lambda, G=\gamma|A=a, B=b) \equiv p_{J^2_{\mathbf{u}_1}, J^2_{\mathbf{u}_2}, J^2_{\mathbf{u}_3}, J^2_{\mathbf{v}_1}, J^2_{\mathbf{v}_2}, J^2_{\mathbf{v}_3}}(\lambda_1, \lambda_2, \lambda_3, \gamma_1, \gamma_2, \gamma_3),$$

which are well defined because Alice's squared angular momentum operators $J^2_{\mathbf{u}_i}$ commute with Bob's $J^2_{\mathbf{v}_j}$ as a consequence of Einstein locality (stating that spacelike separated observables commute). Note that similarly to $a = (\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3)$ for Alice's basis, we write $b = (\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3)$ for Bob's. If Alice merely measures $F_i$ whilst Bob measures $G_j$, then, as in the previous section, it does not matter which other (commuting) operators are measured and/or whether Alice and Bob know about this, cf. (6.25). Thus we may write either $(A = a, B = b)$ or $(A_i = \mathbf{u}_i, B_j = \mathbf{v}_j)$ for the settings, and simple calculations show that the Born probabilities are given by:

$$P_{\Psi_0}(F_i = 1, G_j = 1 | A = a, B = b) = \frac{1}{3} (1 + \langle \mathbf{u}_i, \mathbf{v}_j \rangle^2);\tag{6.35}$$

$$P_{\Psi_0}(F_i = 0, G_j = 0 | A = a, B = b) = \frac{1}{3} \langle \mathbf{u}_i, \mathbf{v}_j \rangle^2;\tag{6.36}$$

$$P_{\Psi_0}(F_i = 1, G_j = 0 | A = a, B = b) = \frac{1}{3} (1 - \langle \mathbf{u}_i, \mathbf{v}_j \rangle^2);\tag{6.37}$$

$$P_{\Psi_0}(F_i = 0, G_j = 1 | A = a, B = b) = \frac{1}{3} (1 - \langle \mathbf{u}_i, \mathbf{v}_j \rangle^2), \tag{6.38}$$

where $\langle \mathbf{u}_i, \mathbf{v}_j \rangle^2 = |\langle \mathbf{u}_i, \mathbf{v}_j \rangle|^2$, etc., since the vectors are real. In terms of the notation

$$P_{\Psi_0}(F_i = G_j|\cdot) = P_{\Psi_0}(F_i = 0, G_j = 0|\cdot) + P_{\Psi_0}(F_i = 1, G_j = 1|\cdot);\tag{6.39}$$

$$P_{\Psi_0}(F_i \neq G_j | \cdot) = P_{\Psi_0}(F_i = 0, G_j = 1 | \cdot) + P_{\Psi_0}(F_i = 1, G_j = 0 | \cdot), \tag{6.40}$$

this yields

$$P_{\Psi_0}(F_i = G_j | A = a, B = b) = \frac{1}{3} (1 + 2 \langle \mathbf{u}_i, \mathbf{v}_j \rangle^2);\tag{6.41}$$

$$P_{\Psi_0}(F_i \neq G_j | A = a, B = b) = \frac{2}{3} (1 - \langle \mathbf{u}_i, \mathbf{v}_j \rangle^2). \tag{6.42}$$

The crucial point for the Free Will Theorem is that this implies *perfect correlation*:

$$P_{\Psi_0}(F_i = G_j | A_i = B_j) = 1,\tag{6.43}$$

in agreement with the intuition about angular momentum expressed earlier.
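The probabilities (6.35)-(6.38) can be checked numerically. The following is a minimal sketch, not the author's own computation: it assumes the real-basis representation in which $J_{\mathbf{u}}^2 = 1 - |\mathbf{u}\rangle\langle\mathbf{u}|$ (so that outcome 0 projects onto $\mathbf{u}$), and the helper names `unit`, `projector`, `born` and `closed_form` are hypothetical.

```python
import numpy as np

def unit(u):
    """Normalise a real 3-vector."""
    u = np.asarray(u, dtype=float)
    return u / np.linalg.norm(u)

def projector(u, outcome):
    """Spectral projection of J_u^2 for outcome 0 or 1.

    Assumption: J_u^2 = 1 - |u><u| in the real basis (e1, e2, e3).
    """
    onto_u = np.outer(unit(u), unit(u))
    return onto_u if outcome == 0 else np.eye(3) - onto_u

# Psi_0 = (e1 (x) e1 + e2 (x) e2 + e3 (x) e3) / sqrt(3), cf. (6.34)
PSI_0 = np.eye(3).reshape(9) / np.sqrt(3.0)

def born(u, v, f, g):
    """Born probability P(F_i = f, G_j = g) for axes u (Alice) and v (Bob)."""
    joint = np.kron(projector(u, f), projector(v, g))
    return float(PSI_0 @ joint @ PSI_0)

def closed_form(u, v, f, g):
    """Right-hand sides of (6.35)-(6.38)."""
    c2 = float(unit(u) @ unit(v)) ** 2
    if f == g:
        return (1 + c2) / 3 if f == 1 else c2 / 3
    return (1 - c2) / 3

# Compare the matrix computation with the closed formulas for one pair of axes.
u, v = [1, 2, 2], [0, 3, 4]
for f in (0, 1):
    for g in (0, 1):
        print(f, g, round(born(u, v, f, g), 6), round(closed_form(u, v, f, g), 6))
```

Setting `v = u` makes the two mixed probabilities vanish, which is the perfect correlation (6.43).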

We now move to a (possibly counterfactual) deterministic description of this experiment along the lines of the previous section. It is straightforward to adapt all of Definition 6.11 except Non-contextuality (which after all is the assumption we would like to get rid of!). With the obvious changes, we obtain:

• *Determinism* again *first* claims there is a state space *X* with associated functions

$$A: X \to X_A;\tag{6.44}$$

$$B: X \to X\_B;\tag{6.45}$$

$$F: X \to \Lambda;\tag{6.46}$$

$$G: X \to \Lambda,\tag{6.47}$$

where $X_A = X_B$ is the set of all bases in $\mathbb{R}^3$, and Λ is some set of possible outcomes, which completely describe the experiment in the sense that each state $x \in X$ determines both its settings $(a = A(x), b = B(x))$ and its outcome $(\lambda = F(x), \gamma = G(x))$. Here $A = (A_1, A_2, A_3)$ and $B = (B_1, B_2, B_3)$, where the functions $A_i : X \to S^2$ (where $S^2$ is seen as the space of unit vectors in $\mathbb{R}^3$) combine to define a basis (similarly for $B_j : X \to S^2$), and $F = (F_1, F_2, F_3)$. *Secondly*, there exists some set $X_Z$ and an additional function $Z : X \to X_Z$ such that

$$F = F(A, B, Z);\tag{6.48}$$

$$G = G(A, B, Z),\tag{6.49}$$

in that for each *x* ∈ *X* one has the functional relationships

$$F(\mathbf{x}) = \hat{F}(A(\mathbf{x}), B(\mathbf{x}), Z(\mathbf{x}));\tag{6.50}$$

$$G(\mathbf{x}) = \hat{G}(A(\mathbf{x}), B(\mathbf{x}), Z(\mathbf{x})),\tag{6.51}$$

for certain functions $\hat{F} : X_A \times X_B \times X_Z \to \Lambda$ and $\hat{G} : X_A \times X_B \times X_Z \to \Lambda$, each of which is a triple $\hat{F} = (\hat{F}_1, \hat{F}_2, \hat{F}_3)$ with $\hat{F}_i : X_A \times X_B \times X_Z \to \mathbb{R}$, etc. The value $z = Z(x)$ is just the traditional "hidden variable" (which is often denoted by λ).

	- Λ is given by (6.22), i.e. $F_i$ and $G_j$, and hence $\hat{F}_i$ and $\hat{G}_j$, take values in $\{0,1\}$;
	- The experiment measures *squares* of angular momenta, so that

$$
\hat{F}(a',b',z) = \hat{F}(a,b,z);\tag{6.52}
$$

$$
\hat{G}(a',b',z) = \hat{G}(a,b,z),
\tag{6.53}
$$

whenever $(a',b')$ differ from $(a,b)$ by changing the sign of any basis vector;

	- *Perfect correlation* obtains, cf. (6.43), i.e., writing $a = (\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3)$ for Alice's basis and $b = (\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3)$ for Bob's, one has

$$\mathbf{u}_i = \mathbf{v}_j \Rightarrow \hat{F}_i(a, b, z) = \hat{G}_j(a, b, z). \tag{6.54}$$

We now come to the locality condition that is to replace *Non-contextuality*. This condition was first clearly stated by Bell (1964, p. 196), who attributes it to Einstein:

'The vital assumption is that the result *G* for particle 2 does not depend on the setting *a* of the magnet for particle 1, nor *F* on *b*.'

Noting various other notions of locality (such as *Einstein locality* in local quantum physics, which requires spacelike separated operators to commute, or *Bell locality*, discussed below), the above idea might be called *Context locality*, but we will simply refer to it as *Locality*. In our deterministic setting, a precise formulation is this:

• *Locality* means that $F(A,B,Z)$ is independent of $B$ and $G(A,B,Z)$ is independent of $A$. In other words, we have $F = F(A,Z)$ and $G = G(B,Z)$, so that (with slight abuse of notation) $\hat{F} : X_A \times X_Z \to \Lambda$ and $\hat{G} : X_B \times X_Z \to \Lambda$, or, then again, $F(x) = \hat{F}(A(x), Z(x))$ and $G(x) = \hat{G}(B(x), Z(x))$, for each $x \in X$.

This finally brings us to (our reformulation of) the *Free Will Theorem*:

Theorem 6.13. *Determinism, Freedom, Nature, and Locality are contradictory.*

*Proof.* The *Freedom* assumption allows us to treat $(a,b,z)$ as free variables, a fact that will tacitly be used all the time. First, taking $i = j$ in (6.54) shows that $\hat{F}_i(\mathbf{u}_1,\mathbf{u}_2,\mathbf{u}_3,z)$ only depends on $(\mathbf{u}_i,z)$, whilst $\hat{G}_j(\mathbf{v}_1,\mathbf{v}_2,\mathbf{v}_3,z)$ only depends on $(\mathbf{v}_j,z)$. Hence we write $\hat{F}_i(a,z) = \tilde{F}_i(\mathbf{u}_i,z)$, etc. Next, taking $i \neq j$ in (6.54) shows that $\tilde{F}_1(\mathbf{u},z) = \tilde{F}_2(\mathbf{u},z) = \tilde{F}_3(\mathbf{u},z)$. Consequently, the function $\hat{F} : X_A \times X_Z \to X_F$ is given by (6.32). We are now back to the proof of Corollary 6.12, concluding that such a function does not exist by Corollary 6.8. □
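The two uses of (6.54) in this proof can be unpacked as follows. This is only a sketch in the notation above; the symbol $\tilde{G}_j$ is introduced here by analogy with $\tilde{F}_i$ and is an assumption of this note. By *Locality* we may write $\hat{F}_i(a,z)$ and $\hat{G}_j(b,z)$. For the first step, let $a$ and $a'$ be two bases sharing the same $i$-th vector $\mathbf{u}_i$ and, using *Freedom*, pick any basis $b$ whose $i$-th vector is $\mathbf{v}_i = \mathbf{u}_i$. For the second step, which needs distinct indices $i \neq j$, take any unit vector $\mathbf{u}$ and bases with $\mathbf{u}_i = \mathbf{v}_j = \mathbf{u}$. Then (6.54) gives

$$\hat{F}_i(a,z) = \hat{G}_i(b,z) = \hat{F}_i(a',z) \quad\Longrightarrow\quad \hat{F}_i(a,z) = \tilde{F}_i(\mathbf{u}_i,z);$$

$$\tilde{F}_i(\mathbf{u},z) = \tilde{G}_j(\mathbf{u},z) \;\; (i \neq j) \quad\Longrightarrow\quad \tilde{F}_1 = \tilde{G}_2 = \tilde{F}_3 = \tilde{G}_1 = \tilde{F}_2.$$

The sign symmetry (6.33) of the resulting single function is inherited from (6.52).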

#### 6.3 Philosophical intermezzo: Free will in the Free Will Theorem

'The determinism-free will controversy has all of the earmarks of a dead problem. The positions are well staked out and the opponents manning them stare at each other in mutual incomprehension.' (Earman, 1986, p. 235)

The question arises which specific notion of free will is among the assumptions of the FWT (in the reformulation just given). To put this question in perspective, let us briefly recall the main point of the debate about free will. This concept has two poles. One is the "will" itself, requiring a sense of *agency*, deliberation, and control. This pole seems to require some form of determinism. A powerful expression is:

'Fürst! Was Sie sind, sind Sie durch Zufall und Geburt. Was ich bin, bin ich durch mich.'<sup>1</sup> (Beethoven, to his benefactor (!) Prince Lichnowsky)

The other pole of free will is the adjective "free", i.e., *the ability to do otherwise*, which at first sight requires indeterminism. *The problem of free will is that these poles seem contradictory*. Many authors conflate free will with moral responsibility:

'free will can be defined as the unique ability of persons to exercise control over their conduct in the manner necessary for moral responsibility.' (McKenna & Coates, 2015)

This aspect is irrelevant to our discussion, concerned as it is with the question what it would mean for Alice and Bob to choose their settings "freely" if determinism is assumed (it would have been different if one setting launched a nuclear missile). Even in our narrow context, the traditional philosophical stances are relevant:

	- Reconceptualizing "the ability to do otherwise" in a deterministic world. This will be our focus in what follows, especially in a version inspired by Lewis.
	- Belittling the relevance of "the ability to do otherwise", as e.g. by Dennett:

'So if anyone at all is interested in the question of whether one could have done otherwise in *exactly* the same circumstances (and internal state) this will have to be a particularly pure metaphysical curiosity—that is to say, a curiosity so pure as to be utterly lacking in any ulterior motive, since the answer could not conceivably make any noticeable difference to the way the world went.' (Dennett, 1984, p. 559).

	- *Libertarianism*, arguing that free will requires an indeterministic world.
	- *Hard determinism*, claiming determinism (which is assumed) blocks free will:

'Ein Mensch kann zwar tun was er will, aber nicht wollen was er will.'<sup>2</sup> (Schopenhauer)

	- *Hard incompatibilism*, asserting that 'every way you look at it you lose': free will makes no sense in either a deterministic or an indeterministic world.

<sup>1</sup> 'Lord! What you are, you are through chance and birth. What I am, I am because of myself.'

<sup>2</sup> 'One can admittedly do what one wants, but one cannot want what one wants.'

Although hard incompatibilism has our sympathy, our opening question concerning the notion of free will in the FWT drives us into the compatibilist direction, since determinism is among the assumptions shown to be contradictory by Theorem 6.13. Within compatibilism, we will be close to the well-known 'local miracle' variant thereof proposed by the philosopher David Lewis. Like other compatibilists before him (starting at least with G.E. Moore), Lewis attempts to make sense of the intuition that even in a deterministic world one in principle has the ability to act differently from the way one actually does, despite the fact that the latter was predetermined. A simple example is Alice's choosing setting *a* by moving her hand in a certain way, although she was able to choose $a'$. On the other hand, she could not have moved her hand with a speed greater than that of light, so her ability remains constrained by the laws of nature. Lewis asks us to distinguish between:


The latter is impossible, but the former is not on Lewis's own theory of counterfactuals, according to which the phrase 'if I did it' leads us to consider the possible world in which doing 'something' is actually true, whilst in the possible worlds under consideration as many other features as possible are kept the same as in the actual world (the precise underlying measure of similarity is not important here). Thus the phrase 'a law would be broken' refers to the laws of the actual world (in which the alternative action is not realized). It seems to be of great importance to Lewis that in the first case it is not the agent who would break a law; instead, it is the breaking of some law of our actual world at an earlier time that enables the subject to do in an alternative possible world what she could not do in our actual world.

By making this distinction, Lewis claims that he invalidates the seemingly lethal *Consequence Argument* against compatibilist free will, of which a simple version reads (assuming determinism, on which compatibilist free will is predicated):


Lewis claims that statement 3 is ambiguous, in that it fails to distinguish between the two senses in his two bullet points above. The Consequence Argument requires the latter (which is false), whereas this argument itself is unsound on the former (which is true). This disambiguation of assumption 3 in the Consequence Argument, then, is supposed to save (compatibilist) free will. However, a considerable philosophical literature suggests that the tension between Lewis's denying the second bullet point whilst accepting the first is pretty uncomfortable, reflecting the corresponding tension between the conjunction of determinism and freedom in general; indeed, this is what the FWT makes precise! Let us first point out that, at least in his terminology, Lewis fails to make a clear distinction between *laws of nature* and *initial states*; from the point of view of modern physics, this distinction is absolutely fundamental (although it may disappear in post-modern physics based on e.g. quantum gravity).

Lewis's examples of law-breaking events in our actual world typically refer to violations of some law of nature (like exceeding the speed of light), whereas the (alleged) law-breaking in his counterfactuals, such as choosing $a'$ (where in fact Alice did not do so), amounts to a change in some earlier state. Thus it might have been more appropriate if the paper in which Lewis laid out his version of compatibilism had been entitled *Are we free to change the states?* instead of *Are we free to break the laws?*. On this revision, his distinction of the two cases takes the following form:


The latter remains impossible, while it is the former that enables free will. Applied to Alice, the former should mean (still in the compatibilist spirit of Lewis):

• A slight alteration in the state of the actual world (which would have made it a different but very similar world according to Lewis) would have led Alice to do something (such as choosing $a'$) that she did not do in the actual world (because according to determinism its actual state at any earlier time, as opposed to the counterfactual alternative state in the discussion, led her to choose *a*).

We now make this revised version of Lewis's local miracle compatibilism mathematically precise, in a way that has the additional advantage of involving not only "the ability to do otherwise", but also the other component of free will, i.e. agency. Here the intuition is that free will involves a separation between the agent, Alice (who is to exercise it), and the rest of the world, under whose influence she acts. Namely, as in the FWT, let *X* be the state space of the Universe, and let

$$a = A(\mathbf{x})\tag{6.55}$$

again be Alice's setting, where *A* : *X* → *XA*, as before. We now assume that *a* is determined by her "inner state" *I* as well as the "outer state" *O* of the rest of the world, under whose influence she acts. These, in turn, are determined by the state *x* ∈ *X* of the world. That is, *A* = *A*(*O*,*I*), which expresses the existence of functions

$$O: X \to X\_O;\tag{6.56}$$

$$I \, : \, X \to X\_I \,, \tag{6.57}$$

$$
\hat{A}: X\_O \times X\_I \to X\_A,\tag{6.58}
$$

where *XO* and *XI* are certain sets, such that for each *x* ∈ *X* one has

$$A(\mathbf{x}) = \hat{A}(O(\mathbf{x}), I(\mathbf{x})).\tag{6.59}$$

In other words, for some given state *x* of the world we have

$$o = O(\mathbf{x});\tag{6.60}$$

$$i = I(\mathbf{x});\tag{6.61}$$

$$a = \hat{A}(o, i). \tag{6.62}$$

Note that, in the spirit of Conway and Kochen, in the above analysis Alice (whose free choice they after all believe to be ultimately a consequence of the free choice of elementary particles) now plays the role of the spin-1 particles in the bipartite experiment. Thus the analogy is between the triples:

$$(a, z, \lambda) \in X_A \times X_Z \times \Lambda;\tag{6.63}$$

$$
(o, i, a) \in X_O \times X_I \times X_A. \tag{6.64}
$$


Beyond *Determinism*, which is expressed by the above framework, our fundamental assumption underpinning compatibilist free will is *Freedom*, defined exactly as in the FWT: *O* and *I* are *independent* in that the following function is surjective:

$$O \times I: X \to X\_O \times X\_I$$

$$x \mapsto (O(x), I(x)), \tag{6.65}$$

i.e., for each pair $(o,i) \in X_O \times X_I$ there is $x \in X$ for which (6.60) and (6.61) hold.

Rephrasing our earlier analysis in this elementary mathematical language, Lewis wants to make sense of the idea that although Alice's choice (6.62) at some fixed time *t* was determined by the state *x* of the Universe at that time through (6.60) - (6.61), or, equivalently, through (6.59), and hence (and this is the whole point of the Consequence Argument Lewis challenges) by any earlier state $x_p$ of the Universe at time $t_p$, *nonetheless* Alice was "able to act otherwise" at time *t*, e.g. in choosing

$$a' = \hat{A}(o', i'),\tag{6.66}$$

but did not do so, since choosing $a'$ would illegally have changed the state $x$ to $x'$ (both at time *t*), and, equivalently (given determinism), would have changed $x_p$ to $x'_p$. On our reading of Lewis's theory of counterfactuals, Alice's ability to choose $a'$ simply means that there exists a state $x'$ of the world close to $x$ in the sense that

$$O(\mathbf{x'}) = O(\mathbf{x}) = o,\tag{6.67}$$

making the environment in which Alice acts the same as in the actual world, but

$$\mathbf{i}' = I(\mathbf{x}') \neq I(\mathbf{x}) = \mathbf{i},\tag{6.68}$$

where $i'$ should be close to $i$ in some appropriate sense (such as a slight change in the state of Alice's brain), such that (6.66) holds, with $o' = o$ as required by (6.67).

The point, then, is that according to our *Freedom* assumption, there indeed *is* such a nearby state $x'$, for any given $i'$ and $(o,i)$. Thus the freedom Alice has is precisely what we have formalized as *Freedom*: even *given* the state *o* of the causal influences on her behaviour (and possibly even the entire state of the rest of the world), there is a different admissible state $x'$ of the world such that, had this state been actual, she would have chosen $a'$ (although she in fact, necessarily, picked *a*).
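A finite toy model may help to see what the surjectivity condition (6.65) buys. The sketch below is purely illustrative: the sets `X_O`, `X_I`, the rule `A_hat` and all state labels are hypothetical stand-ins, not part of the argument above. It checks *Freedom* for a small state space and lists the counterfactual states $x'$ satisfying (6.67) and (6.68).

```python
from itertools import product

# Hypothetical toy universe: a state x is a pair (outer state, inner state).
X_O = ["calm", "noisy"]
X_I = ["left", "right"]
X = list(product(X_O, X_I))

def O(x):
    """Outer state of the world, cf. (6.56)."""
    return x[0]

def I(x):
    """Alice's inner state, cf. (6.57)."""
    return x[1]

def A_hat(o, i):
    """Hypothetical deterministic rule for Alice's setting, cf. (6.58)."""
    return "a" if i == "left" else "a_prime"

def A(x):
    """Alice's setting as a function of the state, cf. (6.59)."""
    return A_hat(O(x), I(x))

def is_free(states):
    """Freedom (6.65): O x I maps `states` onto all of X_O x X_I."""
    return {(O(x), I(x)) for x in states} == set(product(X_O, X_I))

def counterfactual_states(x, states):
    """States x' with O(x') = O(x) but I(x') != I(x), cf. (6.67)-(6.68)."""
    return [y for y in states if O(y) == O(x) and I(y) != I(x)]

actual = ("calm", "left")
print(is_free(X), A(actual), [A(y) for y in counterfactual_states(actual, X)])
```

If the admissible states are restricted so that the outer state fixes the inner one (for example only `("calm", "left")` and `("noisy", "right")`), `is_free` returns `False` and no counterfactual state exists.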

It should be clear now that at least in the context of the Free Will Theorem, our precise technical formulation of all assumptions implies that the freedom Alice and Bob have in choosing their settings is an instance of the local miracle compatibilist form of free will proposed by Lewis (1981), at least if one accepts our reformulation thereof. The theorem then establishes a contradiction between:


Accepting the former, the latter must fall. Making this choice, one should realize that the physics assumptions on the one hand just form a small corner of modern physics (from which point of view they are weak), but on the other hand have singled out the corner in which the two fundamental theories of quantum mechanics and special relativity meet and are brought to a head (from which perspective they are strong).

The challenge their theorem puts to compatibilism was recognized by Conway & Kochen (2009), who write:

'The tension between human free will and physical determinism has a long history. Long ago, Lucretius made his otherwise deterministic particles swerve unpredictably to allow for free will. It was largely the great success of deterministic classical physics that led to the adoption of determinism by so many philosophers and scientists, particularly those in fields remote from current physics. (This remark also applies to "compatibilism", a now unnecessary attempt to allow for human free will in a deterministic world.)'

This quotation does not use a precise version of compatibilism, but, as Conway explains elsewhere, what they mean is that compatibilism in whatever form was a desperate pre-twentieth-century attempt to save the notion of free will for e.g. Christianity in the face of the physics of the time, which assumed that the universe was a mechanical clockwork. Such attempts, then, would no longer be necessary if the world is, in fact, indeterministic (as Conway and Kochen claim to have at last proved). Our reformulation of their theorem (which removes the threat of circularity) gives a more subtle picture: the FWT uses modern physics to challenge one particular version of *compatibilist free will*. As such, it only provides indirect support for *libertarian free will*, namely by weakening one of its competitors.

To close this philosophical intermezzo, let us note that determinism is seen as a property of *theories*. Since it is the job of a deterministic theory to predict the outcome of any experiment, whether or not it is performed, this obviates the need for assumptions like counterfactuality in the sense that 'unperformed experiments have results' (which was famously denied by Asher Peres). Such controversial notions of counterfactuality have effectively been replaced by the considerably more refined modal counterfactuality of Lewis (at least in our slight reformulation thereof).

#### 6.4 Technical intermezzo: The GHZ-Theorem

The essence of the proof of the Free Will Theorem lies in the argument that perfect correlation together with context-locality implies non-contextuality. Remarkably, context-locality is at the same time a special case of non-contextuality, as the following example illustrates. We take $H = \mathbb{C}^2 \otimes \mathbb{C}^2$, equipped with the *Bell basis*

$$\upsilon_0 = (|01\rangle - |10\rangle)/\sqrt{2};\tag{6.69}$$

$$\upsilon_1 = (|01\rangle + |10\rangle)/\sqrt{2};\tag{6.70}$$

$$\upsilon_2 = (|00\rangle - |11\rangle)/\sqrt{2};\tag{6.71}$$

$$\upsilon_3 = (|00\rangle + |11\rangle)/\sqrt{2},\tag{6.72}$$

where we use the physicists' notation

$$|1\rangle = (1,0);\tag{6.73}$$

$$|0\rangle = (0,1);\tag{6.74}$$

$$|ij\rangle = |i\rangle \otimes |j\rangle. \tag{6.75}$$

Of course, $\mathbb{C}^2 \otimes \mathbb{C}^2 \cong \mathbb{C}^4$ contains the spin-1 Hilbert space $\mathbb{C}^3$ of the Kochen–Specker Theorem as the subspace orthogonal to the vector $\upsilon_0$. Thus we identify $\mathbb{C}^3$ with the subspace $\tilde{\mathbb{C}}^3$ of $\mathbb{C}^4$ spanned by the basis vectors $\upsilon_1, \upsilon_2, \upsilon_3$. The operators

$$J_{\mathbf{u}} = \frac{1}{2} (\sigma_{\mathbf{u}} \otimes 1_2 + 1_2 \otimes \sigma_{\mathbf{u}}),\tag{6.76}$$

where $\mathbf{u} \in \mathbb{R}^3$ is a unit vector as before, and

$$
\sigma\_{\mathbf{u}} = \sum\_{i=1}^{3} \sigma^{i} u\_{i} \tag{6.77}
$$

in terms of the Pauli matrices $\sigma^i$, map $\upsilon_0$ to zero and leave its orthogonal complement $\tilde{\mathbb{C}}^3$ stable. Elementary group theory or direct calculation then shows that the operator $J_{\mathbf{u}}$ on $\mathbb{C}^3$ in (6.11) is (unitarily) equivalent to the operator $\tilde{J}_{\mathbf{u}}$ on $\tilde{\mathbb{C}}^3$. Since

$$J\_\mathbf{u}^2 = \frac{1}{2} (\sigma\_\mathbf{u} \otimes \sigma\_\mathbf{u} + 1\_2 \otimes 1\_2),\tag{6.78}$$

the Kochen–Specker argument can be rephrased in terms of the operators $\sigma_{\mathbf{u}} \otimes \sigma_{\mathbf{u}}$. In particular, for each frame $a = (\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3)$, the three operators

$$(\sigma\_{\mathbf{u}\_1} \otimes \sigma\_{\mathbf{u}\_1}, \sigma\_{\mathbf{u}\_2} \otimes \sigma\_{\mathbf{u}\_2}, \sigma\_{\mathbf{u}\_3} \otimes \sigma\_{\mathbf{u}\_3})\tag{6.79}$$

commute, they each square to one, and their joint eigenvalues are one of the triples:

$$(-1, -1, -1), (-1, 1, 1), (1, -1, 1), (1, 1, -1).$$

The eigenvector corresponding to the first one is $\upsilon_0$, and hence the others must lie in $\tilde{\mathbb{C}}^3$. Hence by Lemma 6.4 any quasi-linear non-contextual hidden variable must also assign these values, which by Lemma 6.7 is impossible for arbitrary bases.
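The claims around (6.76)-(6.79) are easy to test numerically. The following sketch uses NumPy and the conventions (6.73)-(6.74) for the kets; the function names `sigma`, `total_spin` and `frame_eigenvalues` are hypothetical, and the trick of diagonalising one generic real combination of the three commuting operators is a choice made here for brevity.

```python
import numpy as np

SIGMA = (
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
)
ID2 = np.eye(2, dtype=complex)

def sigma(u):
    """sigma_u = sum_i u_i sigma^i, cf. (6.77)."""
    return sum(c * s for c, s in zip(u, SIGMA))

def total_spin(u):
    """J_u = (sigma_u (x) 1_2 + 1_2 (x) sigma_u) / 2, cf. (6.76)."""
    s = sigma(u)
    return 0.5 * (np.kron(s, ID2) + np.kron(ID2, s))

# Kets as in (6.73)-(6.74) and the first Bell vector (6.69).
KET1 = np.array([1, 0], dtype=complex)
KET0 = np.array([0, 1], dtype=complex)
UPSILON_0 = (np.kron(KET0, KET1) - np.kron(KET1, KET0)) / np.sqrt(2)

def frame_eigenvalues(frame):
    """Joint eigenvalue triples of the operators (6.79) for an orthonormal frame."""
    ops = [np.kron(sigma(u), sigma(u)) for u in frame]
    # The weights 1, 2, 4 separate the four joint eigenspaces of the commuting triple.
    _, vecs = np.linalg.eigh(1.0 * ops[0] + 2.0 * ops[1] + 4.0 * ops[2])
    return sorted(
        {tuple(int(round((w.conj() @ op @ w).real)) for op in ops) for w in vecs.T}
    )

u = np.array([1.0, 2.0, 2.0]) / 3.0
J = total_spin(u)
print(np.allclose(J @ UPSILON_0, 0))                       # J_u annihilates upsilon_0
print(np.allclose(J @ J, 0.5 * (np.kron(sigma(u), sigma(u)) + np.eye(4))))  # (6.78)
print(frame_eigenvalues([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))
```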

The key mathematical property of the three operators (6.79) is that they commute, and together with the unit $1_2 \otimes 1_2$ form a maximal set of commuting self-adjoint matrices on $\mathbb{C}^4$. But other such sets could have been chosen by Alice (under whose sole control the situation so far has been assumed to be), such as a triple of the kind

$$(\sigma_{\mathbf{u}} \otimes 1_2, 1_2 \otimes \sigma_{\mathbf{v}}, \sigma_{\mathbf{u}} \otimes \sigma_{\mathbf{v}}),$$

where $\mathbf{u}$ and $\mathbf{v}$ are arbitrary unit vectors in $\mathbb{R}^3$. Since the third operator is the product of the first two, the joint eigenvalues of this triple, and hence also the assignments by a quasi-linear non-contextual hidden variable, must be one of the four triples

$$(1,1,1), (-1,1,-1), (1,-1,-1), (-1,-1,1).$$

The non-contextuality assumption would then dictate that the outcome of Alice's measurement of $\sigma_{\mathbf{u}} \otimes 1_2$ be independent of her choice of the setting $\mathbf{v}$ in a possible simultaneous measurement of $1_2 \otimes \sigma_{\mathbf{v}}$, and *vice versa*. Therefore, in a (non-local) bipartite setting where Alice is only able to measure operators of the type $a \otimes 1_2$, whilst Bob can measure $1_2 \otimes b$, on the above choice of (commuting) operators, *non-contextuality in the situation where Alice controls everything is mathematically equivalent to (context) locality in the bipartite Alice & Bob setting*.
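The structure of the alternative triple $(\sigma_{\mathbf{u}} \otimes 1_2, 1_2 \otimes \sigma_{\mathbf{v}}, \sigma_{\mathbf{u}} \otimes \sigma_{\mathbf{v}})$ can be confirmed in the same spirit. This is a hedged sketch with hypothetical helper names (`local_triple`, `commute`, `joint_eigenvalues`); it checks that the three operators commute, that the third is the product of the first two, and that only the four triples listed above occur as joint eigenvalues.

```python
import numpy as np

PAULI = (
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
)
ID2 = np.eye(2, dtype=complex)

def sigma(u):
    """sigma_u for the direction of u (normalised here), cf. (6.77)."""
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u)
    return sum(c * s for c, s in zip(u, PAULI))

def local_triple(u, v):
    """(sigma_u (x) 1_2, 1_2 (x) sigma_v, sigma_u (x) sigma_v)."""
    su, sv = sigma(u), sigma(v)
    return np.kron(su, ID2), np.kron(ID2, sv), np.kron(su, sv)

def commute(a, b):
    return np.allclose(a @ b, b @ a)

def joint_eigenvalues(ops):
    """Joint eigenvalue triples of three commuting operators squaring to one."""
    # The weights 1, 2, 4 give every admissible triple a distinct eigenvalue.
    mix = sum(w * op for w, op in zip((1.0, 2.0, 4.0), ops))
    _, vecs = np.linalg.eigh(mix)
    return sorted(
        {tuple(int(round((w.conj() @ op @ w).real)) for op in ops) for w in vecs.T}
    )

ops = local_triple([1, 2, 2], [0, 3, 4])
print(all(commute(a, b) for a in ops for b in ops))
print(np.allclose(ops[0] @ ops[1], ops[2]))
print(joint_eigenvalues(ops))
```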

Further constraints then arise if the system is prepared in a correlated state like $\upsilon_0$, which is an eigenstate of $\sigma_{\mathbf{u}} \otimes \sigma_{\mathbf{v}}$ with eigenvalue −1 whenever $\mathbf{u} = \mathbf{v}$. So in that case the values of $(\sigma_{\mathbf{u}} \otimes 1_2, 1_2 \otimes \sigma_{\mathbf{v}})$ can only be $(1,-1)$ or $(-1,1)$, yielding perfect anti-correlation. This is not enough, however, to derive a Free Will Theorem; to do so with the small single-site Hilbert space $\mathbb{C}^2$, one needs a third (non-local) party.

Indeed, the well-known tripartite GHZ-argument may be rephrased as a Free Will Theorem, as follows. The underlying Hilbert space is

$$H = \mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2 \cong \mathbb{C}^8,\tag{6.80}$$

and hence as a warm-up we first (re)prove Theorem 6.5 for $n = 8$. Suppose we have a map $V : H_8(\mathbb{C}) \to \mathbb{R}$ as in Definition 6.1. Write

$$
\begin{aligned}
\lambda_1^{(a)} &= V(\sigma_a \otimes 1_2 \otimes 1_2), \\
\lambda_2^{(b)} &= V(1_2 \otimes \sigma_b \otimes 1_2), \\
\lambda_3^{(c)} &= V(1_2 \otimes 1_2 \otimes \sigma_c),
\end{aligned}
$$

where *a*,*b*, *c* can be 1,2,3. From Lemma 6.4 we then have

$$V(\sigma\_1 \otimes \sigma\_2 \otimes \sigma\_2) = \lambda\_1^{(1)} \lambda\_2^{(2)} \lambda\_3^{(2)};\tag{6.81}$$

$$V(\sigma_2 \otimes \sigma_1 \otimes \sigma_2) = \lambda_1^{(2)} \lambda_2^{(1)} \lambda_3^{(2)};\tag{6.82}$$

$$V(\sigma\_2 \otimes \sigma\_2 \otimes \sigma\_1) = \lambda\_1^{(2)} \lambda\_2^{(2)} \lambda\_3^{(1)};\tag{6.83}$$

$$V(\sigma\_1 \otimes \sigma\_1 \otimes \sigma\_1) = \lambda\_1^{(1)} \lambda\_2^{(1)} \lambda\_3^{(1)}.\tag{6.84}$$

Furthermore, the four operators on the left-hand side commute and turn out to satisfy

$$
\sigma_1 \otimes \sigma_2 \otimes \sigma_2 \cdot \sigma_2 \otimes \sigma_1 \otimes \sigma_2 \cdot \sigma_2 \otimes \sigma_2 \otimes \sigma_1 = -\sigma_1 \otimes \sigma_1 \otimes \sigma_1,\tag{6.85}
$$

so that again by Lemma 6.4,

$$
\lambda\_1^{(1)}\lambda\_2^{(2)}\lambda\_3^{(2)}\cdot\lambda\_1^{(2)}\lambda\_2^{(1)}\lambda\_3^{(2)}\cdot\lambda\_1^{(2)}\lambda\_2^{(2)}\lambda\_3^{(1)} = -\lambda\_1^{(1)}\lambda\_2^{(1)}\lambda\_3^{(1)},\tag{6.86}
$$

i.e. $(\lambda_1^{(2)} \lambda_2^{(2)} \lambda_3^{(2)})^2 = -1$. Since $\lambda_j^{(i)} = \pm 1$, this is impossible, so that $V$ cannot exist. Now, using the notation in the preceding discussion, consider the unit vector

$$
\Psi\_{GHZ} = (|111\rangle - |000\rangle)/\sqrt{2},\tag{6.87}
$$

which is a joint eigenstate of each of the four operators on the left-hand side of (6.81) - (6.84), with eigenvalue +1 for the first three, and hence eigenvalue −1 for the fourth, i.e., $\sigma_1 \otimes \sigma_1 \otimes \sigma_1$. So if setting $A = a$ for Alice (where $a \in \{1,2\}$) means that she measures $F = \sigma_a \otimes 1_2 \otimes 1_2$ with outcome $\lambda_1^{(a)} = \pm 1$, and similarly $B = b$ for Bob and $C = c$ for Cindy mean that they measure $G = 1_2 \otimes \sigma_b \otimes 1_2$ and $H = 1_2 \otimes 1_2 \otimes \sigma_c$ with outcomes $\lambda_2^{(b)} = \pm 1$ and $\lambda_3^{(c)} = \pm 1$, respectively, then in the state $\Psi_{GHZ}$ each of the settings gives the correlation

$$\text{settings } (a,b,c) = (1,2,2), (2,1,2), (2,2,1) \Rightarrow \lambda_1^{(a)}\lambda_2^{(b)}\lambda_3^{(c)} = 1; \tag{6.88}$$

$$\text{setting } (a,b,c) = (1,1,1) \Rightarrow \lambda_1^{(a)}\lambda_2^{(b)}\lambda_3^{(c)} = -1. \tag{6.89}$$
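Both the operator identity (6.85) and the eigenvalues behind (6.88)-(6.89) can be verified directly. The sketch below assumes the ket conventions (6.73)-(6.74); the names `tensor`, `OPS` and `ghz_eigenvalue` are hypothetical.

```python
import numpy as np
from functools import reduce

S1 = np.array([[0, 1], [1, 0]], dtype=complex)     # sigma_1
S2 = np.array([[0, -1j], [1j, 0]], dtype=complex)  # sigma_2

def tensor(*factors):
    """Kronecker product of several factors, left to right."""
    return reduce(np.kron, factors)

# Kets as in (6.73)-(6.74) and the GHZ vector (6.87).
KET1 = np.array([1, 0], dtype=complex)
KET0 = np.array([0, 1], dtype=complex)
PSI_GHZ = (tensor(KET1, KET1, KET1) - tensor(KET0, KET0, KET0)) / np.sqrt(2)

# The four operators on the left-hand sides of (6.81)-(6.84), keyed by setting.
OPS = {
    (1, 2, 2): tensor(S1, S2, S2),
    (2, 1, 2): tensor(S2, S1, S2),
    (2, 2, 1): tensor(S2, S2, S1),
    (1, 1, 1): tensor(S1, S1, S1),
}

def ghz_eigenvalue(setting):
    """Eigenvalue of the operator for `setting` on PSI_GHZ (raises if not an eigenvector)."""
    op = OPS[setting]
    value = PSI_GHZ.conj() @ op @ PSI_GHZ
    if not np.allclose(op @ PSI_GHZ, value * PSI_GHZ):
        raise ValueError("PSI_GHZ is not an eigenvector for this setting")
    return int(round(value.real))

# Operator identity (6.85).
print(np.allclose(OPS[(1, 2, 2)] @ OPS[(2, 1, 2)] @ OPS[(2, 2, 1)], -OPS[(1, 1, 1)]))
print({setting: ghz_eigenvalue(setting) for setting in OPS})
```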

Theorem 6.14. *The conjunction of the following assumptions is contradictory:*

• Determinism*: there is a state space X with associated functions*

$$A, B, C: X \to \{1, 2\}, \quad F, G, H: X \to \Lambda,$$

*which completely describes the experiment, in that* $x \in X$ *determines both settings* $(a,b,c)$ *and outcomes* $(\lambda\_1,\lambda\_2,\lambda\_3) \in \Lambda^3$ *through* $a = A(x)$*,* $\lambda\_1 = F(x)$*, etc.*

• Nature*: the outcomes satisfy the correlations* (6.88) - (6.89)*.*

• Freedom*: there is a further function Z on X such that*


$$F = F(A, B, C, Z), \; G = G(A, B, C, Z), \; H = H(A, B, C, Z),$$

*and A, B, C, Z are independent, i.e. for each* (*a*,*b*, *c*,*z*) *there is x* ∈ *X such that*

$$A(\mathbf{x}) = a, \; B(\mathbf{x}) = b, \; C(\mathbf{x}) = c, \; Z(\mathbf{x}) = z.$$

• Locality*: F* = *F*(*A*,*Z*)*, G* = *G*(*B*,*Z*)*, and H* = *H*(*C*,*Z*)*.*

*Proof.* Using notation as in the proof of Theorem 6.13, for fixed *z* ∈ *Z* we obtain $\hat{F}(a,z) = \lambda\_1^{(a)}$ etc. *Nature* then leads to the contradiction derived after (6.86). □

#### 6.5 Bell's theorems

Two different results are known as "Bell's Theorem": the first, from his paper in 1964, is Theorem 6.15 below, and the second, dating from 1976, is Theorem 6.18. The first is similar to the Free Will Theorem in both its assumptions and its conclusion, and to make this similarity more obvious we first state it for $\mathbb{C}^3$ instead of $\mathbb{C}^2$. The difference lies in the probabilistic flavour of Bell's Theorem, whose empirical input is not given by the only non-probabilistic consequence to be drawn from the quantum-mechanical formulae (6.35) - (6.38), viz. the certainty (6.43) of perfect correlation on identical settings, but rather by the probabilistic formula (6.40), i.e.,

$$P\_{\Psi\_0}(F\_i \neq G\_j | A\_i = \mathbf{u}\_i, B\_j = \mathbf{v}\_j) = \frac{2}{3} \sin^2 \theta\_{\mathbf{u}\_i, \mathbf{v}\_j} \ (i, j = 1, 2, 3), \tag{6.90}$$

where $\theta\_{\mathbf{u},\mathbf{v}}$ is the angle between two unit vectors $\mathbf{u}$ and $\mathbf{v}$. Furthermore, the state space *X* must be upgraded to a probability space (*X*,Σ,μ), carrying functions *A* and *B* (for the settings, which we include among the random variables, unlike Bell himself, who treated them as labels), *F* and *G* (for the outcomes) and finally *Z* (for the hidden variable traditionally called λ) as random variables, i.e., measurable functions. This also implies that the target spaces $X\_A$ to $X\_Z$ (the latter traditionally being called Λ) must be equipped with some σ-algebra of measurable subsets. But this is not a big deal, since $X\_A = X\_B$ carries a natural Borel structure and $X\_F = X\_G$ is finite. The probability measure μ is assumed independent of (*A*,*B*,*F*,*G*), and *vice versa*.

The measure μ, which gives the "hidden state" of the system that allegedly underlies its quantum-mechanical description, is chosen in such a way that empirical probabilities (typically obtained from long runs of repeated measurements) are recovered as joint conditional probabilities defined by μ and the random variables, i.e., assuming the settings (*a*,*b*) are possible in that *P*(*A* = *a*,*B* = *b*) > 0, we put

$$P(F=\lambda, G=\gamma|A=a, B=b) = \frac{P(F=\lambda, G=\gamma, A=a, B=b)}{P(A=a, B=b)},\tag{6.91}$$

where the joint probabilities on the right-hand side are given by

$$P(A=a, B=b) = \mu(A=a, B=b);\tag{6.92}$$

$$P(F=\lambda, G=\gamma, A=a, B=b) = \mu(F=\lambda, G=\gamma, A=a, B=b), \tag{6.93}$$

where μ(*A* = *a*,*B* = *b*) is shorthand for μ({*x* ∈ *X* | *A*(*x*) = *a*,*B*(*x*) = *b*}), etc. This implies that μ depends on (but may not be determined by) the quantum state $\psi\_0$.

On this understanding, the assumptions of *Determinism* and *Locality* are the same as for the Free Will Theorem (except that equations like $F(x) = \hat{F}(A(x),Z(x))$ are merely supposed to hold almost everywhere with respect to μ). *Freedom* is now taken to mean that (*A*,*B*,*Z*) are *probabilistically independent* relative to μ. By definition, this also means that the pairs (*A*,*B*), (*A*,*Z*), and (*B*,*Z*) are independent, so that for any $\mathbb{A} \subset X\_A$, $\mathbb{B} \subset X\_B$, and (measurable) $\mathbb{Z} \subset X\_Z$, defining

$$P(A \in \mathbb{A}, B \in \mathbb{B}, Z \in \mathbb{Z}) = \mu(\{x \in X \mid A(x) \in \mathbb{A}, B(x) \in \mathbb{B}, Z(x) \in \mathbb{Z}\}),\tag{6.94}$$

and analogous expressions for $P(A \in \mathbb{A})$ and $P(A \in \mathbb{A}, B \in \mathbb{B})$, etc., we have

$$P(A \in \mathbb{A}, B \in \mathbb{B}) = P(A \in \mathbb{A})P(B \in \mathbb{B});\tag{6.95}$$

$$P(A \in \mathbb{A}, Z \in \mathbb{Z}) = P(A \in \mathbb{A})P(Z \in \mathbb{Z});\tag{6.96}$$

$$P(B \in \mathbb{B}, Z \in \mathbb{Z}) = P(B \in \mathbb{B})P(Z \in \mathbb{Z});\tag{6.97}$$

$$P(A \in \mathbb{A}, B \in \mathbb{B}, Z \in \mathbb{Z}) = P(A \in \mathbb{A})P(B \in \mathbb{B})P(Z \in \mathbb{Z}).\tag{6.98}$$
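To see how the definitions (6.91) - (6.98) operate in practice, here is a small sketch on a finite probability space. Everything in it is a hypothetical toy model chosen for illustration (two settings per wing, a three-valued hidden variable, the uniform measure, and arbitrary outcome functions `F` and `G`); it is not meant to reproduce any quantum-mechanical prediction.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite model: a point x = (a, b, z) fixes both settings and the hidden variable.
SETTINGS = (0, 1)
HIDDEN = (0, 1, 2)
X = list(product(SETTINGS, SETTINGS, HIDDEN))
MU = {x: Fraction(1, len(X)) for x in X}  # uniform measure on X


def A(x):
    return x[0]


def B(x):
    return x[1]


def Z(x):
    return x[2]


def F(x):
    return (x[0] + x[2]) % 2  # depends on (a, z) only


def G(x):
    return (x[1] * x[2]) % 2  # depends on (b, z) only


def mu(event):
    """mu({x in X | event(x)})."""
    return sum((MU[x] for x in X if event(x)), Fraction(0))


def conditional(lam, gam, a, b):
    """P(F=lam, G=gam | A=a, B=b), following (6.91) - (6.93)."""
    joint = mu(lambda x: F(x) == lam and G(x) == gam and A(x) == a and B(x) == b)
    return joint / mu(lambda x: A(x) == a and B(x) == b)


def freedom_holds():
    """Check the factorization (6.98) on all singleton events (a, b, z)."""
    return all(
        mu(lambda x: (A(x), B(x), Z(x)) == (a, b, z))
        == mu(lambda x: A(x) == a) * mu(lambda x: B(x) == b) * mu(lambda x: Z(x) == z)
        for a, b, z in X
    )


if __name__ == "__main__":
    print(freedom_holds())
    print(conditional(0, 0, 0, 1), conditional(1, 1, 0, 1))
```

On a finite space it suffices to test (6.98) on singletons, since both sides are additive in each of the three events.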

If we finally define *Nature* as the claim that $\hat{F}$ and $\hat{G}$ are 2-valued and that

$$P(F\_i \neq G\_j | A\_i = \mathbf{u}\_i, B\_j = \mathbf{v}\_j) = \frac{2}{3} \sin^2 \theta\_{\mathbf{u}\_i, \mathbf{v}\_j} \ (i, j = 1, 2, 3), \tag{6.99}$$

where the left-hand side is the *conditional* probability defined by μ and the random variables in question (whereas the left-hand side of (6.90) is the *empirical* probability for the experiment in question, or, equivalently, the quantum-mechanical prediction thereof), then we obtain the following spin-1 version of *Bell's first theorem*:

#### Theorem 6.15. *Determinism, Freedom, Nature, and Locality are contradictory.*

This formulation is literally the same as Theorem 6.13, but the terms have acquired a different technical meaning now, especially *Freedom* and *Nature*. Moreover, purists would add *Probability Theory* as an assumption in Bell's Theorem, as its formalism is decidedly non-tautological and its interpretation is far from obvious, even in a classical setting. In any case, the proof is practically the same as in the more familiar optical version of the EPR-experiment, to which we now turn.

In the classical (sic) form of the experiment, Alice and Bob perform measurements on incoming photons by letting them pass through a polaroid glass whose axis of polarization makes angle *a* (Alice) or *b* (Bob) with (say) the horizontal axis in the plane orthogonal to the direction of propagation of the photons. Considered in the light of the previous experiment on spin-1 particles, such a choice of settings may also be seen as a choice of basis for $\mathbb{R}^3$, with the proviso that, assuming (by convention) the photons move along the *y*-axis, one basis element $\mathbf{u}\_2 = (0,1,0)$ is fixed so that the remaining two vectors $(\mathbf{u}\_1,\mathbf{u}\_3)$ must lie in the *x*-*z* plane (in which, on a naive picture, the photons may "vibrate"). This constraint gives rise to bases

$$\mathbf{u}\_1 = (\cos a, 0, \sin a), \mathbf{u}\_2 = (0, 1, 0), \mathbf{u}\_3 = (-\sin a, 0, \cos a), \tag{6.100}$$

the first of which (say) gives the actual direction of the axis of polarization. In any case, Alice writes down *F* = 1 if her photon passes her glass at angle *a*, and *F* = 0 if it does not; similarly Bob writes *G* = 1 (pass) or *G* = 0 (fail) at setting *b*.

In a quantum-mechanical description of the experiment, the Hilbert space of the photon pair is $\mathbb{C}^2 \otimes \mathbb{C}^2$, and the correlated photon state is taken to be

$$
\Psi\_0 = (\mathbf{e}\_1 \otimes \mathbf{e}\_1 + \mathbf{e}\_2 \otimes \mathbf{e}\_2) / \sqrt{2},\tag{6.101}
$$

where $\mathbf{e}\_1 = (1,0)$ and $\mathbf{e}\_2 = (0,1)$ form the standard basis of $\mathbb{C}^2$. The probabilities (6.35) - (6.38) as predicted by quantum mechanics are now replaced by

$$P\_{\Psi\_0}(F=1, G=1|A=a, B=b) = \frac{1}{2}\cos^2(a-b);\tag{6.102}$$

$$P\_{\Psi\_0}(F=0, G=0 | A=a, B=b) = \frac{1}{2}\cos^2(a-b);\tag{6.103}$$

$$P\_{\Psi\_0}(F=1, G=0 | A=a, B=b) = \frac{1}{2}\sin^2(a-b);\tag{6.104}$$

$$P\_{\Psi\_0}(F=0, G=1 | A=a, B=b) = \frac{1}{2}\sin^2(a-b), \tag{6.105}$$

which are also the experimentally measured ones. Instead of (6.90) we then obtain

$$P\_{\Psi\_0}(F \neq G | A = a, B = b) = \sin^2(a - b);\tag{6.106}$$

$$P\_{\Psi\_0}(F=G|A=a, B=b) = \cos^2(a-b). \tag{6.107}$$
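The probabilities (6.102) - (6.107) can be reproduced from the state (6.101) in a few lines. The sketch below rests on a modelling assumption that is ours rather than the text's: a polarizer at angle *a* is taken to project onto the real unit vector (cos *a*, sin *a*) for the outcome "pass" and onto the orthogonal direction for "fail". The function names are illustrative.

```python
import math


def polarizer_vector(angle):
    """Unit vector in C^2 (real here) onto which a polarizer at `angle` projects."""
    return (math.cos(angle), math.sin(angle))


def joint_probability(lam, gam, a, b):
    """P(F=lam, G=gam | A=a, B=b) in the state (e1 (x) e1 + e2 (x) e2)/sqrt(2).

    Outcome 1 ("pass") projects on the polarizer axis, outcome 0 ("fail")
    on the orthogonal axis (assumed measurement model).
    """
    u = polarizer_vector(a if lam == 1 else a + math.pi / 2)
    v = polarizer_vector(b if gam == 1 else b + math.pi / 2)
    # Inner product <u (x) v, psi_0>; only the e1(x)e1 and e2(x)e2 components contribute.
    amplitude = (u[0] * v[0] + u[1] * v[1]) / math.sqrt(2)
    return amplitude ** 2


def mismatch_probability(a, b):
    """P(F != G | A=a, B=b), to be compared with (6.106)."""
    return joint_probability(1, 0, a, b) + joint_probability(0, 1, a, b)


if __name__ == "__main__":
    a, b = 0.3, 1.1
    print(joint_probability(1, 1, a, b), 0.5 * math.cos(a - b) ** 2)
    print(mismatch_probability(a, b), math.sin(a - b) ** 2)
```

Each pair of printed numbers should agree up to rounding, which is all the sketch is meant to show.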

In particular, if their settings are the same (i.e., *a* = *b*), then Alice and Bob will always find the same outcome (*perfect correlation*), whereas in case they are orthogonal (i.e., *a* = *b* ± π/2), they obtain *perfect anti-correlation*, in that Alice's photon passes whenever Bob's is blocked, and *vice versa*. However, this will not be used. Although it should be obvious from the previous case what the assumptions in Theorem 6.15 mean for this particular experiment, we make them explicit:

• *Determinism* means that there is a probability space (*X*,Σ,μ) with associated (measurable) functions

$$A: X \to [0, \pi], B: X \to [0, \pi], F: X \to \{0, 1\}, G: X \to \{0, 1\}, \qquad (6.108)$$

which completely describe the experiment in the sense that *x* ∈ *X* determines *both* its settings *a* = *A*(*x*),*b* = *B*(*x*) *and* its outcomes λ = *F*(*x*), γ = *G*(*x*).

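• *Freedom* means that there is a further random variable *Z* on *X* such that:

	- *F* = *F*(*A*,*B*,*Z*) and *G* = *G*(*A*,*B*,*Z*);
	- (*A*,*B*,*Z*) are probabilistically independent relative to μ.

• *Nature* means that

$$P(F \neq G | A = a, B = b) = \sin^2(a - b). \tag{6.109}$$

• *Locality* means that *F* = *F*(*A*,*Z*) and *G* = *G*(*B*,*Z*).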

Theorem 6.15 then holds *verbatim* for this situation, with the following proof.

*Proof. Determinism* and *Freedom* imply

$$P(F=\lambda, G=\gamma|A=a, B=b) = P\_{ABZ}(\hat{F}=\lambda, \hat{G}=\gamma|\hat{A}=a, \hat{B}=b),\tag{6.110}$$

where we use the notation (6.50) - (6.51), the function $\hat{A}: X\_A \times X\_B \times X\_Z \to X\_A$ is projection on the first coordinate, likewise the function $\hat{B}: X\_A \times X\_B \times X\_Z \to X\_B$ is projection on the second, and $P\_{ABZ}$ is the joint probability on $X\_A \times X\_B \times X\_Z$ induced by the triple (*A*,*B*,*Z*) and the probability measure μ; by independence, $P\_{ABZ}$ is a product measure on $X\_A \times X\_B \times X\_Z$. According to *Locality*, $\hat{F}(a,b,z)$ does not depend on *b*, whilst $\hat{G}(a,b,z)$ does not depend on *a*.

For fixed settings (*a*,*b*), we may therefore define the following functions on $X\_Z$:

$$
\hat{F}\_a(z) = \hat{F}(a, z);\tag{6.111}
$$

$$
\hat{G}\_b(z) = \hat{G}(b, z). \tag{6.112}
$$

A brief computation then yields

$$P\_{\rm ABZ}(\hat{F} = \lambda, \hat{G} = \gamma | \hat{A} = a, \hat{B} = b) = P\_{\rm Z}(\hat{F}\_a = \lambda, \hat{G}\_b = \gamma),\tag{6.113}$$

where $P\_Z$ is the joint probability on $X\_Z$ defined by *Z* and μ. Therefore, from (6.110),

$$P(F=\lambda, G=\gamma|A=a, B=b) = P\_Z(\hat{F}\_a = \lambda, \hat{G}\_b = \gamma). \tag{6.114}$$

*Nature* then gives the crucial result

$$P\_Z(\hat{F}\_a \neq \hat{G}\_b) = \sin^2(a-b). \tag{6.115}$$

Lemma 6.16. *Any four* {0,1}*-valued random variables* $(F\_1,F\_2,G\_1,G\_2)$ *satisfy*

$$P(F\_1 \neq G\_1) \le P(F\_1 \neq G\_2) + P(F\_2 \neq G\_1) + P(F\_2 \neq G\_2). \tag{6.116}$$

This lemma (said to go back to Boole) is very easy to prove directly, but for completeness's sake we mention that it also follows from Proposition 6.17 below.

Taking $F\_1 = \hat{F}\_{a\_1}$, $F\_2 = \hat{F}\_{a\_2}$, $G\_1 = \hat{G}\_{b\_1}$, $G\_2 = \hat{G}\_{b\_2}$, and $P = P\_Z$, for suitable values of $(a\_1,a\_2,b\_1,b\_2)$ this inequality is violated by (6.115). Take, for example, $a\_2 = b\_2 = 3x$, $a\_1 = 0$, and $b\_1 = x$. The inequality (6.116) then assumes the form *f*(*x*) ≥ 0 for

$$f(x) = \sin^2(3x) + \sin^2(2x) - \sin^2(x).$$

But in fact, *f*(*x*) < 0 for continuously many values of *x* ∈ [0,2π] (for instance, for all π/3 < *x* < π/2), see plot. □

*Graph of* $x \mapsto \sin^2(3x) + \sin^2(2x) - \sin^2(x)$*, showing (in the region where it is negative) that quantum mechanics violates the Bell inequality* (6.116)*.*
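In lieu of the plot, the sign of *f* can be inspected numerically. The following sketch simply evaluates *f* on an equally spaced grid in [0,2π]; the grid size and the function names are arbitrary choices of ours.

```python
import math


def f(x):
    """f(x) = sin^2(3x) + sin^2(2x) - sin^2(x); (6.116) would require f(x) >= 0."""
    return math.sin(3 * x) ** 2 + math.sin(2 * x) ** 2 - math.sin(x) ** 2


def violating_grid_points(n=1000):
    """Grid points x in [0, 2*pi] at which f(x) < 0."""
    grid = [2 * math.pi * k / n for k in range(n + 1)]
    return [x for x in grid if f(x) < 0]


violations = violating_grid_points()
worst = min(violations, key=f)

if __name__ == "__main__":
    print(f"{len(violations)} of 1001 grid points violate the inequality")
    print(f"most negative value on the grid: f({worst:.3f}) = {f(worst):.3f}")
```

The negative region is consistent with the elementary identity *f*(*x*) = 4 sin(3*x*) sin(*x*) cos²(*x*), which shows that on [0,π] the function is negative exactly where sin(3*x*) < 0 and cos(*x*) ≠ 0.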

Lemma 6.16 is a special case of a more general result.

Proposition 6.17. *Let* $F\_i : X \to [-1,1]$ *and* $G\_j : X \to [-1,1]$*, where* (*X*,Σ,μ) *is some probability space, be two parametrized random variables, i*, *j* = 1,2*. Then the* two-point function $\langle F\_i G\_j \rangle = \int\_X d\mu \, F\_i G\_j$ *satisfies the* CHSH-inequality

$$|\langle F\_1 G\_1 \rangle + \langle F\_1 G\_2 \rangle + \langle F\_2 G\_1 \rangle - \langle F\_2 G\_2 \rangle| \le 2. \tag{6.117}$$

If $F\_i$ and $G\_j$ just take the values ±1, then (6.116) is a special case of (6.117).

*Proof.* In terms of the function $\Phi = F\_1 \cdot (G\_1 + G\_2) + F\_2 \cdot (G\_1 - G\_2)$, we may write

$$
\langle F\_1 G\_1 \rangle + \langle F\_1 G\_2 \rangle + \langle F\_2 G\_1 \rangle - \langle F\_2 G\_2 \rangle = \int\_X d\mu \, \Phi. \tag{6.118}
$$

Since $|F\_i(x)| \le 1$ and $|G\_j(x)| \le 1$ by assumption, we have $|\Phi(x)| \le |G\_1(x) + G\_2(x)| + |G\_1(x) - G\_2(x)| = 2\max(|G\_1(x)|, |G\_2(x)|) \le 2$ and hence

$$\left| \int\_{X} d\mu(\mathbf{x}) \,\Phi(\mathbf{x}) \right| \le \int\_{X} d\mu(\mathbf{x}) \, |\Phi(\mathbf{x})| \le 2,\tag{6.119}$$

since μ is a probability measure. To prove the last claim, we just note that

$$\begin{aligned} P(F\_i = G\_j) - P(F\_i \neq G\_j) &= \langle F\_i G\_j \rangle; \\ P(F\_i = G\_j) + P(F\_i \neq G\_j) &= 1. \end{aligned}$$
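Both Lemma 6.16 and the bound |Φ(*x*)| ≤ 2 used above are pointwise statements, so for two-valued variables they can be verified by exhausting the sixteen possible value assignments; taking expectations then yields (6.116) and (6.117). The following sketch does exactly that (the function names are ours).

```python
from itertools import product


def boole_holds_pointwise():
    """Check 1[f1 != g1] <= 1[f1 != g2] + 1[f2 != g1] + 1[f2 != g2] on {0,1}^4.

    Integrating this pointwise inequality against any probability measure
    gives (6.116).
    """
    return all(
        (f1 != g1) <= (f1 != g2) + (f2 != g1) + (f2 != g2)
        for f1, f2, g1, g2 in product((0, 1), repeat=4)
    )


def chsh_pointwise_values():
    """Values taken by Phi = f1*(g1 + g2) + f2*(g1 - g2) on {-1, +1}^4."""
    return sorted({
        f1 * (g1 + g2) + f2 * (g1 - g2)
        for f1, f2, g1, g2 in product((1, -1), repeat=4)
    })


if __name__ == "__main__":
    print(boole_holds_pointwise())
    print(chsh_pointwise_values())
```

For ±1-valued variables one of *G*1 + *G*2 and *G*1 − *G*2 vanishes while the other equals ±2, which is why Φ only takes the two extreme values.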

In Bell's second (1976) theorem on *stochastic hidden variables*, the assumption of *Determinism* is dropped, and all we have is a theory stating conditional probabilities *P*(*F* = λ,*G* = γ|*A* = *a*,*B* = *b*, *x*) for the outcomes of the above bipartite experiment given some hidden variable *x*, as well as the single-wing versions *P*(*F* = λ|*A* = *a*, *x*) and *P*(*G* = γ|*B* = *b*, *x*). Here *F*,*G*,*A*,*B* are just notational devices to record such outcomes, *which are no longer (necessarily) represented as random variables*. On this new understanding of the notation, the *Nature* assumption is formulated just as before, cf. (6.109). We do assume the existence of a probability space (*X*,Σ,μ) and of conditional probabilities

$$P(F=\lambda, G=\gamma|A=a, B=b, \mathbf{x}), \ P(F=\lambda|A=a, \mathbf{x}), \ P(G=\gamma|B=b, \mathbf{x}),$$

defined μ-a.e. in *x*, in which the state of the world is specified as being *x* ∈ *X*. In terms of this space, the *Freedom* assumption means that

$$P(F=\lambda, G=\gamma|A=a, B=b) = \int\_X d\mu(\mathbf{x}) P(F=\lambda, G=\gamma|A=a, B=b, \mathbf{x}),\tag{6.120}$$

for any settings (*a*,*b*), of which μ is independent (as the notation already indicated).

The crucial assumption replacing *Determinism* is *Bell locality*, which reads

$$P(F=\lambda, G=\gamma|A=a, B=b, x) = P(F=\lambda|A=a, x) \cdot P(G=\gamma|B=b, x). \tag{6.121}$$

*Bell's second theorem* for stochastic hidden variable theories reads as follows.

#### Theorem 6.18. *Nature, Freedom, and Bell locality are contradictory.*

*Proof.* The idea of the proof is to introduce an artificial probability space in order to recover the framework of Theorem 6.15. To this end, we take

$$
\tilde{X} = [0, 1] \times [0, 1] \times X;\tag{6.122}
$$

$$d\tilde{\mu}(s, t, x) = ds \cdot dt \cdot d\mu(x),\tag{6.123}$$

where we denoted the elements of $\tilde{X}$ by (*s*,*t*, *x*). On $\tilde{X}$, define random variables

$$\tilde{F}\_a(s, t, x) = \mathbf{1}\_{[0, P(F=1|A=a, x)]}(s);\tag{6.124}$$

$$\tilde{G}\_b(\mathbf{s}, t, \mathbf{x}) = \mathbf{1}\_{[0, P(G=1|B=b, \mathbf{x})]}(t),\tag{6.125}$$

where $\mathbf{1}\_\Delta$ is the indicator function for Δ ⊆ [0,1]. Writing, as usual,

$$\tilde{P}(\tilde{F}\_a = \lambda, \tilde{G}\_b = \gamma) = \tilde{\mu}\left(\left\{ (s, t, x) \in \tilde{X} \mid \tilde{F}\_a(s, t, x) = \lambda, \tilde{G}\_b(s, t, x) = \gamma \right\}\right),$$

we obtain (first for λ = γ = 1, from which the other cases follow):

$$\tilde{P}(\tilde{F}\_a = \lambda, \tilde{G}\_b = \gamma) = \int\_X d\mu(x) \, P(F = \lambda | A = a, x) \cdot P(G = \gamma | B = b, x). \tag{6.126}$$

With *Freedom* and *Bell locality*, this yields

$$P(F=\lambda, G=\gamma | A=a, B=b) = \tilde{P}(\tilde{F}\_a = \lambda, \tilde{G}\_b = \gamma),\tag{6.127}$$

as in (6.114), so that the proof may be completed as for Theorem 6.15. □
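The construction (6.122) - (6.126) is easy to simulate: one draws *x* from μ and two independent uniform numbers *s*, *t*, and reads off the indicator variables. The sketch below compares the resulting frequencies with the right-hand side of (6.126). The three-point space and the numerical values of the conditional probabilities are hypothetical, chosen only for illustration, and a single pair of settings (*a*,*b*) is held fixed.

```python
import random

# Hypothetical stochastic hidden-variable model on a three-point space X.
MU = {"x1": 0.5, "x2": 0.3, "x3": 0.2}
P_F1 = {"x1": 0.9, "x2": 0.4, "x3": 0.1}  # P(F=1 | A=a, x) for one fixed setting a
P_G1 = {"x1": 0.2, "x2": 0.7, "x3": 0.6}  # P(G=1 | B=b, x) for one fixed setting b


def exact(lam, gam):
    """Right-hand side of (6.126): sum over x of mu(x) P(F=lam|a,x) P(G=gam|b,x)."""
    total = 0.0
    for x, weight in MU.items():
        pf = P_F1[x] if lam == 1 else 1 - P_F1[x]
        pg = P_G1[x] if gam == 1 else 1 - P_G1[x]
        total += weight * pf * pg
    return total


def simulate(lam, gam, samples=200_000, seed=0):
    """Frequency of (F~_a, G~_b) = (lam, gam) under ds dt dmu(x) on [0,1] x [0,1] x X."""
    rng = random.Random(seed)
    xs = rng.choices(list(MU), weights=list(MU.values()), k=samples)
    hits = 0
    for x in xs:
        s, t = rng.random(), rng.random()
        f = 1 if s <= P_F1[x] else 0  # indicator of [0, P(F=1|A=a,x)] evaluated at s
        g = 1 if t <= P_G1[x] else 0  # indicator of [0, P(G=1|B=b,x)] evaluated at t
        hits += (f == lam and g == gam)
    return hits / samples


if __name__ == "__main__":
    for lam in (0, 1):
        for gam in (0, 1):
            print(lam, gam, round(exact(lam, gam), 4), round(simulate(lam, gam), 4))
```

The point of the exercise is that the auxiliary coordinates *s* and *t* turn the stochastic theory into a deterministic one on the larger space, at the price of two extra hidden variables.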

Let us note that since in Bell's second theorem the settings (*a*,*b*) are treated as free parameters to begin with, the difference between *X* and *Z* evaporates, so that in the above formulae one might as well have replaced (*X*,μ) by the space $(X\_Z,\mu\_Z)$ that describes all relevant degrees of freedom *except the settings* (i.e., the experimentalist, in either human or machine form). Either way, Bell's locality condition may be disentangled into the following conditions (introduced by Jarrett and Shimony):

1. *Parameter Independence* (PI):

$$P(\lambda|a,b,x) = P(\lambda|a,x);\tag{6.128}$$

$$P(\gamma|a,b,\mathbf{x}) = P(\gamma|b,\mathbf{x});\tag{6.129}$$

2. *Outcome Independence* (OI):

$$P(\lambda|a,b,\gamma,\mathbf{x}) = P(\lambda|a,b,\mathbf{x});\tag{6.130}$$

$$P(\gamma|a,b,\lambda,\mathbf{x}) = P(\gamma|a,b,\mathbf{x}),\tag{6.131}$$

where we have abbreviated *P*(*F* = λ|*A* = *a*,*B* = *b*, *x*) by *P*(λ|*a*,*b*, *x*), etc., and have used the following notation (which states identities in case one has (6.91) - (6.93)):


$$P(\lambda|a,b,x) \equiv \sum\_{\gamma} P(\lambda,\gamma|a,b,x);\tag{6.132}$$

$$P(\gamma|a,b,x) \equiv \sum\_{\lambda} P(\lambda,\gamma|a,b,x);\tag{6.133}$$

$$P(\lambda|a,b,\gamma,x) \equiv \frac{P(\lambda,\gamma|a,b,x)}{P(\gamma|a,b,x)};\tag{6.134}$$

$$P(\gamma|a,b,\lambda,x) \equiv \frac{P(\lambda,\gamma|a,b,x)}{P(\lambda|a,b,x)}.\tag{6.135}$$

It is easy to see that *Bell locality is equivalent to the conjunction of* PI *and* OI.

Note that the former (PI), akin to *Locality*, is a hidden or 'subsurface' version of the *no signaling* property of the 'surface' probabilities, which states that

$$P(\lambda|a,b) \equiv \sum\_{\gamma} P(\lambda, \gamma|a,b)$$

is independent of *b* (and *vice versa*). But a violation of PI only leads to signaling if *x* can be operationally controlled, similar to the way in which experimental physicists prepare quantum states ψ. Hence it is reassuring that quantum mechanics satisfies PI if we see the quantum state ψ as a hidden variable: assuming

$$P(\lambda, \gamma | a, b, x) = P\_{\Psi\_0}(F = \lambda, G = \gamma | A = a, B = b),\tag{6.136}$$

as computed in (6.102) - (6.105), PI is valid but OI is not. First, for λ = 0 or λ = 1,

$$P(\lambda|a,b,x) = \sum\_{\gamma=0,1} P\_{\Psi\_0}(F=\lambda, G=\gamma|A=a, B=b) = \frac{1}{2}\cos^2(a-b) + \frac{1}{2}\sin^2(a-b) = \frac{1}{2},\tag{6.137}$$

which is independent of *b*, and likewise $P(\gamma|a,b,x) = \frac{1}{2}$, independently of *a*. This yields PI, which a similar computation shows to be true for any quantum state. On the other hand, given this result, OI would require

$$P\_{\Psi\_0}(F=\lambda, G=\gamma|A=a, B=b) = P\_{\Psi\_0}(F=\lambda|A=a) \cdot P\_{\Psi\_0}(G=\gamma|B=b),$$

which is false, since by (6.102) - (6.105), Alice's and Bob's outcomes are correlated.
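The two conditions can be tested mechanically on any table of joint probabilities. In the sketch below, the quantum table is taken from (6.102) - (6.105), PI is checked via the marginals (6.132) - (6.133), and OI is checked in the product form *P*(λ,γ|*a*,*b*,*x*) = *P*(λ|*a*,*b*,*x*) · *P*(γ|*a*,*b*,*x*), which is equivalent to (6.130) - (6.131) whenever the denominators in (6.134) - (6.135) are nonzero and avoids division by zero otherwise. The function names and the sample angles are our own choices.

```python
import math


def quantum_joint(lam, gam, a, b):
    """P(lam, gam | a, b, x) as in (6.102) - (6.105); it does not depend on x."""
    same = 0.5 * math.cos(a - b) ** 2
    different = 0.5 * math.sin(a - b) ** 2
    return same if lam == gam else different


def marginal_alice(joint, lam, a, b):
    """P(lam | a, b, x), cf. (6.132)."""
    return sum(joint(lam, gam, a, b) for gam in (0, 1))


def marginal_bob(joint, gam, a, b):
    """P(gam | a, b, x), cf. (6.133)."""
    return sum(joint(lam, gam, a, b) for lam in (0, 1))


def parameter_independence(joint, angles, tol=1e-12):
    """PI: Alice's marginal is independent of b, and Bob's is independent of a."""
    alice = all(
        abs(marginal_alice(joint, lam, a, b1) - marginal_alice(joint, lam, a, b2)) < tol
        for lam in (0, 1) for a in angles for b1 in angles for b2 in angles
    )
    bob = all(
        abs(marginal_bob(joint, gam, a1, b) - marginal_bob(joint, gam, a2, b)) < tol
        for gam in (0, 1) for b in angles for a1 in angles for a2 in angles
    )
    return alice and bob


def outcome_independence(joint, angles, tol=1e-12):
    """OI in product form: the joint probability factorizes into its marginals."""
    return all(
        abs(joint(lam, gam, a, b)
            - marginal_alice(joint, lam, a, b) * marginal_bob(joint, gam, a, b)) < tol
        for lam in (0, 1) for gam in (0, 1) for a in angles for b in angles
    )


ANGLES = [0.0, math.pi / 8, math.pi / 4, math.pi / 3]

if __name__ == "__main__":
    print(parameter_independence(quantum_joint, ANGLES))
    print(outcome_independence(quantum_joint, ANGLES))
```

For comparison, a table with all four entries equal to 1/4 passes both tests, whereas the quantum table passes the first and fails the second.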

Hence Bell locality is violated by quantum mechanics, but this does not imply that "quantum mechanics is nonlocal" (as some say). Bell's is a very specific locality condition invented as a constraint on hidden variable theories. In another important sense, viz. *Einstein locality*, quantum mechanics *is* local, in that observables with spacelike separated localization regions commute (this is the case in quantum field theory, but also in any bipartite experiment of the type considered here, where Alice's operators commute with Bob's just by definition of the tensor product).

On the other hand, deterministic theories, which in the present context are defined as those for which all conditional probabilities like *P*(λ, γ|*a*,*b*, *x*) are either zero or one (in which case one may introduce random variables reproducing these probabilities), violate PI but satisfy OI, at least if they reproduce the Born probabilities (such as Bohmian mechanics). Hence such theories violate Bell locality.

Finally, Bell-type inequalities like (6.117) also give information about quantum mechanics itself, particularly about the degree of entanglement of states. Let $H\_1$ and $H\_2$ be Hilbert spaces, with tensor product $H\_1 \otimes H\_2$. A unit vector $\psi \in H\_1 \otimes H\_2$ is called *uncorrelated* if it is of the form $\psi = \varphi\_1 \otimes \varphi\_2$, where $\varphi\_k \in H\_k$ are unit vectors, *k* = 1,2, and *correlated* otherwise. Clearly, the vectors (6.34) and (6.101) used in the experiments so far are correlated. The simplest result is then as follows.

Theorem 6.19. *Let* $a\_1$ *and* $a\_2$ *be self-adjoint operators on* $H\_1$*, and let* $b\_1$ *and* $b\_2$ *be self-adjoint operators on* $H\_2$*, each with spectrum contained in* [−1,1] *(equivalently* $\|a\_1\| \le 1$*, etc.). Let* ψ *be a unit vector in* $H\_1 \otimes H\_2$*, and define two-point functions*

$$
\langle F\_i G\_j \rangle = \langle \Psi, a\_i \otimes b\_j \Psi \rangle. \tag{6.138}
$$

*If* ψ *is uncorrelated, then the Bell inequality* (6.117) *holds.*

*Proof.* This follows from the factorization property

$$\langle F\_i G\_j \rangle = \langle \varphi\_1 \otimes \varphi\_2, a\_i \otimes b\_j \, \varphi\_1 \otimes \varphi\_2 \rangle = \langle \varphi\_1, a\_i \varphi\_1 \rangle \cdot \langle \varphi\_2, b\_j \varphi\_2 \rangle = \langle F\_i \rangle \cdot \langle G\_j \rangle,\tag{6.139}$$

where $\langle F\_i \rangle = \langle \varphi\_1, a\_i \varphi\_1 \rangle$ and $\langle G\_j \rangle = \langle \varphi\_2, b\_j \varphi\_2 \rangle$. For either sign, this property yields

$$
\langle F\_2(G\_1 - G\_2) \rangle = \langle F\_2 \rangle \langle G\_1 \rangle (1 \pm \langle F\_1 \rangle \langle G\_2 \rangle) - \langle F\_2 \rangle \langle G\_2 \rangle (1 \pm \langle F\_1 \rangle \langle G\_1 \rangle). \tag{6.140}
$$

The spectral assumption implies that $|\langle F\_i \rangle| \le 1$ and $|\langle G\_j \rangle| \le 1$, which will be used directly below, as well as its consequence $|1 \pm \langle F\_i \rangle \langle G\_j \rangle| = 1 \pm \langle F\_i \rangle \langle G\_j \rangle$. Hence

$$\begin{aligned} \left| \langle F\_2(G\_1 - G\_2) \rangle \right| &\le \left| 1 \pm \langle F\_1 \rangle \langle G\_2 \rangle \right| + \left| 1 \pm \langle F\_1 \rangle \langle G\_1 \rangle \right| \\ &= 1 \pm \langle F\_1 \rangle \langle G\_2 \rangle + 1 \pm \langle F\_1 \rangle \langle G\_1 \rangle \\ &= 2 \pm \langle F\_1(G\_1 + G\_2) \rangle. \end{aligned} \tag{6.141}$$

Similarly,

$$|\langle F\_1(G\_1 + G\_2) \rangle| \le 2 \pm \langle F\_2(G\_1 - G\_2) \rangle,\tag{6.142}$$

so that, writing $\Phi = \langle F\_1 G\_1 \rangle + \langle F\_1 G\_2 \rangle + \langle F\_2 G\_1 \rangle - \langle F\_2 G\_2 \rangle$, for either sign ± we have

$$|\Phi| \le |\langle F\_1(G\_1 + G\_2) \rangle| + |\langle F\_2(G\_1 - G\_2) \rangle| \le 4 \pm \Phi. \tag{6.143}$$

If Φ ≥ 0 we choose the minus sign, whereas for Φ < 0 we take the plus sign. Either way, we obtain |Φ| ≤ 2, which is the inequality (6.117). □
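A numerical illustration of the contrast between uncorrelated and correlated states may be helpful. In the sketch below, uncorrelated states enter only through the factorization (6.139), so it suffices to scan the one-point functions over a grid in [−1,1]. For the correlated photon state (6.101) we relabel the outcomes as +1 (pass) and −1 (fail), an assumption of ours, so that (6.106) - (6.107) give the two-point function cos²(*a* − *b*) − sin²(*a* − *b*) = cos 2(*a* − *b*). The particular angles are a standard choice and are not taken from the text.

```python
import math
from itertools import product


def chsh(corr):
    """|<F1G1> + <F1G2> + <F2G1> - <F2G2>| for a two-point function corr(i, j)."""
    return abs(corr(1, 1) + corr(1, 2) + corr(2, 1) - corr(2, 2))


def max_chsh_for_product_states(steps=10):
    """Maximum of the CHSH expression over a grid of one-point functions
    <F_i>, <G_j> in [-1, 1], using the factorization (6.139)."""
    grid = [-1 + 2 * k / steps for k in range(steps + 1)]
    best = 0.0
    for f1, f2, g1, g2 in product(grid, repeat=4):
        f, g = {1: f1, 2: f2}, {1: g1, 2: g2}
        best = max(best, chsh(lambda i, j: f[i] * g[j]))
    return best


# Settings for the correlated photon state (illustrative choice of angles).
ALICE = {1: 0.0, 2: math.pi / 4}
BOB = {1: math.pi / 8, 2: 7 * math.pi / 8}


def photon_corr(i, j):
    """<F_i G_j> = P(F=G) - P(F!=G) = cos(2(a_i - b_j)), from (6.106) - (6.107)."""
    return math.cos(2 * (ALICE[i] - BOB[j]))


if __name__ == "__main__":
    print(max_chsh_for_product_states())
    print(chsh(photon_corr), 2 * math.sqrt(2))
```

The grid maximum stays at the bound 2 of (6.117), whereas the correlated state reaches 2√2 for these settings.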

This result is actually much more general (as hinted at by the way that the proof only uses the uncorrelated vector state $\psi = \varphi\_1 \otimes \varphi\_2$). The simplest generalization is to replace pure states by mixed states, where we say that a density operator ρ on $H\_1 \otimes H\_2$ is *uncorrelated* if it is of the form $\rho = \sum\_i p\_i \, \rho\_1^{(i)} \otimes \rho\_2^{(i)}$, where the $p\_i$ are probabilities and each $\rho\_k^{(i)}$ is a density matrix on $H\_k$, *k* = 1,2. Then all uncorrelated density matrices satisfy the inequality (6.117). Even more generally, uncorrelated states on C\*-algebras or von Neumann algebras *A*⊗*B* satisfy (6.117), see Notes.

#### 6.6 The Colbeck–Renner Theorem

One may try to strengthen Bell's second theorem by weakening its assumptions. A remarkable result in this direction states that, roughly speaking, any probabilistic hidden variable theory that satisfies *Freedom* and *Parameter Independence* and is compatible with quantum mechanics adds nothing to quantum mechanics. In other words, it appears that quantum mechanics "cannot be extended", or "is complete".

In fact, the result turns out to be more modest than this summary suggests, since the reasoning required to prove the claim hinges on certain assumptions which are satisfied by quantum mechanics itself, but might seem unnatural for a hidden variable theory. In any case, we have to state our notation and assumptions very clearly.

Definition 6.20. *A* hidden variable theory T *underlying quantum mechanics consists of a measurable space* (*X*,Σ) *whose points x label conditional probabilities*

$$P(a\_1 = \lambda\_1, \dots, a\_n = \lambda\_n | \mathbf{x}) \equiv P(\mathbf{a} = \lambda | \mathbf{x})$$

*for the possible outcomes* $\lambda = (\lambda\_1,\ldots,\lambda\_n)$ *of a measurement of any family* $\mathbf{a} = (a\_1,\ldots,a\_n)$ *of n commuting self-adjoint operators on any Hilbert space H.*

*These formal conditional probabilities are* a priori *only supposed to satisfy*

$$0 \le P(\mathbf{a} = \lambda | x) \le 1;\tag{6.144}$$

$$\sum\_{\lambda} P(\mathbf{a} = \lambda | \mathbf{x}) = 1.\tag{6.145}$$

*Apart from these probabilities, for each Hilbert space H and any pure state* $e \in \mathcal{P}\_1(H)$*, the theory* T *yields a classical state* $\mu\_e$*, i.e., a probability measure on X.*

As the notation indicates, $\mu\_e$ depends on *e* only and hence is independent of $\mathbf{a}$ and λ. From the point of view of T , a quantum state *is* a probability measure on *X*! In what follows we assume for simplicity that *H* is finite-dimensional, so that $e = e\_\psi$ for some unit vector ψ ∈ *H*. With slight abuse of notation we then write $\mu\_\psi$ for $\mu\_{e\_\psi}$.

An important special case will be the bipartite setting $H = H\_1 \otimes H\_2$, where Alice and Bob measure self-adjoint operators *X* and *Y* on $H\_1$ and $H\_2$, respectively, so that

$$n = 2, \ a\_1 = X \otimes 1\_{H\_2}, \ a\_2 = 1\_{H\_1} \otimes Y.$$

We then introduce settings *c* = (*a*,*b*), as in the previous sections, so that we typically look at expressions like $P(X\_a = \lambda\_1, Y\_b = \lambda\_2 | x)$. The other case of interest will simply be *n* = 1 with $a\_1 \equiv a$, $\lambda\_1 \equiv \lambda$; indeed, this will be the case in the statement of the theorem (the bipartite case playing a role only in the proof, though a crucial one!).

The following notation will be quite important to the argument. An equality

$$P\_{\Psi}(\mathbf{a} = \lambda | x) = \alpha(x),\tag{6.146}$$

where α : *X* → [0,1] is measurable (often even constant), abbreviates:

*P*(a = λ|*x*) = α(*x*) for almost every *x* with respect to the measure μψ.

That is, there is a subset $X' \subset X$ such that $\mu\_\psi(X') = 0$ and $P(\mathbf{a} = \lambda|x) = \alpha(x)$ holds for any $x \in X \setminus X'$. If *X* is finite, this simply means that the equality holds for any *x* for which $\mu\_\psi(\{x\}) > 0$. Since this notation may render equalities like

$$P\_{\Psi}(\mathbf{a} = \lambda | x) = P\_{\varphi}(\mathbf{a}' = \lambda' | x),\tag{6.147}$$

ambiguous, we explicitly define (6.147) as the double implication

$$P\_{\Psi}(\mathbf{a} = \lambda | x) = \alpha(x) \Leftrightarrow P\_{\varphi}(\mathbf{a}' = \lambda' | x) = \alpha(x) .$$

Furthermore, for ε → 0 we write

$$P\_{\Psi}(\mathbf{a} = \lambda | x) \stackrel{\varepsilon}{\approx} P\_{\varphi}(\mathbf{a}' = \lambda' | x) \Leftrightarrow P\_{\Psi}(\mathbf{a} = \lambda | x) = P\_{\varphi}(\mathbf{a}' = \lambda' | x) + O(\sqrt{\varepsilon}), \tag{6.148}$$

as well as

$$
\Psi \stackrel{\varepsilon}{\approx} \varphi \Leftrightarrow (1 - \varepsilon) \le |\langle \Psi, \varphi \rangle| \le 1. \tag{6.149}
$$

We are now ready to state our assumptions for the Colbeck–Renner Theorem:

• *Compatibility with Quantum Mechanics (CQ):* for any unit vector ψ ∈ *H*,

$$\int\_{X} d\mu\_{\Psi}(\mathbf{x}) \, P(\mathbf{a} = \lambda | \mathbf{x}) = p\_{\Psi}(\mathbf{a} = \lambda), \tag{6.150}$$

where the quantum-mechanical prediction *p*ψ(a = λ) is given by the Born rule

$$p\_{\Psi}(\mathbf{a} = \lambda) = \langle \Psi, e\_{\lambda\_1}^{(1)} \cdots e\_{\lambda\_n}^{(n)} \Psi \rangle,\tag{6.151}$$

cf. (2.21), where $e\_{\lambda\_i}^{(i)}$ is the spectral projection on the eigenspace $H\_{\lambda\_i} \subset H$ of $a\_i$.

• *Unitary Invariance (UI):* for any unit vector ψ ∈ *H* and unitary *u* on *H*,

$$P\_{u\Psi}(\mathbf{a} = \lambda|x) = P\_{\Psi}(u^{-1}\mathbf{a}u = \lambda|x).\tag{6.152}$$

• *Continuity of Probabilities (CP):* if $\psi \stackrel{\varepsilon}{\approx} \varphi$, then $P\_\psi(\mathbf{a} = \lambda|x) \stackrel{\varepsilon}{\approx} P\_\varphi(\mathbf{a} = \lambda|x)$.

In the remaining axioms, $H = H\_1 \otimes H\_2$, and *a* and *b* are self-adjoint operators on $H\_1$ and $H\_2$, respectively (duly identified with operators $a \otimes 1\_{H\_2}$ and $1\_{H\_1} \otimes b$ on *H*).

• *Parameter Independence (PI):*

$$\sum\_{\gamma \in \sigma(b)} P(a = \lambda, b = \gamma | \mathbf{x}) = P(a = \lambda | \mathbf{x});\tag{6.153}$$

$$\sum\_{\lambda \in \sigma(a)} P(a=\lambda, b=\gamma|\mathbf{x}) = P(b=\gamma|\mathbf{x}).\tag{6.154}$$

• *Product Extension (PE):* for any pair of states $\psi\_1 \in H\_1$, $\psi\_2 \in H\_2$,

$$P\_{\Psi\_1}(a=\lambda|x) = P\_{\Psi\_1 \otimes \Psi\_2}(a=\lambda|x).\tag{6.155}$$

• *Schmidt Extension (SE):* if $\upsilon\_i \in H\_1$ ($i = 1,\ldots,\dim(H\_1)$) are eigenstates of *a*, then for arbitrary orthogonal states $u\_i \in H\_2$ and coefficients $c\_i > 0$ with $\sum\_i c\_i^2 = 1$,

$$P\_{\sum\_i c\_i \upsilon\_i}(a=\lambda|x) = P\_{\sum\_i c\_i \upsilon\_i \otimes u\_i}(a=\lambda|x).\tag{6.156}$$

Note that PI makes sense, because (6.151) and (6.150) imply that for $p\_\psi(\mathbf{a} = \lambda)$ to be nonzero we must have $\lambda\_i \in \sigma(a\_i)$ for each *i*. All assumptions are satisfied by quantum mechanics itself (seen as a hidden variable theory with ψ as the "hidden" variable *x*). In the context of hidden variable theories, though, one might doubt the plausibility of *UI*, *CP*, and *SE*. But we need all these assumptions to prove:

Theorem 6.21. *If* T *satisfies* CQ*,* UI*,* CP*,* PI*,* PE*, and* SE*, then for any (finite-dimensional) Hilbert space H, unit vector* ψ ∈ *H, and operator* $a \in B(H)\_{\mathrm{sa}}$*,*

$$P\_{\Psi}(a=\lambda|\mathbf{x}) = p\_{\Psi}(a=\lambda). \tag{6.157}$$

In other words, the hidden variable *x* is even more hidden than expected, since knowing its value has no effect on the probabilities for the outcomes of experiments.

*Proof.* We first assume (without loss of generality) that *a* is nondegenerate as a self-adjoint matrix, in that it has distinct eigenvalues $(\lambda\_1,\ldots,\lambda\_{\dim(H)})$; this assumption will be removed at the end of the proof. The proof consists of three steps.

1. The theorem holds for $H = \mathbb{C}^2$ and any pair (*a*,ψ) for which

$$p\_{\Psi}(a=\lambda\_{1}) = p\_{\Psi}(a=\lambda\_{2}) = 1/2.\tag{6.158}$$

This only requires assumptions *CQ*, *PI*, and *SE*.

2. The theorem holds for $H = \mathbb{C}^l$, *l* < ∞ arbitrary, and any pair (*a*,ψ) for which

$$p\_{\Psi}(a=\lambda\_{1})=\cdots=p\_{\Psi}(a=\lambda\_{l})=1/l.\tag{6.159}$$

This is just a slight extension of step 1 and uses the same three assumptions.

3. The theorem holds in general. This requires all assumptions (as well as step 2).

*Proof of step 1*. Let *H* = C<sup>2</sup>, with basis (υ<sub>1</sub>,υ<sub>2</sub>) of eigenvectors of *a*, so that ψ ∈ C<sup>2</sup> may be written as

$$
\psi = (\upsilon\_1 + \upsilon\_2) / \sqrt{2}. \tag{6.160}
$$

Without loss of generality, we may assume that λ<sub>1</sub> = 1 and λ<sub>2</sub> = −1. We now relabel *a* as *a*<sub>0</sub> and extend it to a family of operators (*a*<sub>*k*</sub>)<sub>*k*=0,1,...,2*N*−1</sub> by fixing an integer *N* > 1, putting θ<sub>*k*</sub> = *k*π/2*N*, and defining

$$c\_k = e\_{\theta\_k+\pi} - e\_{\theta\_k},\tag{6.161}$$

where, for any angle θ ∈ [0,2π], the operator *e*<sub>θ</sub> = |θ⟩⟨θ| is the orthogonal projection onto the one-dimensional subspace spanned by the unit vector

$$|\boldsymbol{\theta}\rangle = \sin(\theta/2) \cdot \boldsymbol{\upsilon}\_1 + \cos(\theta/2) \cdot \boldsymbol{\upsilon}\_2. \tag{6.162}$$

In the bipartite setting, we have operators *a*<sub>*k*</sub> = *c*<sub>*k*</sub> ⊗ 1<sub>2</sub> and *b*<sub>*k*</sub> = 1<sub>2</sub> ⊗ *c*<sub>*k*</sub> on C<sup>2</sup> ⊗ C<sup>2</sup>, as well as a maximally correlated (Bell) state ψ<sub>*AB*</sub> ∈ C<sup>2</sup> ⊗ C<sup>2</sup>, given by

$$
\psi\_{AB} = \frac{1}{\sqrt{2}} (\upsilon\_1 \otimes \upsilon\_1 + \upsilon\_2 \otimes \upsilon\_2). \tag{6.163}
$$

Using assumptions *PI* and *SE*, we then have, for *i* = 1,2, with λ<sub>1</sub> = 1 and λ<sub>2</sub> = −1,

$$P\_{\Psi}(a=\lambda\_{i}|\mathbf{x}) = P\_{\Psi\_{AB}}(a\_{0}=\lambda\_{i}|\mathbf{x}).\tag{6.164}$$

The quantum-mechanical prediction is

$$p\_{\Psi\_{\text{AB}}}(a\_0 = 1) = p\_{\Psi\_{\text{AB}}}(a\_0 = -1) = \frac{1}{2}.\tag{6.165}$$

Our goal is to show that also for each *x* ∈ *X*, knowing *x* is irrelevant in that

$$P\_{\Psi\_{\text{AB}}}(a\_0 = 1|\mathbf{x}) = P\_{\Psi\_{\text{AB}}}(a\_0 = -1|\mathbf{x}) = \frac{1}{2}.\tag{6.166}$$

To this effect we introduce the combination of probabilities

$$I^{(N)}(\mathbf{x}) = P(a\_0 = b\_{2N-1}|\mathbf{x}) + \sum\_{k \in K\_N, l \in L\_N, |k-l|=1} P(a\_k \neq b\_l|\mathbf{x}),\tag{6.167}$$

where *K*<sub>*N*</sub> = {0,2,...,2*N*−2} and *L*<sub>*N*</sub> = {1,3,...,2*N*−1}. Our first inequality is

$$\begin{aligned} |P(a\_k = \lambda\_i | \mathbf{x}) - P(b\_l = \lambda\_i | \mathbf{x})| &= |P(a\_k = \lambda\_i, b\_l = \lambda\_i | \mathbf{x}) + P(a\_k = \lambda\_i, b\_l \neq \lambda\_i | \mathbf{x}) \\ &\quad - P(a\_k = \lambda\_i, b\_l = \lambda\_i | \mathbf{x}) - P(a\_k \neq \lambda\_i, b\_l = \lambda\_i | \mathbf{x})| \\ &= |P(a\_k = \lambda\_i, b\_l \neq \lambda\_i | \mathbf{x}) - P(a\_k \neq \lambda\_i, b\_l = \lambda\_i | \mathbf{x})| \\ &\leq P(a\_k = \lambda\_i, b\_l \neq \lambda\_i | \mathbf{x}) + P(a\_k \neq \lambda\_i, b\_l = \lambda\_i | \mathbf{x}) \\ &= P(a\_k \neq b\_l | \mathbf{x}), \end{aligned} \tag{6.168}$$

where *i* = 1,2, and we used *PI*. This implies a second inequality: since *a*<sub>2*N*</sub> = −*a*<sub>0</sub>,

$$\begin{aligned} |P(a\_0 = 1 | \mathbf{x}) - P(a\_0 = -1 | \mathbf{x})| &= |P(a\_0 = 1 | \mathbf{x}) - P(a\_{2N} = 1 | \mathbf{x})| \\ &\le \sum\_{k, l, |k-l|=1} |P(a\_k = 1 | \mathbf{x}) - P(b\_l = 1 | \mathbf{x})| \\ &\le \sum\_{k, l, |k-l|=1} P(a\_k \neq b\_l | \mathbf{x}) \le I^{(N)}(\mathbf{x}). \end{aligned}$$

Integrating this with respect to the measure μ<sub>ψ<sub>*AB*</sub></sub> and using *CQ* gives

$$\int\_{X} d\mu\_{\Psi \text{AB}}(\mathbf{x}) \left| P(a\_0 = 1 | \mathbf{x}) - P(a\_0 = -1 | \mathbf{x}) \right| \le \int\_{X} d\mu\_{\Psi \text{AB}}(\mathbf{x}) \, I^{(N)}(\mathbf{x}) = I^{(N)}\_{\Psi \text{AB}}. \tag{6.169}$$

We wish to invoke the corresponding quantum-mechanical expression, defined by

$$I\_{\Psi\_{\rm AB}}^{(N)} = p\_{\Psi\_{\rm AB}}(a\_0 = b\_{2N-1}) + \sum\_{k \in K\_N, l \in L\_N, |k-l|=1} p\_{\Psi\_{\rm AB}}(a\_k \neq b\_l). \tag{6.170}$$

A straightforward calculation shows that this expression is equal to

$$I\_{\psi\_{AB}}^{(N)} = 2N \sin^2(\pi/4N). \tag{6.171}$$

Since lim<sub>*N*→∞</sub> *I*<sup>(*N*)</sup><sub>ψ<sub>*AB*</sub></sub> = 0 (indeed, 2*N* sin<sup>2</sup>(π/4*N*) ≤ π<sup>2</sup>/8*N*), letting *N* → ∞ in (6.169) therefore yields (6.166). From (6.164) we then obtain (6.158).
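The "straightforward calculation" behind (6.171) can be checked numerically. The following is a minimal sketch, assuming the conventions (6.161)–(6.163) and assigning outcome +1 to |θ<sub>*k*</sub>+π⟩ and outcome −1 to |θ<sub>*k*</sub>⟩; the function names `ket`, `p_equal`, and `chained_sum` are illustrative and not taken from the text.

```python
import numpy as np

def ket(theta):
    """Unit vector |theta> = sin(theta/2) v1 + cos(theta/2) v2 of (6.162)."""
    return np.array([np.sin(theta / 2), np.cos(theta / 2)])

def p_equal(theta_a, theta_b, bell):
    """Born probability that c(theta_a) x 1 and 1 x c(theta_b) give equal outcomes.

    Outcome +1 belongs to |theta + pi>, outcome -1 to |theta>, as in (6.161).
    """
    both_plus = np.kron(ket(theta_a + np.pi), ket(theta_b + np.pi)) @ bell
    both_minus = np.kron(ket(theta_a), ket(theta_b)) @ bell
    return both_plus**2 + both_minus**2

def chained_sum(N):
    """Evaluate the right-hand side of (6.170) for the Bell state (6.163)."""
    bell = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
    theta = [k * np.pi / (2 * N) for k in range(2 * N)]
    total = p_equal(theta[0], theta[2 * N - 1], bell)
    for k in range(0, 2 * N, 2):        # k in K_N
        for l in range(1, 2 * N, 2):    # l in L_N
            if abs(k - l) == 1:
                total += 1.0 - p_equal(theta[k], theta[l], bell)
    return total

for N in (2, 10, 100):
    print(N, chained_sum(N), 2 * N * np.sin(np.pi / (4 * N)) ** 2)
```

The two printed columns should agree up to rounding, and both decrease roughly like π<sup>2</sup>/8*N*, which is the decay used when letting *N* → ∞.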

*Proof of step 2*. Let *H* = C<sup>*l*</sup> and let (υ<sub>*i*</sub>)<sup>*l*</sup><sub>*i*=1</sub> be an orthonormal basis of eigenvectors of *a*, with corresponding eigenvalues λ<sub>*i*</sub>, and phase factors for the eigenvectors υ<sub>*i*</sub> such that *c*<sub>*i*</sub> > 0 (and of course, ∑<sub>*i*</sub> *c*<sub>*i*</sub><sup>2</sup> = 1) in the expansion

$$
\psi = \sum\_{i} c\_{i} \upsilon\_{i}.\tag{6.172}
$$

The case of interest will be *c*<sub>1</sub> = ··· = *c*<sub>*l*</sub> = 1/√*l*, but first we merely assume that *c*<sub>1</sub> = *c*<sub>2</sub> (the same reasoning applies to any other pair), with λ<sub>1</sub> = 1 and λ<sub>2</sub> = −1 (which involves no loss of generality either and just simplifies the notation). The other positive coefficients *c*<sub>*i*</sub> are arbitrary. Generalizing (6.166), we will show that

$$P\_{\Psi}(a=1|\mathbf{x}) = P\_{\Psi}(a=-1|\mathbf{x}).\tag{6.173}$$

This shows that if two Born probabilities defined by some quantum state *e*<sub>ψ</sub> are equal, then the underlying hidden variable probabilities must be equal μ<sub>ψ</sub>-a.e., too. Eq. (6.159) immediately follows from this result by taking all *c*<sub>*i*</sub> to be equal.

As in step 1, we pass to the bipartite setting, introducing two copies of *H* = C*<sup>l</sup>* denoted by *HA* = *HB* = C*<sup>l</sup>* , and define the correlated state

$$\Psi\_{AB} = \sum\_{i} c\_{i} \cdot \mathfrak{v}\_{i} \otimes \mathfrak{v}\_{i} \tag{6.174}$$

in *H*<sub>*A*</sub> ⊗ *H*<sub>*B*</sub>. Eq. (6.164) again follows from assumptions *PI* and *SE*. Throughout the argument of step 1, we now replace each probability *P*(*a*<sub>*k*</sub> = λ<sub>*i*</sub>, *b*<sub>*l*</sub> = γ<sub>*j*</sub>|*x*) by an adapted probability *P*<sup>(1)</sup>(*a*<sub>*k*</sub> = λ<sub>*i*</sub>, *b*<sub>*l*</sub> = γ<sub>*j*</sub>|*x*), defined as the conditional probability

$$\begin{split}P^{(1)}(a\_k = \lambda\_i, b\_l = \gamma\_j|\mathbf{x}) &= P(a\_k = \lambda\_i, b\_l = \gamma\_j \,|\, |\lambda\_i| = |\gamma\_j| = 1, \mathbf{x}) \\ &= \frac{P(a\_k = \lambda\_i, b\_l = \gamma\_j, |\lambda\_i| = |\gamma\_j| = 1|\mathbf{x})}{P(|\lambda\_i| = |\gamma\_j| = 1|\mathbf{x})}, \end{split} \tag{6.175}$$

for all *x* for which *P*(|λ<sub>*i*</sub>| = |γ<sub>*j*</sub>| = 1|*x*) > 0, whereas

$$P^{(1)}(a\_k = \lambda\_i, b\_l = \gamma\_j|\mathbf{x}) = 0 \tag{6.176}$$

whenever *P*(|λ<sub>*i*</sub>| = |γ<sub>*j*</sub>| = 1|*x*) = 0. The same argument then yields (6.169), with *P* replaced by *P*<sup>(1)</sup> but with the same right-hand side. As in step 1,

$$P\_{\psi\_{AB}}^{(1)}(a\_0 = 1|\mathbf{x}) = P\_{\psi\_{AB}}^{(1)}(a\_0 = -1|\mathbf{x}),\tag{6.177}$$

which implies that

$$P\_{\psi\_{AB}}(a\_0 = 1 | \mathbf{x}) = P\_{\psi\_{AB}}(a\_0 = -1 | \mathbf{x}),\tag{6.178}$$

either because both sides vanish (if *P*(|λ<sub>*i*</sub>| = |γ<sub>*j*</sub>| = 1|*x*) = 0), or because (in the opposite case) the denominator *P*(|λ<sub>*i*</sub>| = |γ<sub>*j*</sub>| = 1|*x*) cancels from both sides of (6.177).

Combined with (6.164), eq. (6.178) proves (6.173) and hence establishes step 2.

*Proof of step 3*. This is the most difficult step in the proof, relying on a technique wittily called *embezzlement* (which we only need for maximally entangled states). We will deal with three Hilbert spaces, namely *H* = C<sup>*l*</sup>, *H*′ = C<sup>*m*</sup>, and *H*″ = C<sup>*n*</sup> (where *n* = *m*<sup>*N*</sup> for some large *N*, see below), each with some fixed orthonormal basis (υ<sub>*i*</sub>)<sup>*l*</sup><sub>*i*=1</sub>, (υ′<sub>*j*</sub>)<sup>*m*</sup><sub>*j*=1</sub>, and (υ″<sub>*k*</sub>)<sup>*n*</sup><sub>*k*=1</sub>, respectively. Given a further number *m*<sub>*i*</sub> ≤ *m*, we now list the *nm* basis vectors υ″<sub>*k*</sub> ⊗ υ′<sub>*j*</sub> of *H*″ ⊗ *H*′ in two different orders:

$$\begin{split} 1. \quad & \upsilon\_{1}^{\prime\prime} \otimes \upsilon\_{1}^{\prime}, \ldots, \upsilon\_{n}^{\prime\prime} \otimes \upsilon\_{1}^{\prime}, \upsilon\_{1}^{\prime\prime} \otimes \upsilon\_{2}^{\prime}, \ldots, \upsilon\_{n}^{\prime\prime} \otimes \upsilon\_{2}^{\prime}, \ldots, \upsilon\_{1}^{\prime\prime} \otimes \upsilon\_{m}^{\prime}, \ldots, \upsilon\_{n}^{\prime\prime} \otimes \upsilon\_{m}^{\prime}; \\ 2. \quad & \upsilon\_{1}^{\prime\prime} \otimes \upsilon\_{1}^{\prime}, \ldots, \upsilon\_{1}^{\prime\prime} \otimes \upsilon\_{m\_{i}}^{\prime}, \upsilon\_{2}^{\prime\prime} \otimes \upsilon\_{1}^{\prime}, \ldots, \upsilon\_{2}^{\prime\prime} \otimes \upsilon\_{m\_{i}}^{\prime}, \ldots, \upsilon\_{n}^{\prime\prime} \otimes \upsilon\_{1}^{\prime}, \ldots, \upsilon\_{n}^{\prime\prime} \otimes \upsilon\_{m\_{i}}^{\prime}, \ldots, \end{split}$$

where the remaining vectors (i.e., those of the form υ″<sub>*k*</sub> ⊗ υ′<sub>*j*</sub> for 1 ≤ *k* ≤ *n* and *j* > *m*<sub>*i*</sub>) are listed in some arbitrary order.

Define

$$
u^{(m\_i)}: H'' \otimes H' \to H'' \otimes H' \tag{6.179}
$$

as the unitary operator that maps the first list onto the second. We will need the explicit expression

$$
u^{(m\_i)}(\upsilon\_k'' \otimes \upsilon\_1') = \upsilon\_{s\_k^i}'' \otimes \upsilon\_{j\_k^i}',\tag{6.180}
$$

where for given *k* = 1,...,*n* the numbers *s*<sup>*i*</sup><sub>*k*</sub> = 1,...,*n*<sub>*i*</sub> (where *n*<sub>*i*</sub> is the smallest integer such that *n*<sub>*i*</sub>*m*<sub>*i*</sub> ≥ *n*) and *j*<sup>*i*</sup><sub>*k*</sub> = 1,...,*m*<sub>*i*</sub> are uniquely determined by

$$k = (s\_k^i - 1)m\_i + j\_k^i. \tag{6.181}$$
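The decomposition (6.181) is ordinary division with remainder, shifted to 1-based indices. A minimal sketch follows; the helper name `split_index` is hypothetical and only illustrates how the pair of indices is obtained from *k* and *m*<sub>*i*</sub>.

```python
def split_index(k: int, m_i: int) -> tuple[int, int]:
    """Return (s, j) with k = (s - 1) * m_i + j and 1 <= j <= m_i, cf. (6.181)."""
    s_minus_1, j_minus_1 = divmod(k - 1, m_i)
    return s_minus_1 + 1, j_minus_1 + 1

# The first n positions of the first list are sent to n distinct
# positions of the second list (here n = 10, m_i = 3).
n, m_i = 10, 3
images = [split_index(k, m_i) for k in range(1, n + 1)]
print(images)
```

Since distinct values of *k* yield distinct pairs, the assignment extends to a permutation of the basis, which is why the operator in (6.179) can be taken unitary.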

We will actually work with two copies of *H*″ ⊗ *H*′, called *H*″<sub>*A*</sub> ⊗ *H*′<sub>*A*</sub> and *H*″<sub>*B*</sub> ⊗ *H*′<sub>*B*</sub>, with ensuing copies *u*<sub>*A*</sub><sup>(*m*<sub>*i*</sub>)</sup> and *u*<sub>*B*</sub><sup>(*m*<sub>*i*</sub>)</sup> of *u*<sup>(*m*<sub>*i*</sub>)</sup>, and hence, leaving the isomorphism

$$H\_A^{\prime\prime} \otimes H\_A^{\prime} \otimes H\_B^{\prime\prime} \otimes H\_B^{\prime} \cong H\_A^{\prime\prime} \otimes H\_B^{\prime\prime} \otimes H\_A^{\prime} \otimes H\_B^{\prime} \tag{6.182}$$

implicit, we obtain a unitary operator

$$
u\_A^{(m\_i)} \otimes u\_B^{(m\_i)} : H\_A^{\prime \prime} \otimes H\_B^{\prime \prime} \otimes H\_A^{\prime} \otimes H\_B^{\prime} \to H\_A^{\prime \prime} \otimes H\_B^{\prime \prime} \otimes H\_A^{\prime} \otimes H\_B^{\prime}.\tag{6.183}
$$

The point of all this is that the unit vector

$$\kappa\_{n} \in H\_{A}^{\prime\prime} \otimes H\_{B}^{\prime\prime};\tag{6.184}$$

$$\kappa\_{n} = \frac{1}{\sqrt{C(n)}} \sum\_{k=1}^{n} \frac{1}{\sqrt{k}} \cdot \upsilon\_{k}^{\prime\prime} \otimes \upsilon\_{k}^{\prime\prime},\tag{6.185}$$

where *C*(*n*) = ∑<sup>*n*</sup><sub>*k*=1</sub> 1/*k*, acts as a "catalyst" in producing the maximally entangled state

$$
\varphi \in H\_A^\prime \otimes H\_B^\prime;\tag{6.186}
$$

$$\varphi = \frac{1}{\sqrt{m\_i}} \sum\_{j=1}^{m\_i} \upsilon'\_j \otimes \upsilon'\_j,\tag{6.187}$$

from the uncorrelated state υ′<sub>1</sub> ⊗ υ′<sub>1</sub> ∈ *H*′<sub>*A*</sub> ⊗ *H*′<sub>*B*</sub>, in that for any *m*<sub>*i*</sub> ≤ *m*,

$$
u\_A^{(m\_i)} \otimes u\_B^{(m\_i)} (\kappa\_n \otimes \upsilon\_1' \otimes \upsilon\_1') \stackrel{\varepsilon/2}{\approx} \kappa\_n \otimes \varphi.\tag{6.188}
$$

Here ε = 1/*N* if *n* = *m*2*N*. This follows straightforwardly from (6.183) - (6.187).
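To get a feeling for how slowly embezzlement converges, one can compute the overlap between the two sides of (6.188) directly. The sketch below assumes that the catalyst carries the coefficients 1/√(*k* *C*(*n*)), as the normalization *C*(*n*) = ∑ 1/*k* requires, and uses the index rule (6.181); under these assumptions the inner product reduces to a single sum. The function name `embezzlement_overlap` is illustrative only.

```python
import math

def embezzlement_overlap(n: int, m_i: int) -> float:
    """Inner product of (u_A x u_B)(kappa_n x v'_1 x v'_1) with kappa_n x phi.

    The k-th term of kappa_n (coefficient 1/sqrt(k C(n))) is moved to the
    basis vector labelled (s_k, j_k), where kappa_n x phi has coefficient
    1/sqrt(s_k C(n) m_i); summing the products over k gives the overlap.
    """
    c_n = sum(1.0 / k for k in range(1, n + 1))
    total = 0.0
    for k in range(1, n + 1):
        s_k = (k - 1) // m_i + 1          # from k = (s_k - 1) m_i + j_k
        total += 1.0 / math.sqrt(k * s_k)
    return total / (c_n * math.sqrt(m_i))

for n in (2, 16, 4096):
    print(n, embezzlement_overlap(n, 2))
```

For *m*<sub>*i*</sub> = 1 the overlap is exactly 1 (nothing needs to be produced), while for *m*<sub>*i*</sub> = 2 it approaches 1 only logarithmically in *n*, which is why *n* has to grow like a high power of *m*.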

After this preparation we are ready for the proof of step 3, continuing to use the notation established at the beginning of step 2, especially (6.172). As in step 1, we introduce two copies *HA* = *HB* = C*<sup>l</sup>* of *H*, as well as two states

$$\psi\_{AB} = \sum\_{i} c\_{i} \cdot \upsilon\_{i} \otimes \upsilon\_{i} \in H\_{A} \otimes H\_{B};\tag{6.189}$$

$$
\psi\_{AB}^{\prime\prime\prime} = \kappa\_n \otimes \upsilon\_1^{\prime} \otimes \upsilon\_1^{\prime} \otimes \psi\_{AB} \in H\_A^{\prime\prime\prime} \otimes H\_B^{\prime\prime\prime},\tag{6.190}
$$

where κ<sub>*n*</sub> is given by (6.185), we put

$$H''' = H'' \otimes H' \otimes H,\tag{6.191}$$

and in our notation we have ignored the obvious permutations of factors in the tensor product. For any ε > 0, pick *c*′<sub>*i*</sub> ∈ R<sup>+</sup> such that (*c*′<sub>*i*</sub>)<sup>2</sup> ∈ Q<sup>+</sup> and

$$|c\_i' - c\_i| < \varepsilon/\dim(H),\tag{6.192}$$

which implies that, in the sense of (6.149), we have

$$\sum\_{i} c\_{i}^{\prime} \boldsymbol{\upsilon}\_{i} \stackrel{\varepsilon/2}{\approx} \sum\_{i} c\_{i} \boldsymbol{\upsilon}\_{i}. \tag{6.193}$$

Suppose

$$c'\_i = \sqrt{p\_i/q\_i},\tag{6.194}$$

with *pi*,*qi* ∈ N and gcd(*pi*,*qi*) = 1, and define

$$m\_i = p\_i \prod\_{i' \neq i} q\_{i'}.\tag{6.195}$$

Consequently, writing

$$q = 1/\sqrt{\sum\_{i'} m\_{i'}},\tag{6.196}$$

the following quotient is independent of *i*:

$$\frac{c\_i'}{\sqrt{m\_i}} = q.\tag{6.197}$$
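The bookkeeping in (6.194)–(6.197) is easy to check with exact rational arithmetic. The sketch below assumes that the rational squares (*c*′<sub>*i*</sub>)<sup>2</sup> = *p*<sub>*i*</sub>/*q*<sub>*i*</sub> are chosen to sum to 1, which the identity (6.197) appears to require; the function name `block_sizes` and the sample coefficients are hypothetical.

```python
from fractions import Fraction
from math import prod

def block_sizes(squares):
    """Given (c'_i)^2 = p_i / q_i in lowest terms, return the integers m_i
    of (6.195) together with q^2 = 1 / sum_i m_i from (6.196)."""
    if sum(squares) != 1:
        raise ValueError("the squared coefficients are assumed to sum to 1")
    ps = [f.numerator for f in squares]
    qs = [f.denominator for f in squares]
    ms = [p * prod(qs[:i] + qs[i + 1:]) for i, p in enumerate(ps)]
    return ms, Fraction(1, sum(ms))

squares = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
ms, q_squared = block_sizes(squares)
print(ms, q_squared)                       # [18, 12, 6] 1/36
print([m * q_squared for m in ms])         # recovers the squared coefficients
```

In this example every *m*<sub>*i*</sub> *q*<sup>2</sup> equals (*c*′<sub>*i*</sub>)<sup>2</sup>, which is (6.197) squared: a state with unequal rational weights is traded for a uniform superposition over ∑ *m*<sub>*i*</sub> = 36 basis vectors, grouped in blocks of sizes *m*<sub>*i*</sub>.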

Given the integers *mi* thus obtained, we define a unitary operator

$$u : H^{\prime \prime \prime} \to H^{\prime \prime \prime}; \tag{6.198}$$

$$u = \sum\_{i=1}^{l} u^{(m\_i)} \otimes |\upsilon\_i\rangle\langle\upsilon\_i|,\tag{6.199}$$

where *u*<sup>(*m*<sub>*i*</sub>)</sup> is defined in (6.180). From this definition, with additional labels to denote the copies *u*<sub>*A*</sub> : *H*‴<sub>*A*</sub> → *H*‴<sub>*A*</sub> and *u*<sub>*B*</sub> : *H*‴<sub>*B*</sub> → *H*‴<sub>*B*</sub>, and (6.188), and writing

$$\xi^{ij} = \upsilon\_i \otimes \upsilon'\_j \in H \otimes H',\tag{6.200}$$

with corresponding copies

$$
\xi\_{AA'}^{ij} \in H\_A \otimes H\_A';\tag{6.201}
$$

$$
\xi\_{BB'}^{ij} \in H\_B \otimes H\_B',\tag{6.202}
$$

we then obtain the important relations

$$1\_{H\_A^{\prime\prime\prime}} \otimes 1\_{H\_B^{\prime\prime\prime}}(\psi\_{AB}^{\prime\prime\prime}) = \kappa\_n \otimes \sum\_{i=1}^l c\_i \cdot \xi\_{AA'}^{i1} \otimes \xi\_{BB'}^{i1};\tag{6.203}$$

$$u\_A \otimes 1\_{H\_B^{\prime\prime\prime}}(\psi\_{AB}^{\prime\prime\prime}) = \frac{1}{\sqrt{C(n)}} \sum\_{i=1}^l \sum\_{k=1}^n \frac{c\_i}{\sqrt{k}} \cdot \upsilon\_{s\_k^i}^{\prime\prime} \otimes \upsilon\_k^{\prime\prime} \otimes \xi\_{AA'}^{ij\_k^i} \otimes \xi\_{BB'}^{i1};\tag{6.204}$$

$$1\_{H\_A^{\prime\prime\prime}} \otimes u\_B(\psi\_{AB}^{\prime\prime\prime}) = \frac{1}{\sqrt{C(n)}} \sum\_{i=1}^l \sum\_{k=1}^n \frac{c\_i}{\sqrt{k}} \cdot \upsilon\_k^{\prime\prime} \otimes \upsilon\_{s\_k^i}^{\prime\prime} \otimes \xi\_{AA'}^{i1} \otimes \xi\_{BB'}^{ij\_k^i};\tag{6.205}$$

$$
u\_A \otimes u\_B(\psi\_{AB}^{\prime\prime\prime}) \stackrel{\varepsilon}{\approx} q \cdot \kappa\_n \otimes \sum\_{i=1}^l \sum\_{j\_i=1}^{m\_i} \xi\_{AA'}^{ij\_i} \otimes \xi\_{BB'}^{ij\_i}.\tag{6.206}
$$

Here the right-hand sides of (6.203) - (6.206) have been arranged so as to obtain vectors in the six-fold tensor product

$$H\_A'' \otimes H\_B'' \otimes H\_A \otimes H\_A' \otimes H\_B \otimes H\_B'.$$

We will repeatedly invoke the following lemma, whose proof just unfolds the notation (on the appropriate identification of *a* with *a*⊗1*H*<sup>2</sup> and of *b* with 1*H*<sup>1</sup> ⊗*b*).

Lemma 6.22. *Assume* PI *and* UI*. For any pair of unitary operators u*<sup>1</sup> *on H*<sup>1</sup> *and u*<sup>2</sup> *on H*2*, and any unit vector* ψ ∈ *H*<sup>1</sup> ⊗*H*2*, one has*

$$P\_{(u\_1 \otimes 1\_{H\_2})\psi}(b=\gamma|\mathbf{x}) = P\_{\psi}(b=\gamma|\mathbf{x});\tag{6.207}$$

$$P\_{(1\_{H\_1}\otimes u\_2)\psi}(a=\lambda|\mathbf{x}) = P\_{\psi}(a=\lambda|\mathbf{x}).\tag{6.208}$$

Since we assume that *a* is nondegenerate, there is a bijective correspondence between its eigenvalues *a* = λ*<sup>i</sup>* and its eigenvectors υ*i*. Instead of *P*(*a* = λ*i*) dressed with whatever parameters *x* or ψ, we may then write *P*(υ*i*), where *a* is understood, and analogously for the more complicated operators on tensor products of Hilbert space appearing below. Repeatedly using Lemma 6.22, we proceed as follows.

• From Step 2, using the notation explained below (6.172),

$$P\_{q \cdot \sum\_{i=1}^{l} \sum\_{j\_i=1}^{m\_i} \xi\_{BB'}^{ij\_i}}(\xi\_{BB'}^{ij} | \mathbf{x}) = q^2. \tag{6.209}$$

• From (6.156) in *SE* and (6.209),

$$P\_{q\cdot\sum\_{i,j\_i}\xi^{ij\_i}\_{AA'}\otimes\xi^{ij\_i}\_{BB'}}(\xi^{ij}\_{BB'}|\mathbf{x}) = q^2. \tag{6.210}$$

• From (6.155) in *PE* and (6.210),

$$P\_{q \cdot \kappa\_n \otimes \sum\_{i,j\_i} \xi\_{AA'}^{ij\_i} \otimes \xi\_{BB'}^{ij\_i}}(\xi\_{BB'}^{ij}|\mathbf{x}) = q^2. \tag{6.211}$$

• From (6.211), *CP* (whose notation we use), and (6.206),

$$P\_{(u\_A \otimes u\_B)\psi\_{AB}^{\prime\prime\prime}}(\xi\_{BB'}^{ij} | \mathbf{x}) \stackrel{\varepsilon}{\approx} q^2. \tag{6.212}$$

• Recall the number *m* (satisfying *m* ≥ *m*<sub>*i*</sub> for all *i*). From (6.212) and Lemma 6.22,

$$P\_{(1\_{H\_A^{\prime\prime\prime}} \otimes u\_B)\psi\_{AB}^{\prime\prime\prime}}(\xi\_{BB'}^{ij\_i}|\mathbf{x}) \stackrel{\varepsilon}{\approx} q^2 \ (j\_i = 1, \dots, m\_i);$$

$$P\_{(1\_{H\_A^{\prime\prime\prime}} \otimes u\_B)\psi\_{AB}^{\prime\prime\prime}}(\xi\_{BB'}^{ij\_i}|\mathbf{x}) \stackrel{\varepsilon}{\approx} 0 \ (j\_i = m\_i + 1, \dots, m). \tag{6.213}$$

We now start a different line of argument, to be combined with (6.213) in due course.

• From *PE*, *SE*, and (6.172), with υ<sup>*i*</sup><sub>*A*</sub> ∈ *H*<sub>*A*</sub> denoting υ<sub>*i*</sub> ∈ *H*, we have

$$P\_{\psi}(a = \lambda\_{i}|\mathbf{x}) \equiv P\_{\psi}(\upsilon\_{i}|\mathbf{x}) = P\_{\kappa\_{n} \otimes \sum\_{i} c\_{i} \cdot \xi\_{AA'}^{i1} \otimes \xi\_{BB'}^{i1}}(\upsilon\_{A}^{i}|\mathbf{x}). \tag{6.214}$$

• Using Lemma 6.22, (6.203), and (6.204),

$$P\_{\kappa\_{n}\otimes\sum\_{i}c\_{i}\cdot\xi^{i1}\_{AA'}\otimes\xi^{i1}\_{BB'}}(\upsilon^{i}\_{A}|\mathbf{x}) = P\_{(1\_{H\_A^{\prime\prime\prime}}\otimes u\_B)\psi^{\prime\prime\prime}\_{AB}}(\upsilon^{i}\_{A}|\mathbf{x}),\tag{6.215}$$

and hence

$$P\_{\psi}(a = \lambda\_i | \mathbf{x}) = P\_{(1\_{H\_A^{\prime\prime\prime}} \otimes u\_B) \psi\_{AB}^{\prime\prime\prime}}(\upsilon\_A^i | \mathbf{x}). \tag{6.216}$$

• From quantum mechanics, notably (6.151), and (6.205), for any *i*′ ≠ *i* we have

$$P\_{(1\_{H\_A^{\prime\prime\prime}} \otimes u\_B)\psi\_{AB}^{\prime\prime\prime}}(\upsilon\_A^{i'} \otimes \xi\_{BB'}^{ij\_i}) = 0.\tag{6.217}$$

• From *CQ* and (6.217), for any *i*′ ≠ *i*,

$$P\_{(1\_{H\_A^{\prime\prime\prime}} \otimes u\_B)\psi\_{AB}^{\prime\prime\prime}}(\upsilon\_A^{i'}, \xi\_{BB'}^{i j\_i}|\mathbf{x}) = 0.\tag{6.218}$$

• From *PI*,

$$P(\upsilon\_A^{i'}|\mathbf{x}) = \sum\_{i,j\_i} P(\upsilon\_A^{i'}, \xi\_{BB'}^{ij\_i}|\mathbf{x});\tag{6.219}$$

$$P(\xi\_{BB'}^{ij}|\mathbf{x}) = \sum\_{i'} P(\upsilon\_A^{i'}, \xi\_{BB'}^{ij}|\mathbf{x}).\tag{6.220}$$

• From (6.218), (6.219), and (6.220),

$$P\_{(1\_{H\_A^{\prime\prime\prime}} \otimes u\_B)\psi\_{AB}^{\prime\prime\prime}}(\upsilon\_A^i|\mathbf{x}) = \sum\_{j\_i} P\_{(1\_{H\_A^{\prime\prime\prime}} \otimes u\_B)\psi\_{AB}^{\prime\prime\prime}}(\xi\_{BB'}^{ij\_i}|\mathbf{x}).\tag{6.221}$$

Finally, from (6.214), (6.221), (6.213), and (6.197) we obtain

$$P\_{\psi}(a=\lambda\_i|\mathbf{x}) \stackrel{\varepsilon}{\approx} \sum\_{j\_i=1}^{m\_i} q^2 = m\_i \cdot q^2 = (c\_i')^2. \tag{6.222}$$

Since *c*<sub>*i*</sub> > 0 we have *c*<sub>*i*</sub><sup>2</sup> = |*c*<sub>*i*</sub>|<sup>2</sup>; using (6.192) and letting ε → 0 then proves step 3:

$$P\_{\Psi}(a=\lambda\_{i}|\mathbf{x}) = |c\_{i}|^{2} = p\_{\Psi}(a=\lambda\_{i}).\tag{6.223}$$

Finally, we remove our standing assumption that the spectrum of *a* be nondegenerate. In the degenerate case one has

$$p\_{\psi}(a=\lambda\_{i}) = \sum\_{j\_{i}} p\_{\psi}(\upsilon\_{j\_{i}}),\tag{6.224}$$

where the sum is over any orthonormal basis (υ<sub>*j*<sub>*i*</sub></sub>)<sub>*j*<sub>*i*</sub></sub> of the eigenspace of λ<sub>*i*</sub>. Similarly, since each vector υ<sub>*j*<sub>*i*</sub></sub> gives *a* = λ<sub>*i*</sub>, probability theory gives for all *x*,

$$P(a = \lambda\_i | \mathbf{x}) = \sum\_{j\_i} P(\upsilon\_{j\_i} | \mathbf{x}). \tag{6.225}$$

The nondegenerate case of the theorem (which distinguishes the states υ<sub>*j*<sub>*i*</sub></sub>) yields

$$P\_{\psi}(\upsilon\_{j\_i}|\mathbf{x}) = p\_{\psi}(\upsilon\_{j\_i}),\tag{6.226}$$

from which (6.157) follows once again:

$$P\_{\psi}(a=\lambda\_i|\mathbf{x}) = \sum\_{j\_i} P\_{\psi}(\upsilon\_{j\_i}|\mathbf{x}) = \sum\_{j\_i} p\_{\psi}(\upsilon\_{j\_i}) = p\_{\psi}(a=\lambda\_i).$$

Our proof of the Colbeck–Renner Theorem is now complete. □

Under less stringent assumptions this theorem might have been regarded as the conclusion of von Neumann's program to disprove the possibility of completing quantum mechanics by adding hidden variables, but as yet this seems unwarranted.

#### Notes

## §6.1. From von Neumann to Kochen–Specker

'For decades nobody spoke up against von Neumann's arguments, and his conclusions were quoted by some as the gospel'. (Belinfante, 1973, p. 24)

Theorem 6.2 is due to von Neumann (1932, §IV.2); it was the first result to impose useful constraints on hidden variable theories, anticipating all later literature on the subject. Unfortunately (as part of their general anti-Copenhagen rhetoric), Bell and his followers left the realm of decent academic discourse by calling von Neumann's arguments against hidden variables 'silly' and 'foolish', through which they merely displayed the depth of their own misunderstanding of von Neumann's reasoning; see Caruana (1995), Bub (2011a), and especially Dieks (2016b). In fact, von Neumann (1932, p. 172) carefully qualifies his Theorem 6.2 by stating that it follows '*im Rahmen unserer Bedingungen*' (i.e. '*given our assumptions*'), of which he earlier (on p. 164) admits that linearity is physically reasonable only for *commuting* operators, but nonetheless justifies this assumption through an ensemble argument (now outdated, but by no means 'silly'). Though couched in agreeable academic parlance, the earlier critique by Hermann (1935) was misguided, too (Dieks, 2016b).

The Kochen–Specker Theorem is due to Kochen & Specker (1967); the authors were originally logicians. A similar but less precise statement had appeared earlier in Bell (1966), who was not cited by Kochen and Specker; some authors refer to the *Bell–Kochen–Specker Theorem*. The *Nature* assumption has been experimentally verified, cf. Huang et al (2003). The proof of the fundamental Lemma 6.7 we present is essentially due to Kochen and Specker, as simplified by Peres (1995). Our independent proof for C<sup>4</sup> is taken from Cabello et al (1996). Surveys of various proofs are given by Brown (1992) and Gould (2009); see also Waegell & Aravind (2012) and references therein, as well as Bub (1997) for another proof. From the Netherlands, we cannot fail to mention the short proof by Gill & Keane (1996). For geometric aspects (and even a link with M.C. Escher) see Zimba & Penrose (1993).

One finds two opposite directions of research around the Kochen–Specker Theorem. A computational one, which seems hardly relevant to conceptual issues in physics (the goal rather being *The Guinness Book of Records*), consists of attempts to find a *minimal* set of vectors that *cannot* be coloured. See, for example, Pavicic et al (2005) for arbitrary dimension and Arends (2009) and Uijlen & Westerbaan (2015) for R<sup>3</sup>, the latter paper showing that at least 22 vectors are needed.

The other, which is of significant conceptual importance and hence is worth some more extensive discussion, consists of attempts to find a *maximal* set of vectors that *can* be coloured. That is, one looks for large (preferably dense and measurable) subsets *S*<sup>2</sup><sub>*c*</sub> of *S*<sup>2</sup> for which there exists a function *Ṽ* : *S*<sup>2</sup><sub>*c*</sub> → {0,1} that satisfies:


The first result in this direction was obtained by Meyer (1999) and Havlicek et al (2001), who showed that one may take *S*<sup>2</sup><sub>*c*</sub> = *S*<sup>2</sup> ∩ Q<sup>3</sup>; this choice was motivated by invoking finite precision arguments to circumvent the Kochen–Specker Theorem, see below. To write down a suitable function *Ṽ* : *S*<sup>2</sup> ∩ Q<sup>3</sup> → {0,1}, we first define an auxiliary function *S* : *S*<sup>2</sup> ∩ Q<sup>3</sup> → Z by

$$S\left(\frac{n\_1}{m\_1}, \frac{n\_2}{m\_2}, \frac{n\_3}{m\_3}\right) = \frac{n\_3}{m\_3} \cdot \frac{\text{lcm}(m\_1, m\_2, m\_3)}{\text{gcd}(n\_1, n\_2, n\_3)},\tag{6.227}$$

where lcm is the *least common multiple* and gcd is the *greatest common divisor* of the argument. This function is obviously well defined. Then the following works:

$$\tilde{V}(\mathbf{x}, \mathbf{y}, \mathbf{z}) = \mathbf{0} \text{ if } \mathbf{S}(\mathbf{x}, \mathbf{y}, \mathbf{z}) \text{ is odd};\tag{6.228}$$

$$\tilde{V}(\mathbf{x}, \mathbf{y}, \mathbf{z}) = 1 \text{ if } S(\mathbf{x}, \mathbf{y}, \mathbf{z}) \text{ is even.} \tag{6.229}$$
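The rule (6.227)–(6.229) can be tried out on explicit rational unit vectors. The sketch below is only an illustration: the function names `S` and `V` mirror the symbols in the text, the two sample triples are hypothetical choices, and the code relies on `math.gcd` and `math.lcm` accepting several arguments (Python 3.9 or later).

```python
from fractions import Fraction
from math import gcd, lcm

def S(x: Fraction, y: Fraction, z: Fraction) -> int:
    """Auxiliary function (6.227): the third entry of the integer vector
    obtained by clearing denominators and dividing out common factors."""
    common_den = lcm(x.denominator, y.denominator, z.denominator)
    common_num = gcd(x.numerator, y.numerator, z.numerator)
    value = z * common_den / common_num
    assert value.denominator == 1, "S is integer-valued"
    return int(value)

def V(x: Fraction, y: Fraction, z: Fraction) -> int:
    """Coloring (6.228)-(6.229): 0 if S is odd, 1 if S is even."""
    return 0 if S(x, y, z) % 2 else 1

F = Fraction
triples = [
    [(F(1, 3), F(2, 3), F(2, 3)), (F(2, 3), F(1, 3), F(-2, 3)), (F(2, 3), F(-2, 3), F(1, 3))],
    [(F(3, 5), F(4, 5), F(0)), (F(4, 5), F(-3, 5), F(0)), (F(0), F(0), F(1))],
]
for triple in triples:
    print([V(*v) for v in triple])
```

Both sample triples are orthonormal bases of R<sup>3</sup> with rational entries, and in each of them exactly one vector receives the value 0 and the other two the value 1.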

More generally, for an arbitrary (*n*-dimensional) Hilbert space *H*, with *n* < ∞, Clifton & Kent (2000) proved the existence of a countable dense colorable subset P<sub>1</sub>(*H*)<sub>*c*</sub> of P<sub>1</sub>(*H*) (cf. Definition 6.9), with the additional property that different resolutions of the identity drawn from P<sub>1</sub>(*H*)<sub>*c*</sub> never share a projection (so that the key strategy in the proof of Lemma 6.7, which is based on the existence of overlapping bases, falls apart). Given some enumeration (*e*<sup>(1)</sup><sub>*i*</sub>), (*e*<sup>(2)</sup><sub>*i*</sub>), ... of the countable set of all resolutions of the identity drawn from P<sub>1</sub>(*H*)<sub>*c*</sub>, so that each (*e*<sup>(*k*)</sup><sub>1</sub>,..., *e*<sup>(*k*)</sup><sub>*n*</sub>) is a basis of *H*, *k* ∈ N, each possible coloring *W* = *W*<sub>*f*</sub> bijectively corresponds to some function *f* : N → {1,...,*n*} through

$$W\_f(e) = 1 \text{ if } e = e\_{f(k)}^{(k)};\tag{6.230}$$

$$W\_f(e) = 0 \text{ otherwise.}\tag{6.231}$$

Note that because of the total incompatibility of the projections, each *e* ∈ P<sub>1</sub>(*H*)<sub>*c*</sub> belongs to a unique resolution (*e*<sup>(*k*)</sup><sub>*i*</sub>), so that *W*<sub>*f*</sub> is well defined. The statistical predictions of quantum mechanics may then be recovered as follows. For each density operator ρ ∈ D(*H*) we may define a probability measure μ<sub>ρ</sub> on the set <u>*n*</u><sup>N</sup> of all functions *f* : N → {1,...,*n*} by imposing the conditions

$$\mu\_{\rho}\left(\left\{f\in\underline{n}^{\mathbb{N}}\mid W\_{f}(e\_{i}^{(k)})=\lambda\_{i}^{(k)}\,\forall i=1,\ldots,n, k\in K\right\}\right)=\prod\_{k\in K}\text{Tr}\left(\rho\prod\_{i=1}^{n}[e\_{i}^{(k)}=\lambda\_{i}^{(k)}]\right),\tag{6.232}$$

where λ<sup>(*k*)</sup><sub>*i*</sub> ∈ {0,1}, *K* ⊂ N is finite, and [*e*<sup>(*k*)</sup><sub>*i*</sub> = λ<sup>(*k*)</sup><sub>*i*</sub>] is the projection onto the corresponding eigenspace of the projection *e*<sup>(*k*)</sup><sub>*i*</sub> (more generally, for *a* ∈ *B*(*H*)<sub>sa</sub> we write [*a* = λ] for the spectral projection *e*<sub>λ</sub> defined by *a* and λ ∈ σ(*a*)). The subset of <u>*n*</u><sup>N</sup> in the argument of μ<sub>ρ</sub> is hereby declared measurable; existence and uniqueness of the measure μ<sub>ρ</sub> on a suitable σ-algebra follow from the Kolmogorov extension theorem of measure theory, which applies because the marginals (6.232) satisfy the appropriate consistency conditions, cf. Hermens (2009) for details.

This formula guarantees that the left-hand side vanishes if λ<sup>(*k*)</sup><sub>*i*</sub> = 0 for each *i*, and also if λ<sup>(*k*)</sup><sub>*i*</sub> = 1 for more than one value of *i*. If *K* = {*k*<sub>0</sub>} is a singleton and λ = (λ<sub>1</sub>,...,λ<sub>*n*</sub>), then the right-hand side (and hence the left-hand side) is the Born probability for the outcome *e*<sup>(*k*<sub>0</sub>)</sup><sub>*i*</sub> = λ<sub>*i*</sub> for each *i*, i.e.,

$$\mu\_{\rho}\left(\{f \in \underline{n}^{\mathbb{N}} \mid W\_f(e\_i^{(k\_0)}) = \lambda\_i \,\forall i = 1, \dots, n\}\right) = \mathrm{Tr}\left(\rho\prod\_{i=1}^n [e\_i^{(k\_0)} = \lambda\_i]\right). \tag{6.233}$$

Consequently, it is true by construction that for any admissible measurement in quantum mechanics (in that all observables commute), i.e., for each *k*<sub>0</sub> ∈ N, averaging over the 'hidden variable' *f* ∈ <u>*n*</u><sup>N</sup> reproduces the statistical predictions of quantum mechanics. This success is achieved at a high cost, however:


These facts were noted by Clifton & Kent themselves, and Appleby (2005) proved that they are a necessary feature of all constructions that involve sufficiently large subsets of P1(*H*) that can be colored.

Without challenging their mathematical significance, these discontinuities undermine any potential physical relevance such models might have, and this in turn challenges the reason such models were introduced in the first place (Meyer, 1999), namely the (alleged) *finite precision loophole* of the Kochen–Specker Theorem.

The thrust of this loophole is that it would be an illusion for an experimentalist like Alice to claim that she measures some observable *a* with infinite accuracy; in fact, given ε > 0 she might equally well measure some *a*′ with ‖*a* − *a*′‖ < ε. Consequently, finding a dense colorable subset P<sub>1</sub>(*H*)<sub>*c*</sub> ⊂ P<sub>1</sub>(*H*) should suffice for a hidden variable interpretation of quantum mechanics, since if Alice believes she measures some projection *e*, the model assigns a value *W*(*e*′) to the projection *e*′ ∈ P<sub>1</sub>(*H*)<sub>*c*</sub> she actually measures (where *e*′ is selected by some algorithm that is part of the theory itself, cf. Clifton & Kent (2000)), and presents that value to Alice as the outcome of her measurement. However, owing to the discontinuities just mentioned, this value is as arbitrary as the identification of *e*′.

As emphasized by Barrett & Kent (2004), this arbitrariness, although perhaps undesirable, does not by itself affect the ability of the Clifton–Kent model to reproduce the statistical predictions of quantum mechanics. On the other hand, it would be pretty awkward to have a theory whose individual value attributions are completely arbitrary, especially since the finite precision argument is predicated on the idea that observables close to the one Alice believes herself to measure (i.e., *e*) should have approximately the same value as the one she actually does measure (namely, *e*′). If this is not the case, her measurements are pointless and the hidden variable *W*<sub>*f*</sub> would be empirically inaccessible and hence truly "hidden" (Appleby, 2005).

See also Hermens (2009, 2016). This last point applies to Corollary 6.12, which would no longer be true if the set *X*<sub>*A*</sub> of all bases of R<sup>3</sup> in Definition 6.11 were replaced by some subset *X*<sup>*c*</sup><sub>*A*</sub> ⊂ *X*<sub>*A*</sub> drawn from a colorable subset *S*<sup>2</sup><sub>*c*</sub> of *S*<sup>2</sup>. Each *z* ∈ *X*<sub>*Z*</sub> would then correspond to some coloring u ↦ *F̃*(u,*z*) of *S*<sup>2</sup><sub>*c*</sub>, which, by the above discussion, would be maximally discontinuous and hence empirically inaccessible. Nonetheless, such a theory does exist in principle.

The aim of maximizing colorable sets was pursued in a different direction by Bub & Clifton (1996); see also Bub (1997). Given a "preferred" observable *a* ∈ *B*(*H*)<sub>sa</sub> and a pure state *e* ∈ P<sub>1</sub>(*H*), these authors look for a maximal sublattice P(*e*,*a*) of P(*H*) that contains all spectral projections of *a* (but, despite the notation P(*e*,*a*), does not necessarily contain *e*!), admits sufficiently many lattice homomorphisms *h* : P(*e*,*a*) → {0,1} (i.e., binary valuations) such that the Born measure μ<sub>*e*</sub> on σ(*a*), i.e., μ<sub>*e*</sub>(Δ) = Tr(*e* *e*<sub>Δ</sub>), Δ ⊆ σ(*a*), can be reproduced by averaging over these homomorphisms, and finally is invariant under all unitary isomorphisms of P(*H*) that commute with both *e* and *a*. Equivalently, one wants a maximal C\*-subalgebra *A*(*a*, *e*) of *B*(*H*) that contains *a*, admits sufficiently many dispersion-free states so as to reproduce the Born probabilities defined by *a* in the given state *e*, and is invariant in the said way (a fourth condition used by Bub and Clifton is superfluous; see Bub, 1997, p. 128). Assuming for simplicity that *n* = dim(*H*) < ∞, the answer is

$$A(a,e) = C^\*(e\_\lambda e e\_\lambda, \lambda \in \sigma(a))'\tag{6.234}$$

where, as always, *e*<sub>λ</sub> is the projection onto the eigenspace *H*<sub>λ</sub> for λ ∈ σ(*a*), and the prime denotes the commutant (one might as well take the commutant of the *set* of all *e*<sub>λ</sub>*e e*<sub>λ</sub>). Equivalently, putting *e* = *e*<sub>ψ</sub> = |ψ⟩⟨ψ|, eq. (6.234) is the C\*-algebra generated by all projections *f*<sub>λ</sub> onto the nonzero components *e*<sub>λ</sub>ψ of ψ in each *H*<sub>λ</sub> and all one-dimensional projections that are orthogonal to all *f*<sub>λ</sub> (given that dim(*H*) < ∞, this is the same as the linear span of these projections). Thus *A*(*a*, *e*) always contains *C*∗(*a*) (since it contains each *e*<sub>λ</sub>, λ ∈ σ(*a*)), but note that *A*(*a*, *e*) need not be commutative. In comparison, if the requirement had been the reproduction of all Born probabilities for *arbitrary* pure states *e* rather than for some *given e*, the answer would have been any maximal abelian C\*-algebra in *B*(*H*) that contains *C*∗(*a*); if *a* has non-degenerate spectrum, this is just *C*∗(*a*) itself. The simplest possibility is

$$A(1\_H, e) = C^\*(e)' = \{e\}',\tag{6.235}$$

which is the linear span of all projections *f* ∈ P(*H*) for which either *e* ≤ *f* or *e* ≤ 1<sub>*H*</sub> − *f* (i.e., if *e* = *e*<sub>ψ</sub>, then either ψ ∈ *f H* or ψ ∈ (*f H*)<sup>⊥</sup>). In other words, we have *a* ∈ *A*(1<sub>*H*</sub>, *e*) iff ψ is an eigenvector of *a* (i.e. the eigenvector-eigenvalue link).
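The eigenvector criterion for membership of {*e*}′ can be checked numerically. The following is a minimal sketch in plain Python, assuming a real three-dimensional Hilbert space; the vector `psi` and the matrices `a_in` and `a_out` are hypothetical examples chosen for illustration, not taken from the text.

```python
import math

def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator_size(x, y):
    """Largest absolute entry of xy - yx (zero iff x and y commute)."""
    xy, yx = matmul(x, y), matmul(y, x)
    n = len(x)
    return max(abs(xy[i][j] - yx[i][j]) for i in range(n) for j in range(n))

def projection(psi):
    """e_psi = |psi><psi| for a real unit vector psi."""
    return [[s * t for t in psi] for s in psi]

# Hypothetical unit vector and observables.
psi = [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0]
e = projection(psi)

# psi is an eigenvector of a_in (eigenvalue 3) but not of a_out.
a_in = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 5.0]]
a_out = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 5.0]]

print(commutator_size(a_in, e))   # numerically zero: a_in commutes with e
print(commutator_size(a_out, e))  # nonzero: a_out does not commute with e
```

Only the commutant condition of (6.235) is tested here; the sketch does not construct the lattice P(*e*,*a*) itself.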

Each dispersion-free state on *A*(*a*, *e*), or, equivalently, each homomorphism *h*<sub>λ</sub> : P(*e*,*a*) → {0,1}, corresponds to one of the projections *f*<sub>λ</sub> through *h*<sub>λ</sub>(*f*<sub>λ</sub>) = 1 and *h*<sub>λ</sub>(*f*) = 0 for all other one-dimensional projections *f* in P(*e*,*a*). The Born probabilities from *e* are then recovered by assigning (Born) measure Tr(*e f*<sub>λ</sub>) to *h*<sub>λ</sub>.

Though interesting, this result mainly supports so-called modal interpretations of quantum mechanics, which we reject, since they tell us nothing physical about the measurement process and address the measurement problem only philosophically.

#### §6.2. The Free Will Theorem

The Free Will Theorem was published in two versions by Conway & Kochen (2006, 2009). Analogous results had previously been published by Heywood & Redhead (1983), Stairs (1983), Brown & Svetlichny (1990), and Clifton (1993), of which only the first paper was cited by Conway and Kochen. Moreover, the close relationship to Bell's (1964) Theorem might well be insisted on as a topic that should have been discussed in the original papers. Other critical literature (making the points listed in the preamble to this chapter) includes Bassi & Ghirardi (2007), 't Hooft (2007), Goldstein et al (2010), Wüthrich (2011), Hemmick & Shakur (2012), Cator & Landsman (2014), Hermens (2014, 2015), and Walleczek (2016).

The original (Strong) Free Will Theorem (FWT) states that three assumptions, called SPIN, TWIN, and MIN, imply that the response of a spin-one particle to the bipartite experiment with spin-one particles described above 'is not a function of properties of that part of the universe that is earlier than this response (. . . ).' Here SPIN and TWIN are the first and second half of our *Nature* axiom, whilst MIN expresses a form of context-locality as well as the loose assumption that Alice and Bob may 'freely choose' their settings *a* and *b*, respectively. Accordingly, in our notation, Conway and Kochen only use the parameter space *Z*, rather than the full space *X* we need in order to consistently axiomatize determinism. Their formulation contains an implicit assumption of determinism, whose precise nature only becomes clear from their proof, and which is akin to our formulation, except for the crucial difference that the function they allude to only acts on the particle variables and not on the settings of the experiment (of which, as already noted, Conway and Kochen just say that the experimenters can 'freely choose' them).

Conway and Kochen paraphrase their theorem as follows:

'if indeed we humans have free will, then elementary particles already have their own small share of this valuable commodity. More precisely, if the experimenter can freely choose the directions in which to orient his apparatus in a certain measurement, then the particles response (to be pedantic—the universe's response near the particle) is not determined by the entire previous history of the universe. (. . . ) our theorem asserts that if experimenters have a certain freedom, then particles have exactly the same kind of freedom. Indeed, it is natural to suppose that this latter freedom is the ultimate explanation of our own. (...) Granted our three axioms [i.e., the physical ones and freedom of choice], the Free Will Theorem shows that nature itself is nondeterministic.'

However, such far-reaching conclusions seem unwarranted by the actual technical content of the theorem. Indeed, though it is also assumed in Bell's first theorem (see §6.5 below), the conjunction of *Determinism* and *Freedom* is *a priori* uncomfortable, especially since the main novelty of the FWT lies in the emphasis Conway and Kochen (unlike Bell) put on free will. The authors acknowledge at least this point already on the first page of their first paper (Conway & Kochen, 2006), in which they anticipate criticism of the kind:

"'I saw you put the fish in!" said a simpleton to an angler who had used a minnow to catch a bass.'

Indeed, also after more serious philosophical analysis, it has been concluded that:

'Their [Conway & Kochen's] case against determinism thus has all the virtues of theft over honest toil. It is truly indeterminism in, indeterminism out.' (Wüthrich, 2011)

Our formulation of the FWT, in which the original allusion to undefined free will in allowing arbitrary settings of the experiment has been replaced by complete determinism including the settings, avoids this criticism.

To derive (6.35) - (6.38), we use (6.21) to write down the formulae

$$\begin{split}P\_{\boldsymbol{\Psi}\_{0}}(F\_{i}=1,G\_{j}=1|A=a,B=b) &= \langle \boldsymbol{\Psi}\_{0},(1\_{3}-|\mathbf{u}\_{i}\rangle\langle\mathbf{u}\_{i}|)\otimes(1\_{3}-|\mathbf{v}\_{j}\rangle\langle\mathbf{v}\_{j}|)\boldsymbol{\Psi}\_{0}\rangle; \\ P\_{\boldsymbol{\Psi}\_{0}}(F\_{i}=0,G\_{j}=0|A=a,B=b) &= \langle \boldsymbol{\Psi}\_{0},|\mathbf{u}\_{i}\rangle\langle\mathbf{u}\_{i}|\otimes|\mathbf{v}\_{j}\rangle\langle\mathbf{v}\_{j}|\boldsymbol{\Psi}\_{0}\rangle; \\ P\_{\boldsymbol{\Psi}\_{0}}(F\_{i}=1,G\_{j}=0|A=a,B=b) &= \langle \boldsymbol{\Psi}\_{0},(1\_{3}-|\mathbf{u}\_{i}\rangle\langle\mathbf{u}\_{i}|)\otimes|\mathbf{v}\_{j}\rangle\langle\mathbf{v}\_{j}|\boldsymbol{\Psi}\_{0}\rangle; \\ P\_{\boldsymbol{\Psi}\_{0}}(F\_{i}=0,G\_{j}=1|A=a,B=b) &= \langle \boldsymbol{\Psi}\_{0},|\mathbf{u}\_{i}\rangle\langle\mathbf{u}\_{i}|\otimes(1\_{3}-|\mathbf{v}\_{j}\rangle\langle\mathbf{v}\_{j}|)\boldsymbol{\Psi}\_{0}\rangle. \end{split}$$

For example, for any pair of unit vectors u,v we have

$$
\begin{split} \langle\Psi\_{0},|\mathbf{u}\rangle\langle\mathbf{u}|\otimes|\mathbf{v}\rangle\langle\mathbf{v}|\Psi\_{0}\rangle &= \frac{1}{3}\langle\mathbf{e}\_{1}\otimes\mathbf{e}\_{1} + \mathbf{e}\_{2}\otimes\mathbf{e}\_{2} + \mathbf{e}\_{3}\otimes\mathbf{e}\_{3}, |\mathbf{u}\rangle\langle\mathbf{u}|\otimes|\mathbf{v}\rangle\langle\mathbf{v}|(\mathbf{e}\_{1}\otimes\mathbf{e}\_{1} + \mathbf{e}\_{2}\otimes\mathbf{e}\_{2} + \mathbf{e}\_{3}\otimes\mathbf{e}\_{3})\rangle \\ &= \frac{1}{3}\langle\mathbf{e}\_{1}\otimes\mathbf{e}\_{1} + \mathbf{e}\_{2}\otimes\mathbf{e}\_{2} + \mathbf{e}\_{3}\otimes\mathbf{e}\_{3}, \langle\mathbf{u},\mathbf{v}\rangle\mathbf{u}\otimes\mathbf{v}\rangle \\ &= \frac{1}{3}\langle\mathbf{u},\mathbf{v}\rangle^{2}, \end{split}
$$

which gives (6.36). The other cases are similar.
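The four Born probabilities above can also be checked numerically. The following is a minimal sketch in plain Python, assuming real unit vectors and the correlated state Ψ<sub>0</sub> = (e<sub>1</sub>⊗e<sub>1</sub> + e<sub>2</sub>⊗e<sub>2</sub> + e<sub>3</sub>⊗e<sub>3</sub>)/√3 used in the computation; the helper names and the sample vectors `u`, `v` are our own.

```python
import math

def dot(u, v):
    return sum(s * t for s, t in zip(u, v))

def normalize(u):
    n = math.sqrt(dot(u, u))
    return [s / n for s in u]

def projection(u):
    """|u><u| for a real unit vector u."""
    return [[s * t for t in u] for s in u]

def complement(p):
    """1 - p for a square matrix p."""
    n = len(p)
    return [[(1.0 if i == j else 0.0) - p[i][j] for j in range(n)]
            for i in range(n)]

def expectation(p, q, c):
    """<Psi, (p (x) q) Psi> for Psi = sum_ij c[i][j] e_i (x) e_j, c real."""
    n = len(c)
    return sum(c[i][j] * p[i][k] * q[j][l] * c[k][l]
               for i in range(n) for j in range(n)
               for k in range(n) for l in range(n))

# Coefficient matrix of Psi_0.
PSI0 = [[1 / math.sqrt(3) if i == j else 0.0 for j in range(3)]
        for i in range(3)]

def joint_probabilities(u, v):
    """Keys are (F_i, G_j); outcome 0 corresponds to the projection |u><u|."""
    eu, ev = projection(u), projection(v)
    return {
        (1, 1): expectation(complement(eu), complement(ev), PSI0),
        (0, 0): expectation(eu, ev, PSI0),
        (1, 0): expectation(complement(eu), ev, PSI0),
        (0, 1): expectation(eu, complement(ev), PSI0),
    }

# Hypothetical sample directions.
u = normalize([1.0, 2.0, 2.0])
v = normalize([2.0, -1.0, 3.0])
probs = joint_probabilities(u, v)
overlap_sq = dot(u, v) ** 2

print(probs[(0, 0)], overlap_sq / 3)   # the two numbers should agree
print(sum(probs.values()))             # should be 1 up to rounding
```

The function `expectation` evaluates the inner product by brute force over all index combinations, which is adequate for dimension three but not meant to be efficient.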

The implications of the finite precision loophole of the Kochen–Specker Theorem for the Free Will Theorem were analyzed by Hermens (2014), who concluded that this loophole does not apply. We give a more precise argument to this effect.

We have dense colorable subsets *X<sup>c</sup><sub>A</sub>* ⊂ *X<sub>A</sub>* and *X<sup>c</sup><sub>B</sub>* ⊂ *X<sub>B</sub>* = *X<sub>A</sub>*, where *X<sup>c</sup><sub>A</sub>* may or may not coincide with *X<sup>c</sup><sub>B</sub>*. If not, the perfect correlation condition (6.54) in the *Nature* assumption cannot even be stated, but even if *X<sup>c</sup><sub>A</sub>* = *X<sup>c</sup><sub>B</sub>*, since finite precision of experiment has been declared to be an issue, it would be quite out of character to impose (6.54). Instead, one needs a probabilistic version of this condition, which, as it will turn out, cannot be satisfied. As in the notes to the previous section, for each density matrix ρ one needs a probability measure μ<sub>ρ</sub> on *Z* that reproduces the statistical quantum-mechanical predictions for the associated quantum state. Compared to the notes to the previous section, the role of *W* is now played by *z*, in that for given *F* and *G* one might write

$$W(a,b) = (\hat{F}(a,z), \hat{G}(b,z)). \tag{6.236}$$

This measure may be constructed analogously to (6.232), i.e., for any sequence (*a*<sup>(*k*)</sup>) of bases drawn from *X<sup>c</sup><sub>A</sub>*, any sequence (*b*<sup>(*k*)</sup>) of bases drawn from *X<sup>c</sup><sub>B</sub>*, and any sequences (λ<sup>(*k*)</sup>) and (γ<sup>(*k*)</sup>) in Λ, cf. (6.22), where *k* ∈ *K* ⊂ N is arbitrary, we define

$$\begin{aligned} \mu\_{\rho}(\{z \in Z \mid \hat{F}(a^{(k)}, z) = \lambda^{(k)}, \hat{G}(b^{(k)}, z) = \gamma^{(k)}, k \in K\}) &= \\ \prod\_{k \in K} \text{Tr} \left( \rho \prod\_{i,j=1}^{3} [J\_{\mathbf{u}\_{i}}^{2} = \lambda\_{i}^{(k)}] \cdot [J\_{\mathbf{v}\_{j}}^{2} = \gamma\_{j}^{(k)}] \right), \end{aligned} \tag{6.237}$$


where, as in the main text,

$$a = (\mathbf{u}\_1, \mathbf{u}\_2, \mathbf{u}\_3);\tag{6.238}$$

$$b = (\mathbf{v}\_1, \mathbf{v}\_2, \mathbf{v}\_3). \tag{6.239}$$

Note that *J*<sup>2</sup><sub>u<sub>*i*</sub></sub> acts on Alice's Hilbert space C<sup>3</sup> whilst *J*<sup>2</sup><sub>v<sub>*j*</sub></sub> acts on Bob's. In particular, for fixed *k*<sub>0</sub> ∈ *K* and λ, γ ∈ Λ, we have the special case of (6.237) for compatible measurements, viz.

$$\mu\_{\rho}(\{z \in Z \mid \hat{F}(a^{(k\_0)}, z) = \lambda, \hat{G}(b^{(k\_0)}, z) = \gamma\}) = \text{Tr}\left(\rho \prod\_{i,j=1}^3 [J\_{\mathbf{u}\_i}^2 = \lambda\_i] \cdot [J\_{\mathbf{v}\_j}^2 = \gamma\_j]\right),$$

where in the main text we would have written *P*<sub>ρ</sub>(*F* = λ, *G* = γ|*A* = *a*, *B* = *b*) for the right-hand side. Hence for the correlated state ρ = |Ψ<sub>0</sub>⟩⟨Ψ<sub>0</sub>| we obtain from (6.42):

$$\mu\_{\Psi\_0}(\{z \in Z \mid \hat{F}\_i(a, z) \neq \hat{G}\_j(b, z)\}) = \frac{2}{3}(1 - \langle \mathbf{u}\_i, \mathbf{v}\_j \rangle^2),\tag{6.240}$$

which of course vanishes if u<sub>*i*</sub> = v<sub>*j*</sub>. If the expression 1 − ⟨u<sub>*i*</sub>, v<sub>*j*</sub>⟩<sup>2</sup> appearing here is small, then the projections *e*<sub>u<sub>*i*</sub></sub> and *e*<sub>v<sub>*j*</sub></sub> are close (in norm), since

$$\left\|e\_{\mathbf{u}\_{i}} - e\_{\mathbf{v}\_{j}}\right\|^{2} \leq 2(1 - \langle \mathbf{u}\_{i}, \mathbf{v}\_{j} \rangle^{2}). \tag{6.241}$$
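A short verification of (6.241) may be sketched as follows for real unit vectors u and v (the symbols *d* and μ are ours, introduced only for this check). The difference *d* = *e*<sub>u</sub> − *e*<sub>v</sub> is self-adjoint, has rank at most two, and satisfies

$$\mathrm{Tr}(d) = 0, \qquad \mathrm{Tr}(d^2) = \mathrm{Tr}(e\_{\mathbf{u}}) + \mathrm{Tr}(e\_{\mathbf{v}}) - 2\,\mathrm{Tr}(e\_{\mathbf{u}} e\_{\mathbf{v}}) = 2 - 2\langle \mathbf{u}, \mathbf{v} \rangle^{2},$$

so its nonzero eigenvalues form a pair ±μ with 2μ<sup>2</sup> = 2(1 − ⟨u, v⟩<sup>2</sup>). In the operator norm this gives

$$\left\|e\_{\mathbf{u}} - e\_{\mathbf{v}}\right\|^{2} = \mu^{2} = 1 - \langle \mathbf{u}, \mathbf{v} \rangle^{2} \leq 2(1 - \langle \mathbf{u}, \mathbf{v} \rangle^{2}),$$

whereas in the Hilbert-Schmidt norm the squared norm equals Tr(*d*<sup>2</sup>), so that (6.241) holds with equality.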

Eq. (6.240) therefore allows us to make rigorous sense of Hermens' (2014) heuristic idea that the assumption (6.54) in the FWT should be modified as follows:

'if ‖*e*<sub>u<sub>*i*</sub></sub> − *e*<sub>v<sub>*j*</sub></sub>‖ is small, then in most of the cases *F̂<sub>i</sub>*(*a*,*z*) = *Ĝ<sub>j</sub>*(*b*,*z*).'

Namely, we replace (6.54) by the following approximate correlation condition:

• For every ε > 0 there is δ > 0 such that if 1 − ⟨u<sub>*i*</sub>, v<sub>*j*</sub>⟩<sup>2</sup> < δ, then

$$
\mu\_{\Psi\_0}(\{z \in Z \mid \hat{F}\_i(a, z) \neq \hat{G}\_j(b, z)\}) < \varepsilon.\tag{6.242}
$$

Indeed, if the theory existed, one could simply take δ = ε. However, a theory satisfying (6.242) does not exist, as can be proved by contradiction: if *F̂<sub>i</sub>*(*a*,*z*) = *Ĝ<sub>j</sub>*(*b*,*z*) for all pairs (u<sub>*i*</sub>, v<sub>*j*</sub>) such that 1 − ⟨u<sub>*i*</sub>, v<sub>*j*</sub>⟩<sup>2</sup> < ε, then the proof of Theorem 6.13 shows not only that (6.32) still holds on the modified *Nature* assumption (so that *F̃*(·,*z*) again defines a coloring of *S*<sup>2</sup>), but that *in addition* we have

$$1 - \langle \mathbf{u}, \mathbf{u}' \rangle^2 < \delta \implies \tilde{F}(\mathbf{u}, z) = \tilde{F}(\mathbf{u}', z). \tag{6.243}$$

In particular, the apparently weaker correlation condition ending with (6.242) is actually *stronger* than its exact counterpart (6.54).

Thus Theorem 6.13 still holds on this revised *Nature* assumption, so that unlike the Kochen–Specker Theorem, the Free Will Theorem is immune to the finite precision loophole. The price for this immunity is that, quite against the spirit of the FWT, some probabilistic reasoning had to be invoked, so that the difference between the FWT and Bell's first theorem has blurred even further.

#### §6.3. Philosophical intermezzo: Free will in the Free Will Theorem

The literature on free will is immense. Introductory accounts include Walter (2001), which focuses on the connection with neuroscience, Doyle (2011), and Beebee (2013), the second of which remains largely philosophical, the third even completely. A very sophisticated recent defense of compatibilism is Ismael (2016). Lewis's 'local miracle compatibilism' was proposed in Lewis (1981). What's more:

'[Lewis's paper is] the finest essay that has ever been written in defense of compatibilism, possibly the finest essay that has ever been written about any aspect of the free will problem.' (van Inwagen, 2008)

Saunders (1968) already made a point similar to Lewis's; see also Moore (1912, Ch. 6). For Lewis's theory of counterfactuals see Lewis (1973, 1979, 2000), as well as Menzies (2014). See also Fischer (1994), Beebee (2003, 2013), and Vihvelin (2013).

Although Lewis's position is called *local miracle compatibilism*, a miracle takes place neither in the actual world where Alice's hand is at rest nor in the possible world where she raises it, i.e., a law is broken neither in the former nor in the latter:

'This is what Lewis means by a 'miracle': an event *M* is a miracle if and only if *M* occurs at *possible world w*, and *M* is contrary to some *actual* law (or combination of laws) *L*. The point here is that while *M* is a miracle in Lewis's sense, it is not contrary to any of *w*'s laws of nature. At *w*, *L* simply isn't a law in the first place. So, as things *actually* happened in the *actual* world—*L* is a law, and *m* does not occur, so there is no miracle in the usual sense of 'miracle'. *m* is only a 'miracle' in Lewis's special sense of 'miracle': something (*m*) happens in *w* that is contrary to the laws of nature in the *actual* world.' (Beebee, 2013, p. 62)

Unfortunately, confusion may arise if the quotation in the main text 'if I did it, a law would be broken' from Lewis (1981) is subjected to the following explanation:

'On Lewis's account of counterfactuals, the *truth conditions* for counterfactuals—what makes them true—are as follows. Suppose we have the counterfactual 'if *A* had been the case, *B* would have been the case' (so if *A* is 'I miss the bus' and *B* is 'I'm late', this counterfactual just says, 'if I'd missed the bus, I would have been late'). This counterfactual will be true if and only if, *at the closest possible world to the actual world* at which *A* is true, *B* is also true. So, our sample counterfactual, 'if I'd missed the bus, I would have been late', is true if and only if: *at the closest possible world to the actual world* at which I miss the bus, I'm late.' (Beebee, 2013, p. 60).

Removing any possible remaining doubt, on p. 62 she mentions that the closest possible world where I miss the bus is the world *w*. According to this explanation, then, Lewis's sentence 'if I did it, a law would be broken', would mean that *at the closest possible world to the actual world* in which I did it, a law *is* broken, i.e., in *w*. But according to Beebee's definition quoted in the main text of what Lewis means by a miracle, apparently this is not the right reading (and indeed it would, in our view, be nonsensical). Moreover, Lewis (1981) emphasizes that in the first bullet point in the main text above—which he defends—it is not the agent who would break a law, whereas in the second bullet point —rejected by Lewis—it is; in the first it is the breaking of some law at an earlier time that enables the agent to do what she, in our actual world, did not do. Thus Lewis's phrasing seems awkward.

Our development of Lewis's argument is indebted to Vihvelin (2013, pp. 164– 165), who (re)states Lewis's first bullet point as the following conjunction:


A second way in which Alice could (counterfactually) have raised her hand is through an instant (counterfactual) modification of the state of the world, as in Bennett (1984). This has been explicated by Vihvelin (2013, p. 165), too:


Here we prefer to write Different Past, since even though in this scenario the state indeed (by determinism) would have been different all the way back to the Big Bang, the entire trajectory of the world may or may not be close to the actual one. In this scenario, the two cases Lewis distinguishes take the form in the main text.

Since the main novelty of their papers lies in the emphasis on free will, the reader might wonder what Conway & Kochen themselves have to say about the subject. As we can read in the delightful biography of Conway by Roberts (2015), or watch in his video lectures on the Free Will Theorem (Conway, 2009), free will is indeed of great importance to at least the first author of the theorem. Unfortunately, his interest in free will seems unaccompanied by any philosophical sophistication, e.g.:

'Compatibilism in my view is silly. Sorry, I shouldn't just say straight off that it is silly. Compatibilism is an old viewpoint from previous centuries when philosophers were talking about free will. The were accustomed to physical theory being deterministic. And then there's the question: How can we have free will in this deterministic universe? Well, they sat and thought for ages and ages and ages and read books on philosophy and God knows what and they came up with compatibilism, which was a tremendous wrenching effect to reconcile 2 things which seemed incompatible. And they said they were compatible after all. But nobody would *ever* have come up with compatibilism if they thought, as turns out to be the case, that science wasn't deterministic. The whole business of compatibilism was to reconcile what science told you at the time, centuries ago down to 1 century ago: Science appeared to be totally deterministic, and how can we reconcile that with free will, which is not deterministic? So compatibilism, I see it as out of date, really. It's doing something that doesn't need to be done. However, compatibilism hasn't gone out of date, certainly, as far as the philosophers are concerned. Lots of them are still very keen on it. How can I say it? If you do anything that seems impossible, you're quite proud when you appear to have succeeded. And so really the philosophers don't want to give up this notion of compatibilism because it seems to damned clever. But my view is it's really nonsense. And it's not necessary. So whether it actually is nonsense or not doesn't matter.'

(Conway, quoted in Roberts, 2015, pp. 361–362).

Finally, our version of van Inwagen's (1975) Consequence Argument is due to Beebee (2003), and the novel parts of this section are based on Landsman (2016c). For interesting philosophical criticism of this approach, see De Mola (2016).

#### §6.4. Technical intermezzo: The GHZ-Theorem

The GHZ Theorem appeared in Greenberger et al (1990). See also Clifton, Redhead, & Butterfield (1991) and Bub (1997). Innumerable variations on and generalizations of such arguments may be given, leading to equally many Free Will Theorems. All of these have their roots in algebraic properties of matrices, which hidden variable theories (in vain) try to reproduce.

#### §6.5. Bell's theorems

The original contributions to the theme of this section are Bell (1964, 1976), of which the first is one of the most famous papers of 20th century theoretical physics. Since there are more than 10,000 papers citing Bell (1964) alone, it is impossible to discuss all literature relevant to Bell's work. What we call his first theorem originates with Bell (1964), which incidentally was written after Bell (1966), but our treatment of the settings (taken from Cator & Landsman, 2014) is different. Though originally motivated as an attempt to make the Free Will Theorem look less of a *petitio principii*, it also addresses a problem Bell faced even according to some of his staunchest supporters (Norsen, 2009; Seevinck & Uffink, 2011), namely the tension between the idea that the hidden variables (in the pertinent causal past) should on the one hand include all ontological information relevant to the experiment, but on the other hand should leave Alice and Bob free to choose any settings they like.

His second theorem comes from Bell (1976), followed by Bell (1990a). Apart from his own papers, which are reprinted in Bell, Gottfried & Veltman (2001), treatments of Bell's Theorems we regard as sound include Fine (1982), Jarrett (1984), Pitowsky (1989), van Fraassen (1991), Butterfield (1992a,b), Bub (1997), Werner & Wolf (2001), Liang, Spekkens, & Wiseman (2011), Shimony (2013), Wiseman (2014), and Brown & Timpson (2015). Recent and mathematically innovative approaches include Abramsky & Brandenburger (2011), Acín et al (2015), and Fritz (2016). For history, see Gilder (2008) and Kaiser (2010).

Unfortunately, we have not been able to come to grips with (and hence do not cite) literature claiming that Bell's theorems are false, or have nothing to do with hidden variables, or prove that quantum mechanics (if not nature itself!) is nonlocal *per se*, or that he never changed his mind and only has one theorem saying it all.

The verification of (6.102) - (6.105) is analogous to the above computations deriving (6.35) - (6.38). In terms of the unit vector

$$v\_a = \begin{pmatrix} \cos a \\ \sin a \end{pmatrix},\tag{6.244}$$

the observable *F* Alice measures on setting *A* = *a* is the projection *e<sub>a</sub>* = |*v<sub>a</sub>*⟩⟨*v<sub>a</sub>*|, and similarly for Bob. Hence the corresponding Born probabilities are given by

$$\begin{aligned} P\_{\Psi\_0}(F=1, G=1 | A=a, B=b) &= \langle \Psi\_0, e\_a \otimes e\_b \Psi\_0 \rangle; \\ P\_{\Psi\_0}(F=0, G=0 | A=a, B=b) &= \langle \Psi\_0, (1\_2 - e\_a) \otimes (1\_2 - e\_b) \Psi\_0 \rangle; \\ P\_{\Psi\_0}(F=1, G=0 | A=a, B=b) &= \langle \Psi\_0, e\_a \otimes (1\_2 - e\_b) \Psi\_0 \rangle; \\ P\_{\Psi\_0}(F=0, G=1 | A=a, B=b) &= \langle \Psi\_0, (1\_2 - e\_a) \otimes e\_b \Psi\_0 \rangle. \end{aligned}$$


For example, we have

$$
\begin{split}
\langle\Psi\_{0},e\_{a}\otimes e\_{b}\Psi\_{0}\rangle &= \frac{1}{2}\langle\mathbf{e}\_{1}\otimes\mathbf{e}\_{1}+\mathbf{e}\_{2}\otimes\mathbf{e}\_{2}, |v\_{a}\rangle\langle v\_{a}|\otimes|v\_{b}\rangle\langle v\_{b}|(\mathbf{e}\_{1}\otimes\mathbf{e}\_{1}+\mathbf{e}\_{2}\otimes\mathbf{e}\_{2})\rangle\\ &= \frac{1}{2}\langle\mathbf{e}\_{1}\otimes\mathbf{e}\_{1}+\mathbf{e}\_{2}\otimes\mathbf{e}\_{2}, (\cos a\cos b+\sin a\sin b)v\_{a}\otimes v\_{b}\rangle\\ &= \frac{1}{2}(\cos a\cos b+\sin a\sin b)^{2}\\ &= \frac{1}{2}\cos^{2}(a-b).
\end{split}
$$
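The same computation can be reproduced numerically for all four outcome pairs. The following is a minimal sketch in plain Python, assuming the state Ψ<sub>0</sub> = (e<sub>1</sub>⊗e<sub>1</sub> + e<sub>2</sub>⊗e<sub>2</sub>)/√2 used above; the function names and the sample angles are our own.

```python
import math

def unit_vector(angle):
    """The vector of (6.244) for a given angle."""
    return [math.cos(angle), math.sin(angle)]

def projection(u):
    """|u><u| for a real unit vector u."""
    return [[s * t for t in u] for s in u]

def complement(p):
    """1_2 - p."""
    return [[(1.0 if i == j else 0.0) - p[i][j] for j in range(2)]
            for i in range(2)]

def expectation(p, q, c):
    """<Psi, (p (x) q) Psi> for Psi = sum_ij c[i][j] e_i (x) e_j, c real."""
    return sum(c[i][j] * p[i][k] * q[j][l] * c[k][l]
               for i in range(2) for j in range(2)
               for k in range(2) for l in range(2))

# Coefficient matrix of Psi_0 = (e1 (x) e1 + e2 (x) e2)/sqrt(2).
PSI0 = [[1 / math.sqrt(2), 0.0], [0.0, 1 / math.sqrt(2)]]

def born_probabilities(a, b):
    """Keys are (F, G); here outcome 1 corresponds to the projection e_a."""
    ea = projection(unit_vector(a))
    eb = projection(unit_vector(b))
    return {
        (1, 1): expectation(ea, eb, PSI0),
        (0, 0): expectation(complement(ea), complement(eb), PSI0),
        (1, 0): expectation(ea, complement(eb), PSI0),
        (0, 1): expectation(complement(ea), eb, PSI0),
    }

# Hypothetical sample settings.
a, b = 0.3, 1.1
probs = born_probabilities(a, b)
print(probs[(1, 1)], 0.5 * math.cos(a - b) ** 2)  # should agree
print(probs[(1, 0)], 0.5 * math.sin(a - b) ** 2)  # should agree
```

Equal outcomes each carry probability cos²(*a* − *b*)/2 and unequal outcomes each carry sin²(*a* − *b*)/2, so the four values sum to one.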

The CHSH-inequality (6.117) is due to Clauser, Horne, Shimony, & Holt (1969). The definitive (i.e., loophole-free) experimental verification of its violation in nature is Hensen et al. (2015). A direct proof of (6.117) starts from the simpler inequality

$$P(F \neq H) \le P(F \neq G) + P(G \neq H),\tag{6.245}$$

for three {0,1}-valued random variables *F*,*G*,*H*, which implies (6.117). To prove (6.245), one just writes

$$\begin{aligned} P(F \neq H) &= P(F = 1, G = 1, H = 0) + P(F = 1, G = 0, H = 0) \\ &+ P(F = 0, G = 1, H = 1) + P(F = 0, G = 0, H = 1), \end{aligned}$$

etc., and notes that each term on the left-hand side of (6.245) also occurs on the right-hand side. Since each term lies in [0,1] and hence is nonnegative, this implies (6.245). Our proof of Proposition 6.17 follows Werner & Wolf (2001), as does our proof of Theorem 6.18 (though not our formulation thereof, which once again derives from Cator & Landsman, 2014). This proof shows that, as first noted by Fine (1982) and analyzed more deeply in Butterfield (1992b), there is no real distinction between the possibility of reproducing given (empirical) probabilities *P*(*F* = λ,*G* = γ|*A* = *a*,*B* = *b*) *that satisfy Bell locality* by a *local deterministic hidden variable theory* or by a *local stochastic hidden variable theory*. Most current research in this direction, sparked by Popescu & Rohrlich (1994), is therefore concerned with theories defined by formal joint conditional probabilities that satisfy a no-signaling condition like OI instead of Bell locality, cf. Bub (2011b) and Brunner et al (2014) for reviews.
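The elementary inequality (6.245) can be tested by brute force. The following is a minimal sketch in plain Python, assuming only that *F*, *G*, *H* are {0,1}-valued with some joint distribution; the random distributions and all function names are our own illustration.

```python
import itertools
import random

OUTCOMES = list(itertools.product((0, 1), repeat=3))  # all (f, g, h)

def mismatch_probabilities(joint):
    """Return P(F != H), P(F != G), P(G != H) for a joint distribution.

    `joint` maps triples (f, g, h) to probabilities; missing triples
    are treated as having probability zero.
    """
    p_fh = sum(p for (f, g, h), p in joint.items() if f != h)
    p_fg = sum(p for (f, g, h), p in joint.items() if f != g)
    p_gh = sum(p for (f, g, h), p in joint.items() if g != h)
    return p_fh, p_fg, p_gh

def random_joint(rng):
    """A random probability distribution on {0,1}^3."""
    weights = [rng.random() for _ in OUTCOMES]
    total = sum(weights)
    return {o: w / total for o, w in zip(OUTCOMES, weights)}

def satisfies_6_245(joint, tol=1e-12):
    """Check P(F != H) <= P(F != G) + P(G != H) up to rounding."""
    p_fh, p_fg, p_gh = mismatch_probabilities(joint)
    return p_fh <= p_fg + p_gh + tol

rng = random.Random(0)
all_ok = all(satisfies_6_245(random_joint(rng)) for _ in range(1000))
print(all_ok)
```

Since every joint distribution is a mixture of the eight point masses on {0,1}<sup>3</sup>, checking those eight cases would already suffice; the random sampling is only a sanity check.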

Formal conditional probabilities of the kind that Bell's second theorem uses have been axiomatized by e.g. Popper (1938) and Rényi (1955); the following axioms are theorems if conditional probabilities are defined à la Kolmogorov by (1.1). Let Σ be some σ-algebra and let F ⊂ Σ\{∅} be an ideal in Σ in the sense that if *B* ∈ Σ and *C* ∈ F, then *B* ∩ *C* ∈ F. A *conditional probability* on (Σ,F) is a map

$$P: \Sigma \times \mathcal{F} \to [0, 1]; \tag{6.246}$$

$$(A, C) \mapsto P(A|C),\tag{6.247}$$

such that:


Van Fraassen (1991) noted that if (6.121) holds, then the variable *x* is a *common cause* in the sense of Reichenbach for Alice's and Bob's outcomes (see Hofer-Szabó (2015) for a recent paper in this direction). To explain this observation, suppose two random processes *F* and *G* (like Alice's and Bob's measurements) are correlated, i.e., *P*(*F* = λ,*G* = γ) ≠ *P*(*F* = λ)*P*(*G* = γ). What might cause the correlation?


$$P(F=\lambda, G=\gamma|X=x) = P(F=\lambda|X=x)P(G=\gamma|X=x). \tag{6.248}$$

Another way to write this is *P*(*F* = λ|*G* = γ,*X* = *x*) = *P*(*F* = λ|*X* = *x*), which shows that a common cause *X* screens off the dependence of *F* on *G*. Often the common cause is hidden and has to be inferred from the observed correlation (having excluded other explanations, like the ones above). A nice example of this is the inference of a manuscript called *Q* in New Testament studies. It is clear that the Gospels of Matthew and Luke both draw on Mark, but they also contain strikingly similar or even identical non-Markan passages. For various reasons it is unlikely that either one copied these from the other, so that the main hypothesis is that they both rely on *Q*, which is now lost. See e.g. Mack (1993).
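The screening-off property of a common cause can be illustrated with a toy distribution. The following is a minimal sketch in plain Python; the binary variable *X* and the numbers in `P_X` and `P_ONE_GIVEN_X` are hypothetical and chosen only so that (6.248) holds by construction.

```python
# Hypothetical common cause X with two equally likely values.
P_X = {0: 0.5, 1: 0.5}
# Hypothetical response probabilities P(F=1|X=x) = P(G=1|X=x).
P_ONE_GIVEN_X = {0: 0.1, 1: 0.9}

def p_outcome_given_x(value, x):
    """P(F=value|X=x), which here coincides with P(G=value|X=x)."""
    p_one = P_ONE_GIVEN_X[x]
    return p_one if value == 1 else 1.0 - p_one

def joint(f, g, x):
    """P(F=f, G=g, X=x), factorized as in (6.248)."""
    return P_X[x] * p_outcome_given_x(f, x) * p_outcome_given_x(g, x)

def marginal_fg(f, g):
    """P(F=f, G=g), obtained by summing out the common cause."""
    return sum(joint(f, g, x) for x in P_X)

def marginal_f(f):
    return sum(marginal_fg(f, g) for g in (0, 1))

def marginal_g(g):
    return sum(marginal_fg(f, g) for f in (0, 1))

def p_f_given_g_x(f, g, x):
    """P(F=f | G=g, X=x)."""
    return joint(f, g, x) / sum(joint(ff, g, x) for ff in (0, 1))

# Unconditionally, F and G are correlated:
print(marginal_fg(1, 1), marginal_f(1) * marginal_g(1))
# Conditioned on X, knowing G adds nothing about F:
print(p_f_given_g_x(1, 1, 1), p_f_given_g_x(1, 0, 1), p_outcome_given_x(1, 1))
```

In this toy model the joint probability of two positive outcomes exceeds the product of the marginals, yet the conditional probability of *F* given *X* is unchanged by further conditioning on *G*.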

From this perspective, the amazing fact is that the correlations in the Alice and Bob experiment with either spin-1 particles or photons cannot be explained by a common cause, since its existence (in the form of *x*) would imply the Bell inequality. However, of the four other explanations described above, no. 1 is ridiculous given the statistics of the relevant experiments, no. 2 is at odds with relativity, and no. 4 seems inapplicable. This leaves no. 3, which seems only supported by 't Hooft (2016), who denies the independence assumptions (i.e. between the settings and the state of the pair of particles undergoing measurement) lying at the basis of both the Free Will Theorem and Bell's theorems. Every way you look at it you lose!


Generalizations of Theorem 6.19 to operator algebras were given e.g. by Baez (1987), Raggio (1988), Werner (1989), and Bacciagaluppi (1993), as follows. Let *A* and *B* be unital C\*-algebras, with projective tensor product *A* ⊗̂ *B* (i.e., the completion of the algebraic tensor product *A* ⊗ *B* in the maximal C\*-cross-norm), cf. §C.13; the choice of the projective tensor product guarantees that each state on *A* ⊗ *B* extends to a state on *A* ⊗̂ *B* by continuity; conversely, since *A* ⊗ *B* is dense in *A* ⊗̂ *B*, each state on the latter is uniquely determined by its values on the former. In particular, product states ρ ⊗ σ and mixtures ω = ∑<sub>*i*</sub> *p<sub>i</sub>* ρ<sub>*i*</sub> ⊗ σ<sub>*i*</sub> thereof are well defined on *A* ⊗̂ *B*. If *A* ⊂ *B*(*H*<sub>1</sub>) and *B* ⊂ *B*(*H*<sub>2</sub>) are von Neumann algebras, and all states considered are normal, it is easier to work with the *spatial* tensor product *A* ⊗̄ *B*, defined as the double commutant (or weak completion) of *A* ⊗ *B* in *B*(*H*<sub>1</sub> ⊗ *H*<sub>2</sub>). Any *normal* state on *A* ⊗ *B* extends to a normal state on *A* ⊗̄ *B* by continuity. Below we use ⊗̂, but the results also work for ⊗̄. In what follows, *A* and *B* are *unital* C\*-algebras.

Definition 6.23. *Let* ω *be a state on A* ⊗̂ *B.*


An uncorrelated state ω is pure precisely when it is a product of pure states. This has the important consequence that both its restrictions ω|<sub>*A*</sub> and ω|<sub>*B*</sub> to *A* and *B*, respectively, are pure as well (the restriction ω|<sub>*A*</sub> of a state ω on *A* ⊗̂ *B* to, say, *A* is given by ω|<sub>*A*</sub>(*a*) = ω(*a* ⊗ 1<sub>*B*</sub>), where 1<sub>*B*</sub> is the unit element of *B*, etc.). A correlated *pure* state has the property that its restriction to *A* or *B* is *mixed*.

Proposition 6.24. *The following conditions are equivalent:*


For the proof see Takesaki (2002), Theorem 4.14.

Corollary 6.25. *Correlated states exist iff A and B are both noncommutative.*

As one might expect, this result is closely related to the Bell inequalities:

Proposition 6.26. *For any* ω ∈ *S*(*A* ⊗̂ *B*)*, the following conditions are equivalent:*


$$|\omega(a\_1(b\_1+b\_2)+a\_2(b\_1-b\_2))| \le 2.\tag{6.249}$$

See Baez (1987), Raggio (1988), Bacciagaluppi (1993), and Landsman (2006a).

Corollary 6.27. *If A or B is commutative, then* (6.249) *holds for all states* ω*.*
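That noncommutativity of both algebras is needed for a violation of (6.249) can be illustrated in the smallest case *A* = *B* = *M*<sub>2</sub>(C). The following is a minimal sketch in plain Python, assuming (as in the usual CHSH setting, which we take to be the intended one) that the elements in (6.249) are self-adjoint with norm at most one; the particular matrices and vector states are a standard choice of ours, not taken from the text.

```python
import math

def expectation(p, q, c):
    """<Psi, (p (x) q) Psi> for Psi = sum_ij c[i][j] e_i (x) e_j, c real."""
    return sum(c[i][j] * p[i][k] * q[j][l] * c[k][l]
               for i in range(2) for j in range(2)
               for k in range(2) for l in range(2))

def combine(s, x, t, y):
    """The real linear combination s*x + t*y of two 2x2 matrices."""
    return [[s * x[i][j] + t * y[i][j] for j in range(2)] for i in range(2)]

SIGMA_Z = [[1.0, 0.0], [0.0, -1.0]]
SIGMA_X = [[0.0, 1.0], [1.0, 0.0]]
R = 1 / math.sqrt(2)

# Alice's and Bob's observables (all self-adjoint with square equal to 1).
A1, A2 = SIGMA_Z, SIGMA_X
B1 = combine(R, SIGMA_Z, R, SIGMA_X)
B2 = combine(R, SIGMA_Z, -R, SIGMA_X)

def chsh(c):
    """omega(a1(b1+b2) + a2(b1-b2)) for the vector state with coefficients c."""
    return (expectation(A1, B1, c) + expectation(A1, B2, c)
            + expectation(A2, B1, c) - expectation(A2, B2, c))

ENTANGLED = [[R, 0.0], [0.0, R]]      # (e1 (x) e1 + e2 (x) e2)/sqrt(2)
PRODUCT = [[1.0, 0.0], [0.0, 0.0]]    # e1 (x) e1

print(chsh(ENTANGLED))  # exceeds 2 for this entangled state
print(chsh(PRODUCT))    # stays within [-2, 2] for this product state
```

For the entangled vector state the value is 2√2, whereas for the product state it is √2, in line with the bound (6.249) for uncorrelated states.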

An elegant geometric approach to the Bell inequalities was developed by Pitowsky (1989, 1994), which we now summarize (also cf. Werner & Wolf, 2001).

Suppose we have a bipartite experiment with *m* different settings *A* = *a*<sub>1</sub>, ..., *a<sub>m</sub>* and *B* = *b*<sub>1</sub>, ..., *b<sub>m</sub>* on each wing, and binary outcomes, i.e., in {0,1}. We now denote the probability *P*(*F* = 1|*A* = *a<sub>i</sub>*) that *F*(*a<sub>i</sub>*) (i.e. the particular property measured by experiment *F* at setting *a<sub>i</sub>*) is true by *p<sub>i</sub>* (*i* = 1, ..., *m*), and likewise we write *p*<sub>*j*+*m*</sub> for *P*(*G* = 1|*B* = *b<sub>j</sub>*), i.e., the probability that *G*(*b<sub>j</sub>*) is true, once again for *j* = 1, ..., *m*. Furthermore, we abbreviate the probability that *F*(*a<sub>i</sub>*) and *G*(*b<sub>j</sub>*) are both true by

$$p\_{i,j+m} \equiv P(F=1, G=1 | A = a\_i, B = b\_j) \; (i, j = 1, \dots, m). \tag{6.250}$$

The 2*m*+*m*<sup>2</sup> "surface probabilities" p = (*p*<sub>1</sub>, ..., *p*<sub>2*m*</sub>, *p*<sub>1,*m*+1</sub>, ..., *p*<sub>*m*,2*m*</sub>) form a vector in R<sup>2*m*+*m*<sup>2</sup></sup>, which we wish to constrain by the following assumption: there is a fact of the matter underlying each experiment according to which the pair (*F*(*a<sub>i</sub>*), *G*(*b<sub>j</sub>*)) already had a truth value for each possible setting (*a<sub>i</sub>*, *b<sub>j</sub>*), independently of any measurement being carried out or not ("*local realism*"). Thus the probabilities p (which now arguably have an ignorance interpretation) must lie in the convex polytope in R<sup>2*m*+*m*<sup>2</sup></sup> defined as the convex hull *C<sub>m</sub>* of the following set of (extreme) points: for each 2*m*-tuple λ = (λ<sub>1</sub>, ..., λ<sub>2*m*</sub>), where λ<sub>*i*</sub> ∈ {0,1}, define

$$\mathbf{x}_{\lambda} = (\lambda_1, \dots, \lambda_{2m}, \lambda_1 \cdot \lambda_{m+1}, \dots, \lambda_m \cdot \lambda_{2m}) \in \mathbb{R}^{2m + m^2},\tag{6.251}$$

i.e., the entry at place $k$ is $\lambda_k$ ($k = 1, \dots, 2m$) and the entry at place $(i, j)$ is $\lambda_i \cdot \lambda_{m+j}$, where $i, j = 1, \dots, m$. The interpretation of this is that $\mathbf{x}_\lambda$ represents the particular fact of the matter where $F(a_i)$ has truth value $\lambda_i$ and $G(b_j)$ has truth value $\lambda_{m+j}$, so that their conjunction $(F(a_i), G(b_j))$ has truth value $\lambda_i \cdot \lambda_{m+j}$. In this state the probability of the said configuration is one and all other states have probability zero; arbitrary probability assignments then lie in $C_m$. The point, then, is to characterize the convex polytope $C_m \subset \mathbb{R}^{2m+m^2}$ through a finite set of inequalities, which turn out to be generalized Bell inequalities. Seeing this result requires some background.

Let $V$ be a real topological vector space with (continuous) dual $V^*$; if $V = \mathbb{R}^n$ we may also put $V^* = \mathbb{R}^n$ and write $\varphi(v)$ as an inner product $\langle \varphi, v \rangle$ in what follows.

1. Any (not necessarily convex) subset $S \subset V$ has a *polar* $S^o \subset V^*$ defined by

$$S^o = \{ \varphi \in V^* \mid \varphi(v) \le 1 \ \forall v \in S \},\tag{6.252}$$

which is a closed convex subset of $V^*$. If $S = K$ is a compact convex set, we have

$$K^o = \{ \varphi \in V^* \mid \varphi(v) \le 1 \ \forall v \in \partial_e K \}. \tag{6.253}$$

2. The *bipolar theorem* (cf. e.g. Simon (2011, Theorem 5.5)) states that, with $\overline{\mathrm{co}}$ denoting the closed convex hull,

$$S^{oo} = \overline{\mathrm{co}}(S \cup \{0\}). \tag{6.254}$$

In particular, if $K$ is a closed convex set containing the origin, then

$$K^{oo} = K,\tag{6.255}$$

and hence, if $K^o$ is a compact convex set, we may reconstruct $K$ from $K^o$ as

$$K = \{ v \in V \mid \varphi(v) \le 1 \ \forall \varphi \in \partial_e K^o \}. \tag{6.256}$$

3. In particular, if $K$ is a convex polytope in a finite-dimensional vector space containing the origin, then so is $K^o$. In that case, $\partial_e K^o$ is a finite set and so points in $K$ are characterized by a *finite* set of *linear* inequalities (6.256), which describe the faces of the polytope. In this case, the associated (dual) description of $K$ is called the *Minkowski–Weyl Theorem*, see e.g. Paffenholz (2010) for applications.

For example, among the five Platonic solids (i.e. in $\mathbb{R}^3$) the cube and the octahedron are dual to each other, as are the dodecahedron and the icosahedron, whereas the tetrahedron is self-dual. *A propos*, the latter arises as the convex polytope $C_1$ for $m = 1$ in the above story: clearly $2m+m^2 = 3$, and for the vertices of $C_1$ one takes the four points $\mathbf{x}_\lambda$ ensuing from the four possibilities $\lambda = (0,0), (1,0), (0,1), (1,1)$, i.e., $\mathbf{x}_\lambda = (0,0,0), (1,0,0), (0,1,0), (1,1,1)$. Then the inequalities in (6.256) are

$$p\_{1,2} \ge 0, \ p\_1 \ge p\_{1,2}, \ p\_2 \ge p\_{1,2}, \ p\_1 + p\_2 - p\_{1,2} \le 1. \tag{6.257}$$

For $m = 2$ the ensuing convex polytope $C_2 \subseteq \mathbb{R}^8$ is the convex hull of 16 extreme points, whose inequalities may be found in Pitowsky (1989, p. 27); these imply the CHSH inequality, whose violation in quantum mechanics therefore shows that the probabilities in question have no local realistic model.
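The construction of the vertices in (6.251) is easy to enumerate by machine. The following minimal Python sketch (function names are illustrative, not from the text) lists the points $\mathbf{x}_\lambda$ for given $m$, checks (6.257) on the four vertices of $C_1$, and evaluates one standard Bell-type expression on the 16 vertices of $C_2$. The Clauser–Horne form $-1 \le p_{1,3} + p_{1,4} + p_{2,4} - p_{2,3} - p_1 - p_4 \le 0$ used here is taken from the general literature as an assumption; the full list of inequalities is the one in Pitowsky cited above. Since the expression is linear, bounds that hold on all vertices hold on the whole convex hull.

```python
from itertools import product


def vertices(m):
    """Return the 2**(2m) points x_lambda of (6.251) as tuples of length 2m + m*m."""
    points = []
    for lam in product((0, 1), repeat=2 * m):
        singles = list(lam)                      # entries lambda_1, ..., lambda_2m
        pairs = [lam[i] * lam[m + j]             # entries lambda_i * lambda_{m+j}
                 for i in range(m) for j in range(m)]
        points.append(tuple(singles + pairs))
    return points


def satisfies_6_257(x):
    """Check the four inequalities (6.257) for a point (p1, p2, p12), i.e. m = 1."""
    p1, p2, p12 = x
    return p12 >= 0 and p1 >= p12 and p2 >= p12 and p1 + p2 - p12 <= 1


def clauser_horne(x):
    """Clauser-Horne expression for m = 2, coordinates ordered as in (6.251)."""
    p1, p2, p3, p4, p13, p14, p23, p24 = x
    return p13 + p14 + p24 - p23 - p1 - p4


c1 = vertices(1)   # the four vertices of the tetrahedron C_1
c2 = vertices(2)   # the sixteen extreme points of C_2 in R^8
ch_values = sorted({clauser_horne(x) for x in c2})
```

On the vertices the Clauser–Horne expression only takes the values $-1$ and $0$, which is the classical bound that suitable quantum-mechanical probabilities exceed.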

More generally, suppose we have $n$ yes-no experiments $(E_1, \dots, E_n)$ and some subset $S_n$ of the set $\{(i,k) \mid 1 \le i < k \le n\}$ (above we had $n = 2m$, $E_i = F(a_i)$ for $i = 1, \dots, m$, $E_{m+j} = G(b_j)$ for $j = 1, \dots, m$, and $S_n = \{(i, m+j) \mid 1 \le i, j \le m\}$). This gives surface probabilities $(p_1, \dots, p_n, p_{i,k})$, where $(i,k) \in S_n$, which form a vector $\mathbf{p}$ in $\mathbb{R}^{n+|S_n|}$. As in (6.251), each truth assignment $\lambda = (\lambda_1, \dots, \lambda_n)$, $\lambda_i \in \{0,1\}$, then defines a point $\mathbf{x}_\lambda \in \mathbb{R}^{n+|S_n|}$ with coordinates $(\lambda_1, \dots, \lambda_n, \lambda_i \cdot \lambda_k)$, where once again $(i,k) \in S_n$. This set of $2^n$ points in turn spans a convex polytope $C_{S_n}$ characterized by inequalities following from the dual characterization (6.256). Classical thinking would constrain the $\mathbf{p}$ so as to lie in $C_{S_n}$, and indeed we have $\mathbf{p} \in C_{S_n}$ iff there is a probability space $(X, \Sigma, \mu)$ such that $p_i = \mu(A_i)$ and $p_{i,k} = \mu(A_i \cap A_k)$ for certain events $A_i \in \Sigma$, cf. Theorem 2.3 in Pitowsky (1989), which is based on Fine (1982).

Some authors claim on this basis that Bell-type inequalities have nothing to do with physics, but surely the point is that some physical assumptions (notably local realism) have to be made in order to justify the "classical thinking" behind $C_{S_n}$.

#### §6.6. The Colbeck–Renner Theorem

This section is based on Colbeck & Renner (2011, 2012a, 2012b), where the main idea originates (alas with unclear assumptions and at best heuristic "proofs"), Braunstein & Caves (1990), who provided steps 1 and 2 of the proof, and Landsman (2015), whom we follow closely. See also Leegwater (2016) for a technically different approach (by a far more complicated argument, Leegwater seems to manage to do without our CP assumption, i.e., continuity of probabilities).

## Chapter 7 Limits: Small $\hbar$

Limits are essential to the asymptotic Bohrification program. It was recognized at an early stage in the development of quantum mechanics that the limit $\hbar \to 0$ of Planck's constant going to zero should play a role in the derivation of classical physics from quantum theory, and later on also the thermodynamic limit (which often means "$\lim_{N\to\infty}$", where $N$ is the number of particles in the system) became a subject of interest in quantum statistical mechanics. The conceptual status of these limits will be discussed in Chapter 10; in the present one we mainly explain the underlying mathematics. However, one question needs to be addressed immediately, since it is a source of much confusion. Varying $N$ seems a realistic thing to do in the lab or on paper, whereas $\hbar$ is a *constant*, so how can it be varied? The answer is that $\hbar$ is a *dimensionful* constant, from which one forms dimensionless combinations of $\hbar$ and other parameters; such a combination then re-enters the theory as if it were a dimensionless version of $\hbar$ that can indeed be varied. The oldest example is Planck's radiation formula $E_\nu/N_\nu = h\nu/(e^{h\nu/kT} - 1)$, with temperature $T$ as the pertinent variable. Indeed, the observation of Einstein and Planck that in the limit $\hbar\nu/kT \to 0$ this formula converges to the classical equipartition law $E_\nu/N_\nu = kT$ may well be the first use of the $\hbar \to 0$ limit of quantum theory; note that Einstein put $\hbar\nu/kT \to 0$ by letting $\nu \to 0$ at fixed $T$ and $\hbar$, whereas Planck took $T \to \infty$ at fixed $\nu$ and $\hbar$!
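The convergence of Planck's formula to equipartition can be seen numerically. In the sketch below (the function name and the sample values are illustrative assumptions), $x$ stands for the dimensionless combination appearing in the exponent of Planck's formula, and the ratio of Planck's expression to $kT$ is $x/(e^x - 1)$, which should approach 1 as $x \to 0$ regardless of whether $x$ is made small through $\nu$ or through $T$.

```python
import math


def planck_over_equipartition(x):
    """Ratio (E_nu / N_nu) / (k T) = x / (exp(x) - 1) for dimensionless x > 0."""
    # math.expm1 computes exp(x) - 1 accurately for small x.
    return x / math.expm1(x)


# Sample the ratio for decreasing x; it increases towards 1.
ratios = [planck_over_equipartition(x) for x in (1.0, 0.1, 0.01, 0.001)]
```

The leading correction is $-x/2$, so at $x = 10^{-3}$ the ratio already differs from 1 by about $5 \times 10^{-4}$.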

Another example is the Hamiltonian $h = -\frac{\hbar^2}{2m}\Delta + V(x)$ in the Schrödinger equation of non-relativistic quantum mechanics, where $m$ is the mass of the pertinent particle. Here one may pass to dimensionless parameters by introducing an energy scale $\varepsilon$ typical of $h$, like $\varepsilon = \sup_x |V(x)|$, as well as a typical length scale $\ell$, such as $\ell = \varepsilon/\sup_x |\nabla V(x)|$ (if these quantities are finite). In terms of the dimensionless variable $\tilde{x} = x/\ell$, the rescaled Hamiltonian $\tilde{h} = h/\varepsilon$ is then dimensionless and equal to $\tilde{h} = -\tilde{\hbar}^2 \tilde{\Delta} + \tilde{V}(\tilde{x})$, where $\tilde{\hbar} = \hbar/(\ell\sqrt{2m\varepsilon})$, the operator $\tilde{\Delta}$ is the Laplacian for $\tilde{x}$, and $\tilde{V}(\tilde{x}) = V(\ell\tilde{x})/\varepsilon$. Here $\tilde{\hbar}$ is dimensionless, and one might study the regime where it is small. Similarly, it is often realistic to rescale the potential $V$ by a positive number $\lambda$, in which case $h_\lambda = -\frac{\hbar^2}{2m}\Delta + \lambda V(x)$ can be rescaled to $h_\lambda/\lambda = -\frac{\tilde{\hbar}^2}{2m}\Delta + V(x)$, with $\tilde{\hbar} = \hbar/\sqrt{\lambda}$, so that the "large $V$ limit" $\lambda \to \infty$ comes down to $\tilde{\hbar} \to 0$.
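For the first rescaling, the steps may be spelled out as follows (a sketch, writing $\varepsilon$ for the energy scale and $\ell$ for the length scale introduced above, and using only the chain rule $\Delta = \ell^{-2}\tilde{\Delta}$ for $x = \ell\tilde{x}$):

$$
\frac{h}{\varepsilon} = -\frac{\hbar^2}{2m\varepsilon}\,\Delta + \frac{V(x)}{\varepsilon}
= -\frac{\hbar^2}{2m\varepsilon\,\ell^2}\,\tilde{\Delta} + \frac{V(\ell\tilde{x})}{\varepsilon}
= -\tilde{\hbar}^2\,\tilde{\Delta} + \tilde{V}(\tilde{x}),
\qquad \tilde{\hbar} = \frac{\hbar}{\ell\sqrt{2m\varepsilon}}.
$$

Since $\hbar^2/(2m)$ has the dimension of energy times length squared, $\tilde{\hbar}$ is indeed a pure number; it becomes small for heavy particles, high energy scales, or slowly varying potentials.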

In (older) textbooks on quantum mechanics the limit $\hbar \to 0$ is typically studied using the so-called WKB approximation. This may be justified on historical grounds, but in fact this approximation is rarely applicable, and is extremely delicate even when it applies. Fortunately, a much more satisfactory and almost universally applicable framework has become available since the 1990s, namely *(strict) deformation quantization*, where the word "strict" (which we will henceforth omit) refers to the fact that in this approach $\hbar$ is a real number that can "really" (!) be varied and hence can be made small (as opposed to *formal* deformation quantization, where $\hbar$ is a formal parameter having no actual value). Also, "strict" sometimes refers to the use of C\*-algebras and the high mathematical standards this brings. In the formalism that follows, (deformation) quantization and the classical limit of quantum mechanics are seen as two sides of the same coin, as the axioms of quantization are predicated on recovering the correct classical limit, while conversely the classical limit only makes sense in the context of some correct notion of quantization.

The starting point of deformation quantization is a phase space $X$, mathematically described as a Poisson manifold, i.e., a manifold equipped with a Poisson bracket $\{\cdot,\cdot\}$ on its algebra of smooth functions $C^\infty(X)$, see §3.2. We recall that a Poisson bracket is a Lie bracket on $C^\infty(X)$ with the additional property that for each $h \in C^\infty(X)$, the map $\delta_h(f) = \{h, f\}$ is a derivation of $C^\infty(X)$ with respect to its structure as a commutative algebra under pointwise multiplication, i.e.,

$$
\delta_h(fg) = f\delta_h(g) + \delta_h(f)g.\tag{7.1}
$$

Furthermore, like pointwise multiplication, the Poisson bracket preserves real-valuedness, i.e., if $f \in C^\infty(X,\mathbb{R})$ and $g \in C^\infty(X,\mathbb{R})$, then also $\{f, g\} \in C^\infty(X,\mathbb{R})$.

As early as 1925, Dirac noted the formal analogy between Poisson brackets of functions on phase space and commutators of operators on Hilbert space (i.e., $[a,b] = ab - ba$). Indeed, if $A$ is any C\*-algebra, the commutator is a Lie bracket on $A$, and if we use $i[a,b] = i(ab - ba)$, then also self-adjointness is preserved (in that $a^* = a$ and $b^* = b$ implies that also $i[a,b]$ is self-adjoint, which fails to be the case for the commutator itself unless it vanishes). Thus $i[-,-]$ is a Lie bracket on $A_{\mathrm{sa}}$. Moreover, if for fixed $a \in A$ we define $\delta_a(b) = i[a,b]$, then we have the product rule

$$
\delta\_a(bc) = \delta\_a(b)c + b\delta\_a(c), \tag{7.2}
$$

which makes $\delta_a : A \to A$ a derivation. A problem arises if one wishes to restrict $\delta_a$ to $A_{\mathrm{sa}}$, since this subspace is not stable under multiplication. This may be remedied by passing to the Jordan product (5.14), i.e., $a \circ b = \frac{1}{2}(ab+ba)$, which is defined on $A_{\mathrm{sa}}$. If $a^* = a$, then $\delta_a : A_{\mathrm{sa}} \to A_{\mathrm{sa}}$ satisfies the rule (7.2) also with respect to $\circ$.
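These algebraic statements can be checked on small matrices. The sketch below works in the C\*-algebra of complex $2 \times 2$ matrices, with the bracket taken to be $i(ab - ba)$; the helper names and the three self-adjoint test matrices are hypothetical choices made for illustration only. It verifies that the bracket of two self-adjoint matrices is again self-adjoint and that $\delta_a$ obeys the product rule with respect to the Jordan product.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    """Multiply every entry of the matrix a by the scalar c."""
    return [[c * x for x in row] for row in a]


def dagger(a):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def bracket(a, b):
    """Rescaled commutator i(ab - ba)."""
    return scale(1j, add(matmul(a, b), scale(-1, matmul(b, a))))


def jordan(a, b):
    """Jordan product (ab + ba) / 2."""
    return scale(0.5, add(matmul(a, b), matmul(b, a)))


# Hypothetical self-adjoint test matrices (small integer entries keep arithmetic exact).
a = [[1, 2], [2, 0]]
b = [[0, -1j], [1j, 1]]
c = [[2, 1], [1, -1]]

# Product rule for delta_a(x) = i[a, x] with respect to the Jordan product.
lhs = bracket(a, jordan(b, c))
rhs = add(jordan(bracket(a, b), c), jordan(b, bracket(a, c)))
```

Because all entries are Gaussian integers or half-integers, the two sides can be compared exactly rather than up to rounding.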

All this remains true if the bracket is rescaled by a nonzero real number. Which number this should be was suggested by Schrödinger's construction of momentum and position operators on the Hilbert space $H = L^2(\mathbb{R})$ through the substitutions

$$p \leadsto \hat{p} = \frac{\hbar}{i} \frac{d}{dx} ; \tag{7.3}$$

$$
q \leadsto \hat{q} = x,
\tag{7.4}
$$

where "$x$" is the multiplication operator $m_{\mathrm{id}}$ (with $\mathrm{id}(x) = x$), i.e., $\hat{q}\psi(x) = x\psi(x)$; for the moment we will not be bothered by the fact that these operators are unbounded; let us say they are both defined on the domain $C_c^\infty(\mathbb{R}) \subset L^2(\mathbb{R})$.

This yields the *canonical commutation relations* (which formally hold on $C_c^\infty(\mathbb{R})$):

$$\frac{i}{\hbar}[\hat{p}, \hat{q}] = 1_H.\tag{7.5}$$

Noting the Poisson brackets (in which $p, q$ are the coordinate functions on $X = \mathbb{R}^2$)

$$\{p,q\} = 1_X,\tag{7.6}$$

it is clear that the analogy should be between $\{-,-\}$ and $(i/\hbar)[-,-]$. Thus Dirac wrote:

'The strong analogy between the quantum P.B. defined by [$(i/\hbar)$ times the commutator] and the classical P.B. (...) leads us to make the assumption that the quantum P.B.'s, or at any rate the simpler ones of them, have the same values as the corresponding classical P.B.'s.'
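The canonical commutation relations (7.5) are a purely algebraic consequence of the substitutions (7.3) and (7.4), which can be illustrated mechanically. The sketch below applies the operators to polynomials, represented by coefficient lists; polynomials are not square-integrable, so they serve only as stand-ins on which differentiation and multiplication by $x$ are exact. The numerical value of `HBAR` and all function names are assumptions made for the illustration.

```python
HBAR = 0.5  # hypothetical numerical value; any nonzero real number works


def derivative(poly):
    """d/dx of a polynomial given by its coefficient list [c0, c1, c2, ...]."""
    return [k * poly[k] for k in range(1, len(poly))] or [0]


def times_x(poly):
    """Multiplication by x, i.e. the position operator of (7.4)."""
    return [0] + list(poly)


def p_hat(poly):
    """Momentum operator of (7.3): (hbar / i) d/dx."""
    return [(HBAR / 1j) * c for c in derivative(poly)]


def sub(u, v):
    """Difference of two coefficient lists, padded to equal length."""
    n = max(len(u), len(v))
    u = list(u) + [0] * (n - len(u))
    v = list(v) + [0] * (n - len(v))
    return [x - y for x, y in zip(u, v)]


def ccr_lhs(poly):
    """Apply (i / hbar) [p_hat, q_hat] to a polynomial."""
    commutator = sub(p_hat(times_x(poly)), times_x(p_hat(poly)))
    return [(1j / HBAR) * c for c in commutator]


psi = [1, -2, 0, 3]        # psi(x) = 1 - 2x + 3x^3
result = ccr_lhs(psi)      # should reproduce psi, as (7.5) asserts
```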

Combined with Heisenberg's decisive idea that quantum mechanics should be an *Umdeutung* (i.e., reinterpretation) of classical mechanics, one is led to the idea that "quantization" should be given by a linear map

$$f \mapsto \mathcal{Q}\_{\hbar}(f),\tag{7.7}$$

where $f$ is some (smooth) function on phase space $X$ and $Q_\hbar(f)$ is some operator on some "corresponding" Hilbert space, whose identification or construction is a separate problem (but for $X = \mathbb{R}^2$ it should apparently be $L^2(\mathbb{R})$), such that

$$\frac{i}{\hbar}[\mathcal{Q}\_\hbar(f), \mathcal{Q}\_\hbar(g)] = \mathcal{Q}\_\hbar(\{f, g\}),\tag{7.8}$$

at least for functions $f, g \in C^\infty(X)$ with 'the simpler' Poisson brackets. If only to do justice to Schrödinger's example (7.3) - (7.4) with (7.5), one should also require

$$\mathcal{Q}\_{\hbar}(1\_X) = 1\_H. \tag{7.9}$$

The act of quantization should also preserve the adjoint, i.e., writing $f^*(x) = \overline{f(x)}$,

$$
\mathcal{Q}\_{\hbar}(f^\*) = \mathcal{Q}\_{\hbar}(f)^\*.\tag{7.10}
$$

Putting $\hbar$ on the right-hand side of eqs. (7.5) and (7.8), Dirac (and similarly the *Dreimännerarbeit* Born–Heisenberg–Jordan) concluded from these equations that:

'*classical mechanics may be regarded as the limiting case of quantum mechanics when $\hbar$ tends to zero.*'

In the remainder of this chapter we try to do justice to this fabulous insight of Dirac's (and also of Born, Heisenberg, and Jordan, or even Planck, Einstein, and Bohr, none of whom seem to have quite appreciated the stupendous complexity of the claim).

#### 7.1 Deformation quantization

Recall Definition C.121 of a continuous bundle of C\*-algebras over some space $I$, which below is taken to be a subset of the unit interval $[0,1]$ that contains 0 as an accumulation point (so one may have e.g. $I = [0,1]$ itself, or $I = (1/\mathbb{N}) \cup \{0\}$).

Definition 7.1. *A* deformation quantization *of a Poisson manifold $X$ consists of a continuous bundle of C\*-algebras $(A, \{\varphi_\hbar : A \to A_\hbar\}_{\hbar \in I})$ over $I$, along with maps*

$$\mathcal{Q}\_{\hbar}: \tilde{A}\_0 \to A\_{\hbar} \ (\hbar \in I),\tag{7.11}$$

*where $\tilde{A}_0$ is a dense subspace of $A_0 = C_0(X)$, such that:*


$$0 \mapsto f;\tag{7.12}$$

$$
\hbar \mapsto \mathcal{Q}\_{\hbar}(f) \ (\hbar > 0); \tag{7.13}
$$

*4. For all $f, g \in \tilde{A}_0$ one has the* Dirac–Groenewold–Rieffel condition

$$\lim_{\hbar \to 0} \left\| \frac{i}{\hbar} [\mathcal{Q}_{\hbar}(f), \mathcal{Q}_{\hbar}(g)] - \mathcal{Q}_{\hbar}(\{f, g\}) \right\|_{\hbar} = 0. \tag{7.14}$$

It follows from the definition of a continuous bundle that continuity properties like

$$\lim_{\hbar \to 0} \|Q_{\hbar}(f)\| = \|f\|_{\infty};\tag{7.15}$$

$$\lim\_{\hbar \to 0} \left\| \mathcal{Q}\_{\hbar}(f)\mathcal{Q}\_{\hbar}(\mathfrak{g}) - \mathcal{Q}\_{\hbar}(fg) \right\| = 0,\tag{7.16}$$

are automatically satisfied. Let us note that condition (7.9) is absent from this definition, because $1_X \notin C_0(X)$ whenever $X$ is not compact, in which case typically also the C\*-algebras $A_\hbar$ have no unit (see below). However, the given conditions turn out to be sufficiently powerful to produce the "right" examples. We give one of the main such examples without proof (the underlying analysis is quite forbidding). We put

$$A\_0 = C\_0(T^\*\mathbb{R}^n);\tag{7.17}$$

$$A\_{\hbar} = B\_0(L^2(\mathbb{R}^n)) \ (\hbar > 0),\tag{7.18}$$

where $T^*\mathbb{R}^n \cong \mathbb{R}^{2n}$ carries the canonical Poisson structure (3.34), and $A_\hbar$ is the C\*-algebra of compact operators on the familiar Hilbert space $L^2(\mathbb{R}^n)$ of wave-functions on $\mathbb{R}^n$. For the sake of completeness we also mention that

$$A = C_r^*\left( (\mathbb{R}^n \times \mathbb{R}^n)^T \right) \tag{7.19}$$

is the (reduced) C\*-algebra of the tangent groupoid $(\mathbb{R}^n \times \mathbb{R}^n)^T$ to the pair groupoid $\mathbb{R}^n \times \mathbb{R}^n$ on $\mathbb{R}^n$, see §§C.16, C.19, where one may also find the maps $\varphi_\hbar$.

Let us summarize the situation. Continuity of the limit $\hbar \to 0$ is hard to envisage if one merely has the classical phase space $X = T^*\mathbb{R}^n$ and the quantum Hilbert space $L^2(\mathbb{R}^n)$ in mind. However, passing either to the underlying Lie groupoids $T\mathbb{R}^n$ and $\mathbb{R}^n \times \mathbb{R}^n$, which jointly comprise the smooth tangent groupoid $(\mathbb{R}^n \times \mathbb{R}^n)^T$, or to the corresponding canonically defined C\*-algebras $C_0(T^*\mathbb{R}^n)$ and $B_0(L^2(\mathbb{R}^n))$, which are glued together as a continuous bundle (7.17) - (7.19), does give rise to a satisfactory structure that makes the limit $\hbar \to 0$ "continuous".

In this example, various possibilities for the quantization maps $Q_\hbar$ arise. As explained in §C.19, the groupoid structure underlying (7.17) - (7.18) suggests Weyl's prescription (C.549), which for convenience we reproduce:

$$\mathcal{Q}_{\hbar}^{W}(f)\psi(x) = \int_{T^{*}\mathbb{R}^{n}} \frac{d^{n}p\, d^{n}y}{(2\pi\hbar)^{n}}\, e^{ip(x-y)/\hbar}\, \psi(y)\, f\left(\tfrac{1}{2}(x+y), p\right),\tag{7.20}$$

where $f$ lies in the image of $C_c^\infty(T\mathbb{R}^n)$ under the fiberwise Fourier transform (C.547). This image, then, is the space $\tilde{A}_0$ in Definition 7.1. We may rewrite (7.20) as

$$\mathcal{Q}_{\hbar}^{W}(f) = \int_{T^* \mathbb{R}^n} \frac{d^n p\, d^n q}{(2\pi\hbar)^n}\, f(q, p)\, \Omega_{\hbar}^{W}(q, p), \tag{7.21}$$

where the operators in the integrand are given by

$$
\Omega_{\hbar}^{W}(q,p)\psi(x) = 2^{n}e^{2ip(x-q)/\hbar}\psi(2q-x).\tag{7.22}
$$

The purpose of (7.21) is that for each $\psi \in L^2(\mathbb{R}^n)$ we then obviously have

$$
\langle \psi, \mathcal{Q}_{\hbar}^{W}(f)\psi \rangle = \int_{T^*\mathbb{R}^n} \frac{d^n p\, d^n q}{(2\pi)^n}\, f(q, p)\, W_{\hbar}^{\psi}(p, q), \tag{7.23}
$$

where $W_\hbar^\psi : T^*\mathbb{R}^n \to \mathbb{R}$ is the *Wigner function*, given by

$$W_{\hbar}^{\psi}(p,q) = \hbar^{-n} \langle \psi, \Omega_{\hbar}^{W}(q,p)\psi \rangle \tag{7.24}$$

$$=\int_{\mathbb{R}^n} d^n v\, e^{ipv}\, \overline{\psi(q+\tfrac{1}{2}\hbar v)}\, \psi(q-\tfrac{1}{2}\hbar v). \tag{7.25}$$

If $\|\psi\| = 1$, then $W_\hbar^\psi$ gives a "phase space portrait" of the corresponding pure state $e_\psi$ on $B_0(L^2(\mathbb{R}^n))$. However, this portrait cannot be interpreted as a probability density on $T^*\mathbb{R}^n$, since the Wigner function is not necessarily positive. This reflects a problem with Weyl's quantization map $Q_\hbar^W$ itself (at fixed $\hbar > 0$). We say that $Q_\hbar$ as introduced in (7.11) is *positive* if, for each $f \in \tilde{A}_0 \subset A_0$ (seen as a C\*-algebra),

$$f \ge 0 \Rightarrow \mathcal{Q}\_{\hbar}(f) \ge 0,\tag{7.26}$$

where positivity of $Q_\hbar(f)$ is defined in the C\*-algebra $A_\hbar$ (which in the case at hand is $B_0(L^2(\mathbb{R}^n))$). This is not the case for $Q_\hbar^W$. Moreover, $Q_\hbar^W$ fails to be continuous, and for this reason it cannot be extended to $A_0$ (at least not in the obvious way, viz. by continuity). Fortunately, both problems can be resolved by a change in $Q_\hbar$.

A strict deformation quantization of $\mathbb{R}^2$ that *is* positive exists under the name of *Berezin quantization*, denoted by $Q_\hbar^B$. However, the fundamental idea of the underlying coherent states goes back to Schrödinger. For each $(p,q) \in \mathbb{R}^2$ and $\hbar > 0$, define a unit vector $\phi_\hbar^{(p,q)} \in L^2(\mathbb{R})$, called a *coherent state*, by

$$\phi_{\hbar}^{(p,q)}(x) = (\pi\hbar)^{-n/4} e^{-ipq/2\hbar} e^{ipx/\hbar} e^{-(x-q)^{2}/2\hbar}.\tag{7.27}$$

Writing *z* = *p*+*iq*, the transition probability between two coherent states is

$$|\langle \phi\_{\hbar}^{(z)}, \phi\_{\hbar}^{(z')} \rangle|^2 = e^{-|z - z'|^2/2\hbar}.\tag{7.28}$$
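The transition probability (7.28) can be confirmed numerically in the one-dimensional case $n = 1$. The following sketch evaluates the inner product of two coherent states (7.27) by the trapezoidal rule; the integration window, the step count, the value of $\hbar$ and the two phase space points are arbitrary illustrative choices, not taken from the text.

```python
import cmath
import math


def coherent_state(p, q, hbar):
    """Return the function x -> phi_hbar^{(p,q)}(x) of (7.27) for n = 1."""
    norm = (math.pi * hbar) ** (-0.25)

    def phi(x):
        phase = cmath.exp(-1j * p * q / (2 * hbar)) * cmath.exp(1j * p * x / hbar)
        return norm * phase * math.exp(-(x - q) ** 2 / (2 * hbar))

    return phi


def inner(f, g, lo=-15.0, hi=15.0, steps=6000):
    """Trapezoidal approximation of the integral of conj(f(x)) * g(x) over [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.5 * (f(lo).conjugate() * g(lo) + f(hi).conjugate() * g(hi))
    for k in range(1, steps):
        x = lo + k * dx
        total += f(x).conjugate() * g(x)
    return total * dx


hbar = 0.5
p1, q1 = 0.3, -0.2
p2, q2 = 1.0, 0.7
phi1 = coherent_state(p1, q1, hbar)
phi2 = coherent_state(p2, q2, hbar)

numeric = abs(inner(phi1, phi2)) ** 2                      # left-hand side of (7.28)
exact = math.exp(-((p1 - p2) ** 2 + (q1 - q2) ** 2) / (2 * hbar))  # right-hand side
norm_sq = abs(inner(phi1, phi1))                           # should equal 1 (unit vector)
```

As $\hbar \to 0$ the right-hand side of (7.28) tends to zero for $z \neq z'$, so distinct coherent states become asymptotically orthogonal, in line with their role as quantum counterparts of points in phase space.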

In terms of these coherent states, we define $Q_\hbar^B : C_0(T^*\mathbb{R}^n) \to B_0(L^2(\mathbb{R}^n))$ by

$$\mathcal{Q}_{\hbar}^{B}(f) = \int_{T^* \mathbb{R}^n} \frac{d^n p\, d^n q}{(2\pi\hbar)^n} \, f(p, q)\, |\phi_{\hbar}^{(p, q)}\rangle \langle \phi_{\hbar}^{(p, q)}|,\tag{7.29}$$

where the integral is meant in the sense that for each $\psi, \varphi \in L^2(\mathbb{R}^n)$ we have

$$
\langle \varphi, \mathcal{Q}_{\hbar}^{B}(f) \psi \rangle = \int_{\mathbb{R}^{2n}} \frac{d^n p\, d^n q}{(2 \pi \hbar)^n} \, f(p, q)\, \langle \varphi, \phi_{\hbar}^{(p, q)} \rangle \langle \phi_{\hbar}^{(p, q)}, \psi \rangle. \tag{7.30}
$$

In particular, for each unit vector $\psi \in L^2(\mathbb{R}^n)$ we may write

$$
\langle \psi, Q_{\hbar}^{B}(f)\psi \rangle = \int_{T^*\mathbb{R}^n} d\mu_{\psi}\, f,\tag{7.31}
$$

where $\mu_\psi$ is the probability measure on $T^*\mathbb{R}^n$ with density

$$B_{\hbar}^{\psi}(p,q) = |\langle \phi_{\hbar}^{(p,q)}, \psi \rangle|^2,\tag{7.32}$$

called the *Husimi function* of $\psi \in L^2(\mathbb{R}^n)$; in other words, $\mu_\psi$ is given by

$$d\mu_{\psi}(p,q) = \frac{d^n p\, d^n q}{(2\pi\hbar)^n}\, B_{\hbar}^{\psi}(p,q). \tag{7.33}$$

Weyl and Berezin quantization are related in many ways, for example, by

$$Q^{B}\_{\hbar}(f) = Q^{W}\_{\hbar} \left( e^{\frac{\hbar}{4}\Delta\_{2n}} f \right), \tag{7.34}$$

where $\Delta_{2n} = \sum_{j=1}^n (\partial^2/\partial p_j^2 + \partial^2/\partial (q^j)^2)$, from which it follows that Weyl and Berezin quantization are *asymptotically equal* in the sense that for any $f \in \tilde{A}_0$,

$$\lim\_{\hbar \to 0} \left\| \mathcal{Q}\_{\hbar}^{B}(f) - \mathcal{Q}\_{\hbar}^{W}(f) \right\| = 0. \tag{7.35}$$

Indeed, this provides one way (among various others) of proving that $Q_\hbar^B$ satisfies Definition 7.1, where we note that even though $Q_\hbar^B$ is defined on all of $C_0(T^*\mathbb{R}^n)$, eq. (7.14) only holds on a suitable dense subspace thereof, such as $C_0^\infty(T^*\mathbb{R}^n)$.

#### 7.2 Quantization and internal symmetry

In the presence of symmetries, Dirac's condition (7.8) can often be met by suitable functions $f$ and $g$ related to the symmetries in question, though such functions may be unbounded. This blasts the C\*-algebraic framework, but it does so in a controlled way. We start with internal symmetries, like spin, which will be coupled to motion in the next step. Let $G$ be a Lie group with Lie algebra $\mathfrak{g}$, to which we associate:

• The "classical" *Lie–Poisson manifold* $\mathfrak{g}^*$, see (3.98), whose Poisson bracket we now preface with a minus sign, so that instead of (3.98) and (3.99) we now have

$$\{f, g\}_{-}(\theta) = -C_{ab}^{c}\, \theta_{c}\, \frac{\partial f(\theta)}{\partial \theta_{a}} \frac{\partial g(\theta)}{\partial \theta_{b}};\tag{7.36}$$

$$\{\hat{A},\hat{B}\}_- = -\widehat{[A,B]}.\tag{7.37}$$

We write $\mathfrak{g}^*_-$ for this Poisson manifold.

• The "quantum-mechanical" reduced *group(oid) C\*-algebra* $C_r^*(G)$, cf. §C.18, defined as the norm-closure of $\pi(C_c^\infty(G))$ within $B(L^2(G))$, where

$$
\pi(\check{f})\psi = \check{f} * \psi;\tag{7.38}
$$

$$\check{f} * \psi(x) = \int_G dy \,\check{f}(xy)\psi(y^{-1}),\tag{7.39}$$

where $\check{f} \in C_c^\infty(G)$ and $\psi \in L^2(G)$, cf. (C.481), and $dy$ is Haar measure on $G$ (which also provides the measure defining the Hilbert space $L^2(G)$).

We then obtain a continuous bundle of C\*-algebras, with fibers and total C\*-algebra

$$A_0 = C_r^*(\mathfrak{g});\tag{7.40}$$

$$A_{\hbar} = C_r^*(G) \ (\hbar > 0);\tag{7.41}$$

$$A = C_r^*(G^T),\tag{7.42}$$

where $\mathfrak{g}$ is seen as an abelian Lie group under addition, cf. Theorem C.123. We have

$$C_r^*(\mathfrak{g}) \cong C_0(\mathfrak{g}_-^*),\tag{7.43}$$

which isomorphism (i.e. of C\*-algebras) is given by the Fourier transform

$$f(\theta) = \int_{\mathfrak{g}} d^n A \, e^{-i\theta(A)} \check{f}(A);\tag{7.44}$$

$$\check{f}(A) = \int_{\mathfrak{g}^*} \frac{d^n \theta}{(2\pi)^n}\, e^{i\theta(A)} \, f(\theta), \tag{7.45}$$

where initially $\check{f} \in C_c^\infty(\mathfrak{g})$, and the map $\check{f} \mapsto f$ is subsequently extended to $C_r^*(\mathfrak{g})$ by continuity. Here the normalization of Lebesgue measure $d^nA$ on $\mathfrak{g}$ is arbitrary, but the normalization of $d^n\theta$ is thereby fixed. In what follows, we take a (left-invariant) Haar measure $dx$ on $G$ and fix the normalization of $d^nA$ by the condition

$$J(0) = 1\tag{7.46}$$

in the definition of the Jacobian under the exponential map $\exp : \mathfrak{g} \to G$, i.e.,

$$J(A) = \frac{d(\exp(A))}{d^n A}.\tag{7.47}$$

With $\tilde{A}_0 = C_c^\infty(\mathfrak{g})$, the quantization map $Q_\hbar : C_c^\infty(\mathfrak{g}) \to C_r^*(G)$ is then given by

$$Q\_{\hbar}(\check{f})(e^{A}) = \hbar^{-n}\check{f}(A/\hbar),\tag{7.48}$$

where $n = \dim(G)$ and we assume that $\hbar > 0$ is small enough that $\hbar$ times the support of $\check{f} \in C_c^\infty(\mathfrak{g})$ is contained in an open neighbourhood $U$ of $0 \in \mathfrak{g}$ where the exponential map is a diffeomorphism onto some open neighbourhood of $e \in G$; otherwise a cutoff function should be included. Equivalently, defining $\tilde{A}_0 \subset C_0(\mathfrak{g}^*_-)$ as the image of $C_c^\infty(\mathfrak{g})$ under the Fourier transform $\check{f} \mapsto f$ (which consists of the so-called Paley–Wiener functions on $\mathfrak{g}^*$), the map $Q_\hbar : \tilde{A}_0 \to C_r^*(G)$ is given by

$$Q\_{\hbar}(f)(e^{A}) = \int\_{\mathfrak{g}^\*} \frac{d^n \theta}{(2\pi\hbar)^n} \, e^{i\theta(A)/\hbar} \, f(\theta). \tag{7.49}$$

Although these maps satisfy (7.14), if $G$ is non-abelian there are no natural functions on $\mathfrak{g}^*$ whose quantizations satisfy the exact Dirac condition (7.8). This is a limitation of the C\*-algebraic framework, since candidate functions like

$$
\hat{A}: \mathfrak{g}^\* \to \mathbb{R}; \tag{7.50}
$$

$$
\hat{A}(\theta) = \theta(A),
\tag{7.51}
$$

whose Poisson brackets (3.99) are promising, are unbounded. However, this is easily remedied by regarding $C_r^*(G)$ as an algebra of bounded operators on the Hilbert space $L^2(G)$ (which indeed is the way it was originally defined) rather than abstractly. This "spatial" context allows the passage to the Lie algebra, as reviewed in §5.6, see especially (5.156) - (5.161). First note that (7.38) - (7.39) is a special case of (5.172), where $H = L^2(G)$ and $u = u_L$, i.e., the *left-regular representation*

$$
u_L(y)\psi(x) = \psi(y^{-1}x).\tag{7.52}
$$

In this representation, the construction (5.156) then realizes $\mathfrak{g}$ as right-invariant differential operators on the Gårding domain $D_G \subset C^\infty(G)$. By definition of $C_r^*(G)$, seen as an operator on $L^2(G)$ the function $Q_\hbar(f)$ is given in coordinates by

$$Q_{\hbar}(f) = \int_{\mathfrak{g}} d^n X \, J(X) \int_{\mathfrak{g}^*} \frac{d^n \theta}{(2\pi\hbar)^n}\, e^{i\theta(X)/\hbar} \, f(\theta)\, u_L \left( \exp\left(\sum_j X_j T_j\right) \right). \tag{7.53}$$

Here $(X_1, \dots, X_n)$ in (7.53) are coordinates on $\mathfrak{g}$ defined by a basis choice $(T_1, \dots, T_n)$, i.e., $A = \sum_i X_i T_i$. The function $\hat{T}_j$ on $\mathfrak{g}^*$ is then simply given by the coordinate function $\hat{T}_j(\theta) = \theta_j$. Now take $A \in \mathfrak{g}$ and assume that $f = \hat{A}$. This function is unbounded, but the following formal calculation is rigorously correct on the Gårding domain and may be justified by some distribution theory. For simplicity we assume that $G$ is unimodular, in which case $J(X) = 1 + O(X^2)$ as $X \to 0$, so that all first derivatives of $J$ vanish at $X = 0$. Taking $f = \hat{T}_j$ in (7.53) then gives

$$\begin{split} Q_\hbar(\hat{T}_j) &= \int_{\mathfrak{g}} d^n X\, J(X) \int_{\mathfrak{g}^*} \frac{d^n \theta}{(2\pi\hbar)^n}\, e^{i\theta(X)/\hbar} \,\theta_j\, u_L \left( \exp\left(\sum_k X_k T_k\right) \right) \\ &= -i \int_{\mathfrak{g}} d^n X\, J(\hbar X)\, u_L \left( \exp\left(\hbar \sum_k X_k T_k\right) \right) \frac{\partial}{\partial X_j} \delta(X) \\ &= i \hbar u_L'(T_j), \end{split} \tag{7.54}$$

from which we obtain

$$Q_{\hbar}(\hat{A}) = i\hbar u_L'(A) = \pi_L(A). \tag{7.55}$$

This explains the need for *minus* the Lie–Poisson bracket, since instead of (3.99) we now have (7.37), so that (5.160) gives the exact result (7.8) for $f = \hat{A}$ and $g = \hat{B}$:

$$\frac{i}{\hbar}[\mathcal{Q}_{\hbar}(\hat{A}), \mathcal{Q}_{\hbar}(\hat{B})] = \mathcal{Q}_{\hbar}(\{\hat{A}, \hat{B}\}_{-}).\tag{7.56}$$
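As a sketch of why (7.56) holds exactly, assume (this is what we take from §5.6, which is not reproduced here) that $u_L'$ respects Lie brackets on the Gårding domain, i.e. $[u_L'(A), u_L'(B)] = u_L'([A,B])$. Combining this with (7.55), and reading (7.37) as $\{\hat{A}, \hat{B}\}_- = -\widehat{[A,B]}$, one finds

$$
\frac{i}{\hbar}[Q_\hbar(\hat{A}), Q_\hbar(\hat{B})]
= \frac{i}{\hbar}(i\hbar)^2\,[u_L'(A), u_L'(B)]
= -i\hbar\, u_L'([A,B])
= -Q_\hbar(\widehat{[A,B]})
= Q_\hbar(\{\hat{A}, \hat{B}\}_-).
$$

The factor $(i\hbar)^2 = -\hbar^2$ produces exactly the sign that the minus in (7.37) absorbs.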

The minus sign in the Lie–Poisson bracket could have been avoided by writing $\check{f}(-A/\hbar)$ in (7.48), whose minus sign would have propagated into (5.159) and hence in the commutation relations (5.160), but the latter are so engrained in the physics literature that we see the minus sign on the bracket in (7.56) as the lesser evil.

Any continuous unitary representation *u*<sub>λ</sub> of *G* (where λ is some label) induces a representation *u*<sub>λ</sub> of *C*<sub>c</sub><sup>∞</sup>(*G*) by (5.173), which may be extended to a representation of *C*<sup>∗</sup>(*G*) by continuity (the same is true for *C*<sub>r</sub><sup>∗</sup>(*G*) provided *u*<sub>λ</sub> is weakly contained in *L*<sup>2</sup>(*G*), cf. §C.18). This gives operators *u*<sub>λ</sub>(*Q*<sub>ħ</sub>(*f*)) which, by the same formal computation as for the case *u* = *u*<sub>*L*</sub> above, for *A* ∈ 𝔤 rigorously give rise to operators

$$
\pi\_{\lambda}(A) = i\hbar u\_{\lambda}'(A),\tag{7.57}
$$

satisfying the like of (5.160) for fixed values of ħ (but without control over the limit ħ → 0). Many commutation relations in quantum mechanics take this form, where both irreducible and reducible representations *u* give rise to interesting examples. The reducible case typically comes from group actions and is best studied using the formalism of action groupoids reviewed in the next section, where we will see that further operators start playing a role. The irreducible case, on the other hand, gives rise to intriguing new examples of continuous bundles of C\*-algebras, where ħ (now related to the label λ) takes values in a discrete set and may be sent to zero, cf. §8.1.
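As a finite-dimensional illustration of relations of the type (7.57), the following sketch checks numerically that the operators π(*A*) = *i*ħ*u*′(*A*) satisfy [π(*A*), π(*B*)] = *i*ħπ([*A*, *B*]). It assumes the defining representation of *so*(3) on ℂ<sup>3</sup>, a basis with [*J*<sub>*i*</sub>, *J*<sub>*j*</sub>] = ε<sub>*ijk*</sub>*J*<sub>*k*</sub>, and an arbitrary numerical value of ħ; none of these choices is taken from the text.

```python
# Check [pi(A), pi(B)] = i*hbar*pi([A, B]) for pi(A) = i*hbar*u'(A),
# in the defining representation of so(3). Plain Python, no external libraries.

HBAR = 0.7  # arbitrary positive value (assumption); the relation holds for any fixed hbar

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

# Basis of so(3): (J_i)_{jk} = -eps_{ijk}, so that [J_i, J_j] = eps_{ijk} J_k.
J = [[[-eps(i, j, k) for k in range(3)] for j in range(3)] for i in range(3)]

def matmul(a, b):
    return [[sum(a[r][m] * b[m][c] for m in range(3)) for c in range(3)]
            for r in range(3)]

def scale(s, a):
    return [[s * x for x in row] for row in a]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(3)] for r in range(3)]

def pi(i):
    """pi(J_i) = i*hbar*u'(J_i); here u'(J_i) = J_i (defining representation)."""
    return scale(1j * HBAR, J[i])

def defect(i, j):
    """Largest entry of [pi(J_i), pi(J_j)] - i*hbar*sum_k eps_{ijk} pi(J_k)."""
    lhs = commutator(pi(i), pi(j))
    rhs = [[0j] * 3 for _ in range(3)]
    for k in range(3):
        term = scale(1j * HBAR * eps(i, j, k), pi(k))
        rhs = [[rhs[r][c] + term[r][c] for c in range(3)] for r in range(3)]
    return max(abs(lhs[r][c] - rhs[r][c]) for r in range(3) for c in range(3))

worst = max(defect(i, j) for i in range(3) for j in range(3))
print("largest defect:", worst)
```

The factor *i*ħ in the definition of π is what turns the real structure constants of the Lie algebra into the physicists' commutation relations with an explicit *i*ħ.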

#### 7.3 Quantization and external symmetry

We now generalize the setting of the preceding section from groups taken by themselves to group actions. Let a Lie group *G* act smoothly on some manifold *Q*; for example, we may have *Q* = ℝ<sup>3</sup> with either *G* = *SO*(3) acting by rotations, or *G* = ℝ<sup>3</sup> acting by translations. We now take *X* = 𝔤<sup>∗</sup> × *Q*. Recalling the notation (3.71) and writing δ<sub>*a*</sub> ≡ δ<sub>*T*<sub>*a*</sub></sub>, we define the *action Poisson bracket*

$$\{f, g\} = -C\_{ab}^{c}\, \theta\_{c} \frac{\partial f}{\partial \theta\_{a}} \frac{\partial g}{\partial \theta\_{b}} + \delta\_{a} f \frac{\partial g}{\partial \theta\_{a}} - \frac{\partial f}{\partial \theta\_{a}} \delta\_{a} g. \tag{7.58}$$

Interesting special cases arise if we take *A* ∈ 𝔤 and define *Â* ∈ *C*<sup>∞</sup>(𝔤<sup>∗</sup>) as before, i.e., *Â*(θ) = θ(*A*), now regarded as a function on 𝔤<sup>∗</sup> × *Q* (ignoring the *second* argument *q*). Similarly, if *f̃* ∈ *C*<sup>∞</sup>(*Q*) we write *f̂* for the corresponding function on 𝔤<sup>∗</sup> × *Q* (ignoring the *first* argument θ). This gives the coordinate-independent expressions

$$\{\hat{A},\hat{B}\} = -\widehat{[A,B]};\tag{7.59}$$

$$\{\hat{A},\hat{f}\} = -\widehat{\delta\_{A}\tilde{f}};\tag{7.60}$$

$$\{\hat{f}, \hat{g}\} = 0.\tag{7.61}$$

Clearly, if *Q* is a point (with trivial *G*-action) we recover (minus) the Lie–Poisson structure on 𝔤<sup>∗</sup>. If, on the other hand, *Q* = ℝ<sup>3</sup> and *G* = ℝ<sup>3</sup> acts on *Q* by translation, i.e., *a* · *x* = *x* + *a*, we recover the canonical Poisson bracket (3.34), where the momenta *p*<sub>*a*</sub> (*a* = 1, ..., *n*) are identified with the coordinates θ<sub>*a*</sub> on the dual of the Lie algebra of ℝ<sup>3</sup>, which is just ℝ<sup>3</sup> itself (with the usual basis (*e*<sub>1</sub>, *e*<sub>2</sub>, *e*<sub>3</sub>)). Therefore, the Poisson bracket (3.34) on ℝ<sup>2*n*</sup> may be generalized in two ways:


A richer structure emerges if we keep *Q* = R<sup>3</sup> but now take *G* = *E*(3), i.e.,

$$E(3) = SO(3) \ltimes \mathbb{R}^3,\tag{7.62}$$

known as the *Euclidean group*. To explain its group structure, let some group *L* act on a vector space *V*, seen as an abelian group under addition. Then the operations

$$(\lambda, v) \cdot (\lambda', v') = (\lambda\lambda', v + \lambda \cdot v');\tag{7.63}$$

$$(\lambda, v)^{-1} = (\lambda^{-1}, -\lambda^{-1} \cdot v),\tag{7.64}$$

turn *G* = *L* ⋉ *V* into a group, called the *semi-direct product* of *L* and *V*.
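The group law (7.63) and the inverse (7.64) can be tried out for *E*(3) in a few lines. The sketch below represents an element as a pair (*R*, *a*) of a rotation matrix and a translation vector; the helper names and the sample elements are illustrative choices, not notation from the text.

```python
import math

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]

def multiply(g, h):
    """(7.63): (R, a)(S, b) = (RS, a + R b)."""
    (R, a), (S, b) = g, h
    Rb = mat_vec(R, b)
    return (mat_mul(R, S), [a[i] + Rb[i] for i in range(3)])

def inverse(g):
    """(7.64): (R, a)^{-1} = (R^{-1}, -R^{-1} a); R^{-1} is the transpose of R."""
    R, a = g
    Rinv = transpose(R)
    return (Rinv, [-x for x in mat_vec(Rinv, a)])

def act(g, q):
    """Natural action of E(3) on R^3: (R, a) . q = R q + a."""
    R, a = g
    Rq = mat_vec(R, q)
    return [Rq[i] + a[i] for i in range(3)]

def distance(g, h):
    """Largest entrywise difference between two group elements."""
    (R, a), (S, b) = g, h
    dm = max(abs(R[i][j] - S[i][j]) for i in range(3) for j in range(3))
    dv = max(abs(a[i] - b[i]) for i in range(3))
    return max(dm, dv)

IDENTITY = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0, 0.0])
g1 = (rot_z(0.3), [1.0, -2.0, 0.5])
g2 = (mat_mul(rot_x(1.1), rot_z(-0.4)), [0.2, 0.0, 3.0])
g3 = (rot_x(-0.8), [-1.0, 4.0, 2.0])

print(distance(multiply(g1, inverse(g1)), IDENTITY))  # close to zero
```

The twist *v* + λ · *v*′ in (7.63) is exactly what makes `act` a group action, i.e., `act(multiply(g, h), q)` agrees with `act(g, act(h, q))`.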

Then *E*(3) acts on ℝ<sup>3</sup> in the obvious way, giving rise to the Poisson manifold 𝔤<sup>∗</sup> × *Q* = ℝ<sup>3</sup> × ℝ<sup>3</sup> × ℝ<sup>3</sup> (since *so*(3) ≅ ℝ<sup>3</sup>). We now also have generators (*J*<sub>1</sub>, *J*<sub>2</sub>, *J*<sub>3</sub>) of the Lie algebra of *SO*(3), with corresponding functions *Ĵ*<sub>*i*</sub>, as well as standard coordinate functions (*q*<sub>1</sub>, *q*<sub>2</sub>, *q*<sub>3</sub>) on *Q* = ℝ<sup>3</sup>, giving rise to the Poisson brackets

$$\{\hat{J}\_i, \hat{J}\_j\} = -\varepsilon\_{ijk}\hat{J}\_k; \ \{\hat{J}\_i, p\_j\} = -\varepsilon\_{ijk}p\_k; \ \{p\_i, p\_j\} = 0;\tag{7.65}$$

$$\{\hat{J}\_i, q\_j\} = -\varepsilon\_{ijk} q\_k; \ \{p\_i, q\_j\} = \delta\_{ij}; \ \{q\_i, q\_j\} = 0. \tag{7.66}$$

The appropriate target C\*-algebra *C*<sub>r</sub><sup>∗</sup>(*G*, *Q*) for quantization is a generalization of *C*<sub>r</sub><sup>∗</sup>(*G*), constructed in a similar way, as explained in §C.18. For the moment it is enough to know that *C*<sub>r</sub><sup>∗</sup>(*G*, *Q*) is the completion of the function space *C*<sub>c</sub><sup>∞</sup>(*G* × *Q*), seen as a ∗-algebra in the operations (C.526) - (C.527), in a suitable norm, namely

$$\|f\|\_{r} = \|\rho(f)\|,\tag{7.67}$$

where the representation ρ : *C*<sub>c</sub><sup>∞</sup>(*G* × *Q*) → *B*(*L*<sup>2</sup>(*G* × *Q*)) is given by (C.530). In case that *Q* has a *G*-invariant measure ν (still with support *Q*), the operator

$$w : L^2(G \times Q) \to L^2(G \times Q);\tag{7.68}$$

$$
w \psi(x, q) = \psi(x, x^{-1} q), \tag{7.69}
$$

is unitary, and in terms of the notation

$$\tilde{u}(y) = w u(y) w^\*,\ \tilde{\pi}(\tilde{f}) = w \pi(\tilde{f}) w^\*,\ \tilde{\rho}(f) = w \rho(f) w^\*,\tag{7.70}$$

the formulae (C.528) - (C.530) take the slightly more appealing form

$$
\tilde{u}(y)\psi(x,q) = \psi(y^{-1}x, y^{-1}q);\tag{7.71}
$$

$$
\tilde{\pi}(\tilde{f})\psi(x,q) = \tilde{f}(q)\psi(x,q);\tag{7.72}
$$

$$
\tilde{\rho}(f)\psi(x,q) = \int\_G dy\, f(y,q)\psi(y^{-1}x,y^{-1}q).\tag{7.73}
$$

The simplification thus gained especially concerns the position functions (7.72).

Analogously to (7.49), the quantization maps are given by

$$Q\_{\hbar} \;:\; C\_0(\mathfrak{g}^\* \times Q) \to C\_r^\*(G, Q);\tag{7.74}$$

$$Q\_{\hbar}(f)(e^{A}, q) = \int\_{\mathfrak{g}^\*} \frac{d^n \theta}{(2\pi\hbar)^n} \, e^{i\theta(A)/\hbar} \, f(\theta, e^{-\frac{1}{2}A} \cdot q),\tag{7.75}$$

where, as in the pure group case, strictly speaking *f* must lie in the dense subspace of *C*<sub>0</sub>(𝔤<sup>∗</sup> × *Q*) consisting of Paley–Wiener functions (in *A*) that are the Fourier transform (in the first argument) of functions that lie in *C*<sub>c</sub><sup>∞</sup>(𝔤 × *Q*).

Computations similar to (7.54) then establish, for *A* ∈ 𝔤 and *f̃* ∈ *C*<sup>∞</sup>(*Q*) as before,

$$Q\_{\hbar}(\hat{A}) = i\hbar \tilde{u}'(A);\tag{7.76}$$

$$Q\_{\hbar}(\hat{f}) = \tilde{\pi}(\tilde{f}).\tag{7.77}$$

From these formulae and (7.59) - (7.60), it is easy to verify that Dirac's exact condition (7.8) holds in the following special cases:

$$\frac{i}{\hbar}[Q\_{\hbar}(\hat{A}), Q\_{\hbar}(\hat{B})] = Q\_{\hbar}(\{\hat{A}, \hat{B}\});\tag{7.78}$$

$$\frac{i}{\hbar}[Q\_\hbar(\hat{A}), Q\_\hbar(\hat{f})] = Q\_\hbar(\{\hat{A}, \hat{f}\});\tag{7.79}$$

$$\frac{i}{\hbar}[Q\_{\hbar}(\hat{f}), Q\_{\hbar}(\hat{g})] = Q\_{\hbar}(\{\hat{f}, \hat{g}\}) = 0. \tag{7.80}$$

These might be regarded as infinitesimal versions of the covariance condition (C.514), specialized to the case at hand. We formalize this special case as follows.

Definition 7.2. *Let G be a locally compact group and let Q be a space equipped with some continuous G-action. A* system of imprimitivity (*u*(*G*), π(*C*<sub>0</sub>(*Q*))) *for the given group action G* ↻ *Q is a combination of a strongly continuous unitary representation u of G and a nondegenerate representation* π *of C*<sub>0</sub>(*Q*)*, both defined on the same Hilbert space, that for each x* ∈ *G and f̃* ∈ *C*<sub>0</sub>(*Q*) *satisfies*

$$
u(x)\pi(\tilde{f})u(x)^\* = \pi(L\_x \tilde{f}).\tag{7.81}
$$

Here *L*<sub>*x*</sub>*f̃*(*q*) = *f̃*(*x*<sup>−1</sup>*q*), as usual. We recall from §C.18 that such systems of imprimitivity bijectively correspond to non-degenerate representations ρ ≡ π ⋊ *u* of *C*<sup>∗</sup>(*G*, *Q*) through (C.515), which in the special case (C.524) - (C.525) comes down to

$$
\rho(f) = \int\_G dx \,\pi(f(x, \cdot)) u(x).\tag{7.82}
$$

The formulae (7.71) - (7.73) define such a system of imprimitivity on the Hilbert space *H* = *L*<sup>2</sup>(*G* × *Q*). However, this cannot be the end result of quantization, since this space is typically reducible under the pair (*u*(*G*), π(*C*<sub>0</sub>(*Q*))), or, equivalently, under ρ(*C*<sup>∗</sup>(*G*, *Q*)). For example, this is the case for *G* = ℝ<sup>3</sup> or *G* = *E*(3) acting on *Q* = ℝ<sup>3</sup> in the natural way discussed above, for which we obtain *H* = *L*<sup>2</sup>(ℝ<sup>3</sup> × ℝ<sup>3</sup>) or even *H* = *L*<sup>2</sup>(*E*(3) × ℝ<sup>3</sup>). In the former case we do obtain the correct position operators *q*<sup>*i*</sup>, but for the momentum operators we find the curious expression −*i*ħ(∂/∂*x*<sup>*i*</sup> + ∂/∂*q*<sup>*i*</sup>); to their credit, these do satisfy the canonical commutation relations (7.5), since these follow from (7.78) - (7.80), which in turn follow from the covariance condition (7.81) defining a system of imprimitivity.
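The curious momentum operators can be traced by specializing (7.71) and (7.76) to *G* = ℝ<sup>3</sup> acting on *Q* = ℝ<sup>3</sup> by translation. The following sketch assumes that *u*′(*A*) denotes the derivative of *t* ↦ *u*(exp(*tA*)) at *t* = 0 (presumably the convention fixed in (5.156)) and writes *e*<sub>*i*</sub> for the standard basis of the Lie algebra ℝ<sup>3</sup>:

$$\begin{split} \tilde{u}(y)\psi(x,q) &= \psi(x-y,\, q-y); \\ \tilde{u}'(e\_i)\psi(x,q) &= \frac{d}{dt}\psi(x - t e\_i,\, q - t e\_i)\Big|\_{t=0} = -\left(\frac{\partial}{\partial x^i} + \frac{\partial}{\partial q^i}\right)\psi(x,q); \\ Q\_\hbar(\hat{e}\_i) &= i\hbar\,\tilde{u}'(e\_i) = -i\hbar\left(\frac{\partial}{\partial x^i} + \frac{\partial}{\partial q^i}\right). \end{split}$$

Since the position operators (7.72) act by multiplication with *q*<sup>*j*</sup> only, the term ∂/∂*x*<sup>*i*</sup> drops out of the commutator, which leaves [*q*<sup>*j*</sup>, *Q*<sub>ħ</sub>(*ê*<sub>*i*</sub>)] = *i*ħδ<sub>*ij*</sub> as in the usual Schrödinger representation.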

Instead, we would prefer the Hilbert space *H* = *L*2(R3) expected from elementary quantum mechanics (without spin), equipped with the system of imprimitivity

$$
u(y)\psi(q) = \psi(y^{-1}q);\tag{7.83}
$$

$$
\pi(\tilde{f})\psi(q) = \tilde{f}(q)\psi(q). \tag{7.84}
$$

The answer lies in the search for *irreducible* systems of imprimitivity (*u*(*G*), π(*C*<sub>0</sub>(*Q*))), or, equivalently, *irreducible* representations ρ(*C*<sup>∗</sup>(*G*, *Q*)); see §7.5.

#### 7.4 Intermezzo: The Big Picture

First, however, we summarize and generalize the results in this chapter so far into what we call *The Big Picture*. This arose in the 1990s from efforts to relate Mackey's quantization theory based on systems of imprimitivity (which Mackey himself saw as the natural implementation of what he called *Weyl's Program*, i.e. the construction of the basic operators of quantum mechanics from group-theoretical considerations) to deformation quantization (and hence to the tradition started by Dirac, as continued by Groenewold, Moyal, Berezin, Flato, Rieffel, and others).

The Big Picture is technically based on the theory of *Lie groupoids* (already alluded to in the preceding sections) and *Lie algebroids*. For a precise definition of the former we refer to Definition C.115; briefly, a *groupoid* Γ is an object like a group, where however *multiplication* is defined only partially (although the *inverse* is defined for each element). To see which elements can be multiplied, one has maps *s*, *t* : Γ<sub>1</sub> → Γ<sub>0</sub> from the *total space* Γ<sub>1</sub> of the groupoid to its *base space* Γ<sub>0</sub>, such that the product *xy* ∈ Γ<sub>1</sub> of *x*, *y* ∈ Γ<sub>1</sub> is defined whenever *s*(*x*) = *t*(*y*), and satisfies *s*(*xy*) = *s*(*y*), *t*(*xy*) = *t*(*x*), and *s*(*x*<sup>−1</sup>) = *t*(*x*). Four relevant examples are:


$$
\Gamma\_1 = G \times Q, \; \Gamma\_0 = Q, \; s(x, q) = x^{-1} q, \; t(x, q) = q,\tag{7.85}
$$

so that products (*x*, *q*)(*y*, *q*′) are defined iff *q*′ = *x*<sup>−1</sup>*q*, with result

$$(x, q)(y, x^{-1} q) = (xy, q). \tag{7.86}$$

Finally, the inverse is (necessarily) given by

$$(x, q)^{-1} = (x^{-1}, x^{-1} q). \tag{7.87}$$
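The structure maps (7.85) - (7.87) of the action groupoid can be exercised on a finite toy model. The sketch below assumes *G* = ℤ<sub>3</sub> acting on *Q* = ℤ<sub>6</sub> by *x* · *q* = *q* + 2*x* (mod 6), an example chosen only because it has two orbits and is small enough to enumerate; it is not taken from the text.

```python
N_GROUP, N_SPACE = 3, 6   # G = Z_3 acting on Q = Z_6 (toy choice, an assumption)

def act(x, q):
    """Action of x in Z_3 on q in Z_6: x . q = q + 2x (mod 6)."""
    return (q + 2 * x) % N_SPACE

def g_inv(x):
    """Inverse in the additive group Z_3."""
    return (-x) % N_GROUP

def source(arrow):
    x, q = arrow
    return act(g_inv(x), q)          # s(x, q) = x^{-1} q, cf. (7.85)

def target(arrow):
    return arrow[1]                  # t(x, q) = q, cf. (7.85)

def compose(a, b):
    """(7.86): (x, q)(y, x^{-1} q) = (xy, q), defined only if s(a) = t(b)."""
    if source(a) != target(b):
        raise ValueError("arrows are not composable")
    (x, q), (y, _) = a, b
    return ((x + y) % N_GROUP, q)

def inverse(arrow):
    """(7.87): (x, q)^{-1} = (x^{-1}, x^{-1} q)."""
    x, q = arrow
    return (g_inv(x), act(g_inv(x), q))

ARROWS = [(x, q) for x in range(N_GROUP) for q in range(N_SPACE)]
COMPOSABLE = [(a, b) for a in ARROWS for b in ARROWS if source(a) == target(b)]

print(len(ARROWS), "arrows,", len(COMPOSABLE), "composable pairs")
```

Only a fraction of all pairs of arrows is composable, which is the partially defined multiplication that distinguishes a groupoid from a group.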

A *Lie groupoid* is a groupoid Γ where Γ<sub>1</sub> and Γ<sub>0</sub> are manifolds and all operations are smooth. In all examples just given this requires *Q* to be a manifold, and in the last one *G* should be a Lie group, and the given action *G* × *Q* → *Q* must be smooth.

Generalizing the construction of a Lie algebra 𝔤 from a given Lie group *G*, a Lie groupoid comes with an associated linearized (or "infinitesimal") structure, called a *Lie algebroid*. As in the group case, this differential-geometric notion can also be defined independently of its origin in the theory of Lie groupoids, as follows:

Definition 7.3. *A* Lie algebroid *E over a manifold Q is a vector bundle* π : *E* → *Q with a vector bundle map* α : *E* → *TQ (called the* anchor*), as well as with a Lie bracket* [ , ] *on the space C*<sup>∞</sup>(*Q*, *E*) *of smooth cross-sections of E, satisfying the Leibniz rule*

$$[\sigma\_1, f \cdot \sigma\_2] = f \cdot [\sigma\_1, \sigma\_2] + (\alpha \circ \sigma\_1 f) \cdot \sigma\_2 \tag{7.88}$$

*for all* σ<sub>1</sub>, σ<sub>2</sub> ∈ *C*<sup>∞</sup>(*Q*, *E*) *and f* ∈ *C*<sup>∞</sup>(*Q*) *(here* α ∘ σ<sub>1</sub> *is a vector field on Q).*

It follows that the map σ ↦ α ∘ σ : *C*<sup>∞</sup>(*Q*, *E*) → *C*<sup>∞</sup>(*Q*, *TQ*) induced by the anchor is a homomorphism of Lie algebras, where the latter is equipped with the usual commutator of vector fields (this homomorphism property used to be part of the definition of a Lie algebroid, but in fact it follows from the stated definition).

Lie algebroids generalize (finite-dimensional) Lie algebras as well as tangent bundles, and the (infinite-dimensional) Lie algebra *C*∞(*Q*,*E*) could be said to be of geometric origin in the sense that it derives from an underlying finite-dimensional geometrical object. Similar to the above list of examples of Lie groupoids, one has the following basic classes of Lie algebroids.


$$[\sigma\_1, \sigma\_2](q) = [\sigma\_1(q), \sigma\_2(q)]\_{\mathfrak{g}} + \delta\_{\sigma\_2} \sigma\_1(q) - \delta\_{\sigma\_1} \sigma\_2(q). \tag{7.89}$$

These examples may also be recovered as special cases of the following construction that canonically associates a Lie algebroid Lie(Γ) to a Lie groupoid Γ: as a vector bundle, Lie(Γ) is the restriction of ker(*t*<sub>∗</sub>) to Γ<sub>0</sub> (where *t*<sub>∗</sub> : *T*Γ<sub>1</sub> → *T*Γ<sub>0</sub> is the derivative map of the target projection *t* : Γ<sub>1</sub> → Γ<sub>0</sub>), and the anchor is α = *s*<sub>∗</sub> (one may alternatively define Lie(Γ) as the normal bundle to the object inclusion map *i* : Γ<sub>0</sub> → Γ<sub>1</sub>, cf. Definition C.115, but this makes the definition of the anchor a bit more complicated). As in the Lie group case, one may identify sections of Lie(Γ) with left-invariant vector fields on Γ, and under this identification the Lie bracket on *C*<sup>∞</sup>(Γ<sub>0</sub>, Lie(Γ)) is by definition given by the commutator of vector fields.

Conversely, one may ask whether a given Lie algebroid *E* is *integrable*, in that *E* ≅ Lie(Γ) for some Lie groupoid Γ (where the isomorphism sign ≅ means that a pertinent vector bundle isomorphism *E* ≅ ker(*t*<sub>∗</sub>)|<sub>Γ<sub>0</sub></sub> should preserve all relevant structure). Unlike the special case of Lie groups (where Lie's Third Theorem 5.41 settles this in the positive), this is not necessarily the case, but that is of no concern. We now state a crucial connection between Lie algebroids and Poisson geometry.

Proposition 7.4. *The dual vector bundle E*<sup>∗</sup> *of a Lie algebroid E is a Poisson manifold, whose Poisson bracket on C*<sup>∞</sup>(*E*<sup>∗</sup>) *is defined by the following special cases:*

$$\{f, g\} = 0 \quad (f, g \in C^\infty(Q));\tag{7.90}$$

$$\{\tilde{\sigma}, f\} = -\alpha \circ \sigma f \quad (\sigma \in C^\infty(Q, E),\ f \in C^\infty(Q));\tag{7.91}$$

$$\{\tilde{\sigma}\_1, \tilde{\sigma}\_2\} = -\widetilde{[\sigma\_1, \sigma\_2]},\tag{7.92}$$

*where* σ̃ ∈ *C*<sup>∞</sup>(*E*<sup>∗</sup>) *is defined by a given section* σ *of E through the obvious pairing.*

*Conversely, if the dual F*<sup>∗</sup> *to a given vector bundle F* → *Q is a Poisson manifold such that the Poisson bracket of two linear functions is linear, then F* ≅ *E for some Lie algebroid E over Q, with the above Poisson structure on E*<sup>∗</sup>*.*

Following our earlier lists, the main examples are:


The following theorem displays a rich and physically relevant class of examples of Definition 7.1 of deformation quantization. The key point is that a Lie groupoid Γ defines both classical and quantum data, namely the (reduced) Lie groupoid C\*-algebra *C*<sup>∗</sup><sub>(r)</sub>(Γ) (cf. §C.17) and the Poisson manifold Lie(Γ)<sup>∗</sup> (cf. Proposition 7.4), and these are continuously (even smoothly) related through the tangent groupoid Γ<sup>*T*</sup> (cf. Proposition C.117) and its associated Lie groupoid C\*-algebra *C*<sup>∗</sup><sub>(r)</sub>(Γ<sup>*T*</sup>).

Theorem 7.5. *For any Lie groupoid* Γ*, the bundle of C\*-algebras given by*

$$A\_0 = C\_0(\operatorname{Lie}(\Gamma)^\*) \qquad (\hbar = 0);\tag{7.93}$$

$$A\_{\hbar} = C^\*(\Gamma) \qquad (0 < \hbar \le 1);\tag{7.94}$$

$$A = C^\*(\Gamma^T),\tag{7.95}$$

*defines a deformation quantization of the Poisson manifold* Lie(Γ)<sup>∗</sup> *over I* = [0, 1]*. The same statement holds for the corresponding* reduced *groupoid C\*-algebras.*

The key lemma for this theorem is Theorem C.123, which provides the continuity of the given bundle of C\*-algebras. A lengthy computation shows that also the Dirac–Groenewold–Rieffel condition (7.14) is met. In this light, the quantization of the phase space *T*<sup>∗</sup>ℝ<sup>*n*</sup> in §7.1 then corresponds to the pair groupoid Γ = ℝ<sup>*n*</sup> × ℝ<sup>*n*</sup> on ℝ<sup>*n*</sup>, the one in §7.2 follows from the special case where the Lie groupoid Γ is "simply" a Lie group, and the case of §7.3, which puts Mackey's quantization theory in a deformation framework, is obviously given by the action groupoid *G* ⋉ *Q*. Finally, the space groupoid Γ<sub>0</sub> = Γ<sub>1</sub> = *Q* gives a trivial continuous bundle of C\*-algebras, where *A*<sub>ħ</sub> = *C*<sub>0</sub>(*Q*) for all ħ ∈ [0, 1], and *Q* carries the zero Poisson bracket.

#### 7.5 Induced representations and the imprimitivity theorem

Returning to §7.3, we recall the bijective correspondence between systems of imprimitivity (*u*(*G*), π(*C*<sub>0</sub>(*Q*))) and non-degenerate representations of the C\*-algebra *C*<sup>∗</sup>(*G*, *Q*) of the action groupoid defined by the given action *G* ↻ *Q*. This correspondence preserves irreducibility, and our task is to find irreducible representations.

It was recognized at least 50 years ago that this task can be carried out if the group action satisfies a certain regularity condition, and is hopeless otherwise. This is sometimes called the *Mackey–Glimm dichotomy*. The condition in question may be stated in a number of equivalent ways (whose equivalence is not at all obvious).

First, we recall some terminology from topology. Let *X* be a space. One calls *Y*′ ⊂ *Y* ⊆ *X* *relatively open* in *Y* if there is an open set *U* ⊂ *X* such that *Y*′ = *Y* ∩ *U*. A subset *Y* ⊂ *X* is *locally closed* if each *y* ∈ *Y* has an open neighbourhood *U* in *X* such that *U* ∩ *Y* is closed in *U*, and finally "*X* is *T*<sub>0</sub>" if for any two distinct points there is an open set that contains exactly one of them. Furthermore, each *q* ∈ *Q* defines a *G*-orbit through *q* denoted by *G* · *q*, as well as a stabilizer (or "little group")

$$G\_q = \{ x \in G \mid x \cdot q = q \}. \tag{7.96}$$

For any subgroup *H* ⊂ *G*, we denote the equivalence class of *x* in *G*/*H* by [*x*].

Definition 7.6. *A smooth action of a Lie group G on a manifold Q is called* regular *if one and hence each of the following equivalent conditions is satisfied:*


Probably the simplest example of a non-regular action is the action ℤ ↻ 𝕋 given by

$$n: z \mapsto e^{2\pi i n\theta} z,\tag{7.97}$$

where θ ∈ R\Q (here Z may be seen as a zero-dimensional Lie group with infinitely many components—in fact, Definition 7.6 more generally applies to second countable locally compact groups and spaces that are "almost Hausdorff"). Indeed, each orbit is dense in T (but not open), and the orbit space T/Z has no proper open sets.
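The density of the orbits, which is what destroys regularity here, is easy to observe numerically. The sketch below assumes θ = √2 as the irrational rotation number and a cut-off of 2000 iterates; both are arbitrary choices for illustration.

```python
import cmath
import math

THETA = math.sqrt(2)   # an irrational rotation number (assumption for the demo)

def act(n, z):
    """The Z-action (7.97): n . z = exp(2 pi i n theta) z."""
    return cmath.exp(2j * math.pi * n * THETA) * z

def closest_orbit_point(z0, target, n_max):
    """Distance from target to the nearest of the points n . z0 with 0 <= n < n_max."""
    return min(abs(act(n, z0) - target) for n in range(n_max))

# Sample targets on the unit circle; the orbit of z0 = 1 comes close to each of them.
targets = [cmath.exp(2j * math.pi * k / 7) for k in range(7)]
for w in targets:
    print(round(closest_orbit_point(1.0, w, 2000), 5))
```

No orbit point ever returns exactly to its starting point, so each orbit is an infinite dense subset of the circle that is neither open nor locally closed.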

Theorem 7.7. *Let a group action G* ↻ *Q be regular. Then the irreducible representations of the associated action groupoid C\*-algebra C*<sup>∗</sup>(*G*, *Q*) *(and hence also the irreducible systems of imprimitivity* (*u*(*G*), π(*C*<sub>0</sub>(*Q*)))*) are classified up to unitary equivalence by pairs* (O, *u*<sub>χ</sub>)*, where* O *is a G-orbit in Q and u*<sub>χ</sub> *is an irreducible representation of the stabilizer G*<sub>*q*</sub> *of an arbitrary point q* ∈ O*, with an explicit construction of the corresponding representation* ρ<sup>(O, *u*<sub>χ</sub>)</sup>(*C*<sup>∗</sup>(*G*, *Q*))*. Two such representations* ρ<sup>(O, *u*<sub>χ</sub>)</sup> *and* ρ<sup>(O′, *u*′<sub>χ′</sub>)</sup> *are equivalent iff* O = O′ *and, given that q*′ = *xq and hence G*<sub>*q*′</sub> = *xG*<sub>*q*</sub>*x*<sup>−1</sup> *for some x* ∈ *G, u*′<sub>χ′</sub> *is unitarily equivalent to u*<sub>χ</sub> ∘ Ad(*x*)*. Finally, any irreducible representation* ρ *is unitarily equivalent to some* ρ<sup>(O, *u*<sub>χ</sub>)</sup>*.*

In the simplest case, *Q* is equal to a point, so that *C*<sup>∗</sup>(*G*, *Q*) = *C*<sup>∗</sup>(*G*), and we find that irreducible representations of *C*<sup>∗</sup>(*G*) (which are necessarily non-degenerate) bijectively correspond to unitary irreducible representations of *G*. In the next easiest case, *G* acts nontrivially but still transitively on *Q*, in which case the action is clearly regular and *Q* ≅ *G*/*H* through the *G*-equivariant map in no. 4 of the above definition (read in the opposite direction), i.e., we pick some *q*<sub>0</sub> ∈ *Q*, define *H* = *G*<sub>*q*<sub>0</sub></sub>, and finally map *Q* to *G*/*H* by *q* ↦ [*x*], where *q* = *xq*<sub>0</sub> (this map is well defined); in that case, we might as well have assumed that *Q* = *G*/*H* to begin with. The following important corollary of Theorem 7.7 is called the *Imprimitivity Theorem*.

Corollary 7.8. *Up to unitary equivalence, irreducible representations of C*<sup>∗</sup>(*G*, *G*/*H*) *(or, equivalently, of pairs* (π(*C*<sub>0</sub>(*G*/*H*)), *u*(*G*)) *satisfying the covariance condition* (7.81)*) bijectively correspond to unitary irreducible representations of H.*

In preparation for the general case stated in Theorem 7.7, and also as a goal in itself, we first give an explicit construction of the irreducible representation ρ<sup>χ</sup> of *C*<sup>∗</sup>(*G*, *G*/*H*) corresponding to a given unitary irreducible representation *u*<sub>χ</sub>(*H*). Here we label the unitary irreducible representations of *H* (up to unitary equivalence) by χ ∈ *Ĥ* (where *Ĥ* is the set of unitary equivalence classes of unitary irreducible representations of *H*, cf. §C.15 for the abelian case), and let the corresponding representation ρ<sup>χ</sup>(*C*<sup>∗</sup>(*G*, *G*/*H*)), or the pair π<sup>χ</sup>(*C*<sub>0</sub>(*G*/*H*)) and *u*<sup>χ</sup>(*G*), inherit this label (in raised form, in order to prevent confusion between *u*<sub>χ</sub>(*H*) and *u*<sup>χ</sup>(*G*)|<sub>*H*</sub>).

The construction of ρ<sup>χ</sup>(*C*<sup>∗</sup>(*G*, *G*/*H*)), or, equivalently, of a system of imprimitivity (π<sup>χ</sup>(*C*<sub>0</sub>(*G*/*H*)), *u*<sup>χ</sup>(*G*)), from *u*<sub>χ</sub>(*H*) proceeds by the technique of *induced representations* (which physicists may be familiar with from the representation theory of the Poincaré group, see Theorem 7.9 below). We start from a specific realization of *u*<sub>χ</sub>(*H*) on a Hilbert space *H*<sub>χ</sub> (which is finite-dimensional if *H* is compact or abelian). From this, we construct a new Hilbert space *H*<sup>χ</sup>, whose realization depends on the choice of a *quasi-invariant measure* ν on *G*/*H*, i.e., a (non-zero) measure whose null-sets are *G*-invariant in the sense that if ν(*A*) = 0 for some (Borel) measurable *A* ⊂ *G*/*H*, then also ν(*x* · *A*) = 0 for each *x* ∈ *G*. This will surely be the case if ν is *invariant*, i.e., if ν(*x* · *A*) = ν(*A*) for each measurable *A*, but invariant measures on *G*/*H* may not exist, whereas quasi-invariant measures always do.

We now consider (measurable) functions ψ : *G* → *H*<sub>χ</sub> that satisfy

$$
\psi(xh) = u\_{\chi}(h^{-1})\psi(x),
\tag{7.98}
$$

for every *x* ∈ *G* and *h* ∈ *H*; equivalently, we may say that

$$
u\_{\chi}(h) \circ R\_h \psi = \psi,\tag{7.99}
$$

for each *h* ∈ *H*, where *R*<sub>*h*</sub>ψ(*x*) = ψ(*xh*). Now if ψ and ϕ both satisfy (7.98), then, by unitarity of *u*<sub>χ</sub>, their inner product ⟨ϕ(*x*), ψ(*x*)⟩<sub>*H*<sub>χ</sub></sub> in *H*<sub>χ</sub> is *H*-invariant, in that

$$
\langle \varphi(xh), \psi(xh) \rangle\_{H\_{\chi}} = \langle \varphi(x), \psi(x) \rangle\_{H\_{\chi}}.\tag{7.100}
$$

Hence the function *x* ↦ ⟨ϕ(*x*), ψ(*x*)⟩<sub>*H*<sub>χ</sub></sub>, *a priori* defined from *G* to ℂ, induces a function [*x*] ↦ ⟨ϕ(*x*), ψ(*x*)⟩<sub>*H*<sub>χ</sub></sub> from *G*/*H* to ℂ. We write the latter function as ⟨ϕ, ψ⟩<sub>*H*<sub>χ</sub></sub>[*x*]; in particular, taking ϕ = ψ, we write ‖ψ‖<sup>2</sup><sub>*H*<sub>χ</sub></sub>[*x*] = ⟨ψ(*x*), ψ(*x*)⟩<sub>*H*<sub>χ</sub></sub>. We may then define a new Hilbert space *H*<sup>χ</sup> that consists of all measurable functions ψ : *G* → *H*<sub>χ</sub> that for each *h* ∈ *H* satisfy (7.98), and are square-integrable on *G*/*H*:

$$\int\_{G/H} d\nu([x]) \, \|\psi\|\_{H\_{\chi}}^2 [x]<\infty. \tag{7.101}$$

This space turns out to be complete in the natural inner product

$$
\langle \varphi, \psi \rangle = \int\_{G/H} d\nu([x]) \, \langle \varphi, \psi \rangle\_{H\_{\chi}} [x]. \tag{7.102}
$$

It also carries a system of imprimitivity: in case that ν is *G*-invariant we simply have

$$
u^{\chi}(y)\psi(x) = \psi(y^{-1}x)\ (x, y \in G);\tag{7.103}
$$

$$
\pi^{\chi}(\tilde{f})\psi(x) = \tilde{f}([x])\psi(x)\ (\tilde{f} \in C\_0(G/H)),\tag{7.104}
$$

where we note that *u*<sup>χ</sup>(*y*)ψ satisfies (7.98) if ψ does. Unitarity of *u*<sup>χ</sup> as well as the covariance condition (7.81) are easily checked. In general, we replace (7.103) by

$$
u^{\chi}(y)\psi(x) = \sqrt{\frac{d\nu([y^{-1}x])}{d\nu([x])}}\,\psi(y^{-1}x),\tag{7.105}
$$

where *d*ν([*y*<sup>−1</sup>·])/*d*ν([·]) is the Radon–Nikodym derivative of the translated measure *L*<sub>*y*</sub><sup>∗</sup>ν with respect to ν, cf. (B.137), which is well defined because by the assumption of quasi-invariance, *L*<sub>*y*</sub><sup>∗</sup>ν is absolutely continuous with respect to ν (indeed, on this assumption they are even equivalent). Here *L*<sub>*y*</sub><sup>∗</sup>ν(*A*) = ν(*L*<sub>*y*</sub><sup>−1</sup>(*A*)), *A* ⊂ *G*/*H*.
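For a finite group all measure-theoretic subtleties disappear, and the construction (7.98), (7.103), (7.104) can be checked directly. The sketch below assumes the toy case *G* = *S*<sub>3</sub>, *H* = {*e*, (0 1)}, the sign character of *H* as *u*<sub>χ</sub>, and counting measure on *G*/*H* (which is *G*-invariant); all function names are illustrative.

```python
from itertools import permutations

# Toy setting (an assumption): G = S_3, H = {e, (0 1)}, chi = sign character of H.
G = list(permutations(range(3)))
E = (0, 1, 2)
SWAP = (1, 0, 2)
H = [E, SWAP]
CHI = {E: 1, SWAP: -1}

def mul(x, y):
    """Composition of permutations: (x y)(i) = x(y(i))."""
    return tuple(x[y[i]] for i in range(3))

def inv(x):
    out = [0, 0, 0]
    for i, xi in enumerate(x):
        out[xi] = i
    return tuple(out)

def coset(x):
    """The class [x] = xH in G/H, represented as a frozenset."""
    return frozenset(mul(x, h) for h in H)

def make_equivariant(f):
    """Average an arbitrary f: G -> C into a function obeying (7.98)."""
    return {x: sum(CHI[h] * f[mul(x, h)] for h in H) / len(H) for x in G}

def satisfies_798(psi):
    """Check psi(xh) = u_chi(h^{-1}) psi(x) for all x in G, h in H."""
    return all(abs(psi[mul(x, h)] - CHI[inv(h)] * psi[x]) < 1e-12
               for x in G for h in H)

def u(y, psi):
    """(7.103): (u(y) psi)(x) = psi(y^{-1} x)."""
    return {x: psi[mul(inv(y), x)] for x in G}

def pi(f_tilde, psi):
    """(7.104): multiplication by a function on G/H."""
    return {x: f_tilde[coset(x)] * psi[x] for x in G}

def translate(y, f_tilde):
    """(L_y f)([x]) = f([y^{-1} x])."""
    return {coset(x): f_tilde[coset(mul(inv(y), x))] for x in G}

def close(psi, phi):
    return all(abs(psi[x] - phi[x]) < 1e-12 for x in G)

raw = {x: complex(n + 1, 2 * n - 3) for n, x in enumerate(G)}
psi = make_equivariant(raw)
f_tilde = {c: float(n + 1) for n, c in enumerate({coset(x) for x in G})}

print(satisfies_798(psi), all(satisfies_798(u(y, psi)) for y in G))
```

Left translation commutes with the right *H*-action that defines the constraint (7.98), which is why *u*<sup>χ</sup>(*y*) maps the constrained space into itself; the covariance condition (7.81) can be checked in the same way by comparing `u(y, pi(f_tilde, u(inv(y), psi)))` with `pi(translate(y, f_tilde), psi)`.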

Physicists do not like the Hilbert space *H*<sup>χ</sup>, preferring a different realization

$$
\tilde{H}^{\chi} = L^2(G/H) \otimes H\_{\chi},\tag{7.106}
$$

in which the wave-function ψ is not constrained and one has a clean separation between the (typically) spatial degree of freedom *Q* = *G*/*H* and the internal degree of freedom *H*<sub>χ</sub>. One half of the system of imprimitivity will then be given nicely by

$$
\tilde{\pi}^{\chi}(\tilde{f})\tilde{\psi} = \tilde{f}\tilde{\psi} \quad (\tilde{f} \in C\_{0}(G/H)),\tag{7.107}
$$

but this cleanliness comes at the cost of a more complicated formula for *ũ*<sup>χ</sup>(*y*), as follows. Pick a (measurable) cross-section *s* : *G*/*H* → *G*, i.e., a *right* inverse to the projection *p* : *G* → *G*/*H*, *p*(*x*) = [*x*]; in other words, we have

$$p \circ s = \mathrm{id}\_{G/H}.\tag{7.108}$$

It may not be possible to make *s* continuous, and, crucially, *s* is not a *left* inverse to *p*; instead, there exists a unique function *h*<sub>*s*</sub> : *G* → *H* such that *s* ∘ *p*(*x*) = *xh*<sub>*s*</sub>(*x*), i.e.,

$$h\_s(x) = x^{-1} s([x]).\tag{7.109}$$

Such a cross-section *s* gives rise to a unitary isomorphism

$$w\_s: H^{\chi} \to \tilde{H}^{\chi};\tag{7.110}$$

$$
w\_s \psi(q) = \psi(s(q));\tag{7.111}
$$

$$
w\_s^{-1} \tilde{\psi}(x) = u\_{\chi}(h\_s(x)) \tilde{\psi}([x]),\tag{7.112}
$$

which enables us to move the system of imprimitivity (*u*<sup>χ</sup>, π<sup>χ</sup>) to *H̃*<sup>χ</sup> by defining

$$
\tilde{u}^{\chi}(y) = w\_s u^{\chi}(y) w\_s^\* \quad (y \in G);\tag{7.113}
$$

$$
\tilde{\pi}^{\chi}(\tilde{f}) = w\_s \pi^{\chi}(\tilde{f}) w\_s^\* \quad (\tilde{f} \in C\_0(G/H)). \tag{7.114}
$$

This duly leads to (7.107), but instead of (7.105), we obtain the more cumbersome

$$
\tilde{u}^{\chi}(y)\tilde{\psi}(q) = \sqrt{\frac{d\nu(y^{-1}q)}{d\nu(q)}}\,u\_{\chi}(s(q)^{-1}ys(y^{-1}q))\tilde{\psi}(y^{-1}q),\tag{7.115}
$$

where of course the square root may be omitted if ν is *G*-invariant, as in (7.103). The argument *h* = *s*(*q*)<sup>−1</sup>*ys*(*y*<sup>−1</sup>*q*) of *u*<sub>χ</sub> appearing here is called the *Wigner cocycle* (after the physicist who first introduced it in his classification of the irreducible representations of the Poincaré group). One may verify that *h* ∈ *H* by applying *p*, which by construction is *G*-equivariant (i.e., *p*(*xy*) = *xp*(*y*)), which gives

$$p(h) = p(s(q)^{-1} y s(y^{-1} q)) = s(q)^{-1} y\, p(s(y^{-1} q)) = s(q)^{-1} y y^{-1} q = s(q)^{-1} q,$$

where in the third step we used (7.108). For any *x* ∈ *G* we have *x*<sup>−1</sup>[*x*] = [*x*<sup>−1</sup>*x*] = [*e*], so taking *x* = *s*(*q*) in this computation, for which [*x*] = *q* by (7.108), we find *p*(*h*) = [*e*], which is true iff *h* ∈ *H*.

Given an irreducible system of imprimitivity (*ũ*<sup>χ</sup>, π̃<sup>χ</sup>), we obtain generalized momentum operators by passing to the associated representation of the Lie algebra 𝔤 of *G* through (5.156) and (7.57), i.e.,

$$\tilde{\pi}^{\chi}(A) = i\hbar (\tilde{u}^{\chi})'(A),\tag{7.116}$$

where *A* ∈ 𝔤, so that, cf. (7.78) - (7.80), we obtain from (5.160) and (7.81):

$$[\tilde{\pi}^{\chi}(A), \tilde{\pi}^{\chi}(B)] = i\hbar \tilde{\pi}^{\chi}([A, B]);\tag{7.117}$$

$$[\tilde{\pi}^{\chi}(A), \tilde{\pi}^{\chi}(\tilde{f})] = i\hbar \tilde{\pi}^{\chi}(\delta\_{A}\tilde{f});\tag{7.118}$$

$$[\tilde{\pi}^{\chi}(\tilde{f}), \tilde{\pi}^{\chi}(\tilde{g})] = 0,\tag{7.119}$$

where *A*, *B* ∈ 𝔤 and *f̃*, *g̃* ∈ *C*<sub>0</sub>(*Q*) (in fact, these formulae, defined on the right domain, work also for many unbounded functions on *Q*, see below), and δ<sub>*A*</sub> is defined in (3.71). Let us take a look at a few illustrative special cases:


$$
\tilde{H}^j = L^2(\mathbb{R}^3) \otimes H\_j,\tag{7.120}
$$

and using the cross-section *s*(*q*) = (1<sub>3</sub>, *q*) from R<sup>3</sup> to *E*(3) we obtain, from (7.115) with (7.63) - (7.64) and (7.107), the expressions

$$\tilde{u}^j(R, a)\tilde{\psi}(q) = D_j(R)\tilde{\psi}(R^{-1}(q - a));\tag{7.121}$$

$$
\tilde{\pi}^j(\tilde{f})\tilde{\psi}(q) = \tilde{f}(q)\tilde{\psi}(q). \tag{7.122}
$$

For *j* = 0 this gives the usual quantum theory of a spinless particle:

$$P_i = -i\hbar \frac{\partial}{\partial q^i},\tag{7.123}$$

where *P<sub>i</sub>* = *π̃*<sup>0</sup>(*e<sub>i</sub>*) is defined in terms of the standard basis (*e*<sub>1</sub>, *e*<sub>2</sub>, *e*<sub>3</sub>) of R<sup>3</sup>, now seen as the Lie algebra of R<sup>3</sup>.

3. Using the basis (3.66) of the Lie algebra of *SO*(3) ⊂ *E*(3), we obtain the orbital angular momentum operators (which pick up extra terms for *j* > 0):

$$\tilde{\pi}^0(J\_1) = i\hbar \left( q^3 \frac{\partial}{\partial q^2} - q^2 \frac{\partial}{\partial q^3} \right);\tag{7.124}$$

$$\tilde{\pi}^0(J_2) = i\hbar \left( q^1 \frac{\partial}{\partial q^3} - q^3 \frac{\partial}{\partial q^1} \right);\tag{7.125}$$

$$\tilde{\pi}^0(J\_3) = i\hbar \left( q^2 \frac{\partial}{\partial q^1} - q^1 \frac{\partial}{\partial q^2} \right). \tag{7.126}$$

4. The coordinate functions *f̃*(*q*) = *q<sup>i</sup>* yield the position operators *Q<sub>i</sub>* = *π̃*<sup>0</sup>(*q<sup>i</sup>*):

$$
Q_i \tilde{\psi}(q) = q^i \tilde{\psi}(q). \tag{7.127}
$$

5. Thus we obtain all the familiar commutation relations like [*Q<sub>i</sub>*, *P<sub>j</sub>*] = *iħ*δ<sub>*ij*</sub>, [*π̃*<sup>0</sup>(*J*<sub>1</sub>), *π̃*<sup>0</sup>(*J*<sub>2</sub>)] = *iħ* *π̃*<sup>0</sup>(*J*<sub>3</sub>), etc., cf. (7.65) - (7.66).
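The commutation relations just listed can be checked mechanically on polynomial wave-functions. The following is a minimal sketch, not part of the original text: polynomials in (*q*<sup>1</sup>, *q*<sup>2</sup>, *q*<sup>3</sup>) are stored as dictionaries, the helper names (`add`, `scale`, `times_q`, `d`, `commutator`) are hypothetical, and units with ħ = 1 are assumed.

```python
# Polynomials in (q^1, q^2, q^3) are stored as {(a, b, c): coefficient},
# standing for the sum of coefficient * q1**a * q2**b * q3**c.
HBAR = 1.0  # assumption: units in which hbar = 1

def add(f, g, factor=1):
    """Return f + factor * g, dropping vanishing coefficients."""
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) + factor * c
    return {key: c for key, c in out.items() if c != 0}

def scale(z, f):
    """Multiply the polynomial f by the number z."""
    return {key: z * c for key, c in f.items() if z * c != 0}

def times_q(i, f):
    """Multiply f by the coordinate with index i (0, 1 or 2)."""
    return {key[:i] + (key[i] + 1,) + key[i + 1:]: c for key, c in f.items()}

def d(i, f):
    """Partial derivative of f with respect to the coordinate with index i."""
    return {key[:i] + (key[i] - 1,) + key[i + 1:]: c * key[i]
            for key, c in f.items() if key[i] > 0}

def P(i, f):
    """Momentum operator (7.123): P_i = -i hbar d/dq^i."""
    return scale(-1j * HBAR, d(i, f))

def Q(i, f):
    """Position operator (7.127): multiplication by q^i."""
    return times_q(i, f)

def L(k, f):
    """Angular momentum (7.124)-(7.126): i hbar (q^j d_i - q^i d_j), (k, i, j) cyclic."""
    i, j = (k + 1) % 3, (k + 2) % 3
    return scale(1j * HBAR, add(times_q(j, d(i, f)), times_q(i, d(j, f)), -1))

def commutator(A, B, f):
    """Apply [A, B] = AB - BA to the polynomial f."""
    return add(A(B(f)), B(A(f)), -1)

# An arbitrary test polynomial: q1^2 q2 + 3 q2 q3 + 2 q1 q3^3.
f = {(2, 1, 0): 1, (0, 1, 1): 3, (1, 0, 3): 2}

# [Q_i, P_j] f = i hbar delta_ij f
for i in range(3):
    for j in range(3):
        lhs = commutator(lambda g: Q(i, g), lambda g: P(j, g), f)
        assert lhs == scale(1j * HBAR * (i == j), f)

# [L_1, L_2] f = i hbar L_3 f
lhs = commutator(lambda g: L(0, g), lambda g: L(1, g), f)
assert lhs == scale(1j * HBAR, L(2, f))
```

Because all coefficients are integer multiples of powers of *i*, the comparisons are exact; for general coefficients a tolerance would be needed.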

• Let *G* = R act on *Q* = T, which we parametrize by *z* = exp(2π*iq*), *q* ∈ [0,1), by

$$a: \exp(2\pi i q) \mapsto \exp(2\pi i (q+a)),\tag{7.128}$$

so that *H* = Z, with *Ĥ* = T under *u<sub>z</sub>*(*n*) = *z<sup>n</sup>*, *z* ∈ T, *n* ∈ Z, cf. (C.349). We parametrize *Ĥ* by *z* = exp(*i*θ), θ ∈ [0,2π), so that (with slight abuse of notation) *u*<sub>θ</sub>(*n*) = *e*<sup>*in*θ</sup>. In the second description (i.e. the one of the physicists) we have

$$\tilde{H}^{\theta} = L^2(\mathbb{T}) = L^2(0, 1), \tag{7.129}$$

where the topology of *Q* is lost for the moment. Using the cross-section

$$s\left(e^{2\pi i q}\right) = q,\tag{7.130}$$

where *q* ∈ [0,1), we obtain

$$
\tilde{u}^{\theta}(a)\tilde{\psi}(q) = e^{in(a,q)\theta}\tilde{\psi}(q-a+n(a,q)),\tag{7.131}
$$

where *n*(*a*,*q*) ∈ Z is the unique integer such that *q* − *a* + *n*(*a*,*q*) ∈ [0,1). The corresponding momentum operator is formally given by the usual expression *P* = −*iħ*∂/∂*q*, cf. (7.123), which appears to be independent of θ (since for any *q* ∈ (0,1) and *a* small enough we have *n*(*a*,*q*) = 0), but in fact the θ-dependence is in its domain. Here *H*<sup>1</sup>(0,1) denotes the Sobolev space obtained as the closure of *C*<sup>∞</sup>([0,1]) in the inner product (5.318) adapted to *L*<sup>2</sup>(0,1), which implies *H*<sup>1</sup>(0,1) ⊂ *C*([0,1]). The domain of *P* can be shown to consist of the subspace of *H*<sup>1</sup>(0,1) whose elements satisfy

$$
\tilde{\psi}(1) = e^{-i\theta} \tilde{\psi}(0). \tag{7.132}
$$

To see this, we recall that

$$P\tilde{\psi} = i\hbar \lim_{\varepsilon \to 0} \left( \frac{\tilde{u}^{\theta}(\varepsilon)\tilde{\psi} - \tilde{\psi}}{\varepsilon} \right), \tag{7.133}$$

where the limit is taken in the *L*<sup>2</sup>-norm, so that we need existence of

$$\lim_{\varepsilon \to 0} \varepsilon^{-2} \int_0^1 dq \, |e^{in(\varepsilon,q)\theta} \tilde{\psi}(q - \varepsilon + n(\varepsilon, q)) - \tilde{\psi}(q)|^2.$$

For 0 < *q* < ε we have *n*(ε,*q*) = 1, whereas for ε < *q* < 1 we have *n*(ε,*q*) = 0, so it is convenient to split the integral as a sum of ∫<sub>0</sub><sup>ε</sup> and ∫<sub>ε</sub><sup>1</sup>. The second term enforces the existence of derivatives in the *L*<sup>2</sup>-sense (which in turn makes *ψ̃* continuous on [0,1]) and is unproblematic, but the first requires the existence of

$$\lim_{\varepsilon \to 0} \varepsilon^{-2} \int_0^{\varepsilon} dq \, |e^{i\theta} \, \tilde{\psi}(q - \varepsilon + 1) - \tilde{\psi}(q)|^2.$$

This strange expression, then, enforces the boundary condition (7.132). In this case there is no single position operator, but the algebra *C*(T) plays its role.
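The winding number *n*(*a*,*q*) in (7.131) is exactly the Wigner cocycle for the cross-section (7.130), and it is what makes the translations compose correctly. The sketch below illustrates this numerically; the sample function `psi` and the chosen parameter values are arbitrary assumptions, and the function names are hypothetical.

```python
import cmath
import math

def n(a, q):
    """Unique integer such that q - a + n(a, q) lies in [0, 1)."""
    return -math.floor(q - a)

def cocycle(a, q):
    """Wigner cocycle -s(q) + a + s(q - a), written additively in G = R."""
    s_shifted = q - a + n(a, q)      # the cross-section at the translated point
    return -q + a + s_shifted

def u_theta(theta, a, psi):
    """Return the function q -> (u^theta(a) psi)(q) of (7.131), for q in [0, 1)."""
    def shifted(q):
        k = n(a, q)
        return cmath.exp(1j * k * theta) * psi(q - a + k)
    return shifted

def psi(q):
    """An arbitrary sample function on [0, 1)."""
    return cmath.exp(-3.0 * (q - 0.4) ** 2) * (1 + 0.5j * q)

theta, a, b, q = 0.7, 0.3, 0.45, 0.2
lhs = u_theta(theta, a, u_theta(theta, b, psi))(q)   # u(a) u(b) psi
rhs = u_theta(theta, a + b, psi)(q)                  # u(a + b) psi
print(abs(lhs - rhs))                                # zero up to rounding
```

The values are chosen away from the endpoints of [0,1), where floating-point rounding could change the integer returned by `n`.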

#### 7.6 Representations of semi-direct products

The case *Q* = *G*/*H* also provides the key for the general case, as long as the *G*-action on *Q* is *regular*, cf. Theorem 7.7. In that case, the construction of the irreducible system of imprimitivity (*u*(*G*), π(*C*<sub>0</sub>(*Q*))) corresponding to a pair (O, *u*<sub>χ</sub>(*H*)), where O is a *G*-orbit in *Q*, requires no new ideas: we have O ≅ *G*/*H*, and hence *u* = *u*<sup>χ</sup> and π = π<sup>χ</sup> as described in §7.5 (where the function *f̃* in formulae like (7.104) or (7.114), which in these expressions was defined on *G*/*H*, should be seen as the restriction of *f̃* ∈ *C*<sub>0</sub>(*Q*) to O ⊂ *Q*). An important application of this construction is the representation theory of *regular semi-direct products L* ⋉ *V* (cf. §7.3), where regularity means that the *dual L*-action on *V*<sup>∗</sup> is regular; this action is given by

$$
(\lambda \cdot \theta)(v) = \theta(\lambda^{-1} \cdot v) \quad (\lambda \in L, \theta \in V^*, v \in V). \tag{7.134}
$$

Theorem 7.9. *Up to unitary equivalence, the irreducible unitary representations of a regular semi-direct product G* = *L* ⋉ *V are classified by pairs* (O, σ)*, where* O *is an L-orbit in V*<sup>∗</sup> *and* σ *is an element of the unitary dual of the stabilizer L*<sub>0</sub> ⊂ *L of an arbitrary point* θ<sub>0</sub> ∈ O*. The corresponding representation ũ*<sup>(O,σ)</sup>(*G*) *may be realized from an irreducible representation u*<sub>σ</sub> *of L*<sub>0</sub> *on a Hilbert space H*<sub>σ</sub> *combined with a cross-section s* : *L*/*L*<sub>0</sub> → *L of the canonical projection p* : *L* → *L*/*L*<sub>0</sub>*, namely through*

$$
\tilde{H}^{(\mathcal{O},\sigma)} = L^2(L/L_0) \otimes H_{\sigma};\tag{7.135}
$$

$$\tilde{u}^{(\mathcal{O},\sigma)}(\lambda, v)\tilde{\psi}(\theta) = e^{i\theta(v)}u_{\sigma}(s(\theta)^{-1}\lambda s(\lambda^{-1}\theta))\tilde{\psi}(\lambda^{-1}\theta).\tag{7.136}$$

*Proof.* Let *u* be a unitary representation of *G*. This implies

$$
u(\lambda)u(v)u(\lambda^{-1}) = u(\lambda\cdot v),
\tag{7.137}
$$

in which λ ≡ (λ, 0) and *v* ≡ (*e*, *v*). Since *V* ⊂ *G* is abelian, we have *C*<sup>∗</sup>(*V*) ≅ *C*<sub>0</sub>(*V*<sup>∗</sup>) by the Fourier transform (cf. Theorem C.109 in §C.15), which here is given by (7.44) - (7.45), with *A* replaced by *v*. Hence the representation *u*<sup>∫</sup>(*C*<sup>∗</sup>(*V*)) defined by *u*(*V*) via (5.172), seen as a representation of *C*<sub>0</sub>(*V*<sup>∗</sup>) via the Fourier transform, is given by

$$
u^{\int}(f) = (2\pi)^{-n} \int_{V \times V^*} d^n v\, d^n \theta \, e^{i\theta(v)} f(\theta) u(v). \tag{7.138}
$$

Using invariance of the measure *d<sup>n</sup>v* *d<sup>n</sup>*θ under the joint transformation (*v*, θ) ↦ (λ · *v*, λ · θ), from (7.137) we obtain, for *f* ∈ *C*<sub>0</sub>(*V*<sup>∗</sup>) in the image of *f̌* ∈ *C*<sub>*c*</sub><sup>∞</sup>(*V*),

$$\begin{split}
u(\lambda)u^{\int}(f)u(\lambda)^{*} &= (2\pi)^{-n} \int_{V\times V^{*}} d^{n}v\, d^{n}\theta \, e^{i\theta(v)} f(\theta)\, u(\lambda\cdot v) \\
&= (2\pi)^{-n} \int_{V\times V^{*}} d^{n}v \, d^{n}\theta \, e^{i(\lambda\cdot\theta)(\lambda\cdot v)} f(\lambda^{-1}\cdot\lambda\cdot\theta)\, u(\lambda\cdot v) \\
&= (2\pi)^{-n} \int_{V\times V^{*}} d^{n}v \, d^{n}\theta \, e^{i\theta(v)} f(\lambda^{-1}\cdot\theta)\, u(v) \\
&= u^{\int}(L_{\lambda}f).
\end{split} \tag{7.139}$$

Consequently, a unitary representation *u*(*L* ⋉ *V*) defines a system of imprimitivity (*u*(*L*), *u*<sup>∫</sup>(*C*<sub>0</sub>(*V*<sup>∗</sup>))), and *vice versa*, since any pair of representations (*u*(*L*), *u*(*V*)) that satisfies (7.137) gives rise to a representation *u*(*G*) by *u*(λ, *v*) = *u*(*v*)*u*(λ).

Now apply Theorem 7.7 with *G* replaced by *L* and *Q* by *V*<sup>∗</sup>. All we need in order to obtain (7.135) - (7.136) from (7.106) and (7.107) - (7.115) is to find the representation *u*(*V*) that induces the representation *u*<sup>∫</sup>(*C*<sub>0</sub>(*V*<sup>∗</sup>)) given by (7.107), namely

$$u(v)\tilde{\psi}(\theta) = e^{-i\theta(v)}\tilde{\psi}(\theta),\tag{7.140}$$

as is easily checked from (7.138). □
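Two ingredients of this proof are easy to check numerically at the level of the group itself: the conjugation relation behind (7.137), and the invariance of the pairing θ(*v*) under the joint transformation used in (7.139). The sketch below does so for the illustrative choice *L* = *SO*(2) acting on *V* = R<sup>2</sup>; this choice, the function names and the sample functional `theta` are assumptions made for the example only.

```python
import math

def rot(lam, v):
    """Action of lam in SO(2) (an angle) on a vector v in V = R^2."""
    c, s = math.cos(lam), math.sin(lam)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def mul(g, h):
    """Semi-direct product law: (lam, v)(mu, w) = (lam + mu, v + lam . w)."""
    (lam, v), (mu, w) = g, h
    rw = rot(lam, w)
    return (lam + mu, (v[0] + rw[0], v[1] + rw[1]))

def dual(lam, theta):
    """Dual action (7.134): (lam . theta)(v) = theta(lam^{-1} . v)."""
    return lambda v: theta(rot(-lam, v))

def theta(v):
    """A sample linear functional on V, i.e. a point of the dual space."""
    return 1.5 * v[0] - 0.5 * v[1]

lam, v = 0.8, (1.0, -2.0)
zero = (0.0, 0.0)

# Group-level version of (7.137): (lam, 0)(e, v)(lam^{-1}, 0) = (e, lam . v).
conj = mul(mul((lam, zero), (0.0, v)), (-lam, zero))
expected = rot(lam, v)
print(conj, expected)

# Invariance of the pairing: (lam . theta)(lam . v) = theta(v).
print(dual(lam, theta)(rot(lam, v)), theta(v))
```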

In view of this, we have a remarkable group–groupoid C\*-algebra isomorphism

$$C^*(L \ltimes V) \cong C^*(L \ltimes V^*),\tag{7.141}$$

where the left-hand side is just the C\*-algebra of the *group L* ⋉ *V*, whereas the right-hand side is the C\*-algebra of the action group*oid L* ⋉ *V*<sup>∗</sup> relative to (7.134). Also, a computation shows that the same formulae (7.135) - (7.136) are obtained if, given θ<sub>0</sub> ∈ *V*<sup>∗</sup> and hence given *L*<sub>0</sub> as its stabilizer, we define a subgroup *H* ⊂ *G* by

$$H = L_0 \ltimes V,\tag{7.142}$$

and induce from the representation *u*<sub>(θ<sub>0</sub>,σ)</sub> of *H* defined by

$$
u_{(\theta_0, \sigma)}(\lambda, v) = e^{i\theta_0(v)} u_{\sigma}(\lambda). \tag{7.143}
$$

We briefly discuss four basic examples from physics, each of which is easily seen to be regular. We write *a* instead of *v* in (λ, *v*) ∈ *G* so as to emphasize the "spatial" character of *V*, whereas *V*∗ is labeled by a dual "momentum" variable *p*.

• *G* = *E*(2) = *SO*(2) ⋉ R<sup>2</sup>, defined like *E*(3), i.e., with respect to the usual action of *SO*(2) on R<sup>2</sup> (this group will play a role in the representation theory of the Poincaré group). We find the same action of *SO*(2) on (R<sup>2</sup>)<sup>∗</sup> = R<sup>2</sup>, so that the orbits are O<sub>0</sub> = {0} with *G*<sub>0</sub> = *SO*(2) and O<sub>*r*</sub> = {(*x*, *y*) ∈ R<sup>2</sup> | *x*<sup>2</sup> + *y*<sup>2</sup> = *r*<sup>2</sup>} for *r* > 0, with *G<sub>r</sub>* = {*e*}. Thus the Hilbert spaces and representations are given by

$$
\tilde{H}^{(0,n)} = \mathbb{C};\tag{7.144}
$$

$$
\tilde{u}^{(0,n)}(\lambda, a) = e^{2\pi i n \lambda};\tag{7.145}
$$

$$
\tilde{H}^{r} = L^2(0, 1); \tag{7.146}
$$

$$\tilde{u}^r(\lambda, a)\tilde{\psi}(p) = e^{ir(a_1\cos p' + a_2\sin p')}\tilde{\psi}(p - \lambda \bmod 1),\tag{7.147}$$

where *n* ∈ Z, λ ∈ [0,1), *p* ∈ (0,1), and *p*′ = 2π*p*. In the first case R<sup>2</sup> ⊂ *E*(2) is represented trivially, whereas in the second the *r*-dependence of the representation lies entirely in R<sup>2</sup> (since *H̃<sup>r</sup>* and *ũ<sup>r</sup>*(λ, 0) are evidently independent of *r*). The projective representations of *G* are of considerable interest, too, cf. §5.10.

Lemma 7.10. *If G* = *SO*(*p*, *q*) ⋉ R<sup>*p*+*q*</sup> *(p* > 0, *q* ≥ 0*), then H*<sup>2</sup>(𝔤, R) = 0*.*

Here *SO*(*p*, *q*) is the subgroup of *SL*<sub>*p*+*q*</sub>(R<sup>*p*+*q*</sup>) whose elements leave the form

$$x^2 = x_1^2 + \cdots + x_p^2 - (x_{p+1}^2 + \cdots + x_{p+q}^2)$$

invariant; the best-known example is the (proper) Lorentz group *SO*(3,1), see below. This lemma may be proved by a straightforward but lengthy computation. By Theorem 5.59, the projective unitary representations of *G* then correspond to the ordinary unitary representations of the universal covering

$$
\tilde{G} = \mathbb{R} \ltimes \mathbb{R}^2,\tag{7.148}
$$

where R acts on R<sup>2</sup> through the covering projection *π̃* : R → *SO*(2) = R/Z, cf. Theorem 5.41 (with *D* replaced by Z). This changes the expressions (7.144) - (7.147) into

$$\tilde{H}^{(0,s)} = \mathbb{C};\tag{7.149}$$

$$
\tilde{u}^{(0,s)}(\lambda, a) = e^{i s \lambda};\tag{7.150}
$$

$$
\tilde{H}^{(r,\theta)} = L^2(0,1);\tag{7.151}
$$

$$\tilde{u}^{(r,\theta)}(\lambda,a)\tilde{\psi}(p) = e^{ir(a_1\cos p' + a_2\sin p')}e^{in(\lambda,p)\theta}\tilde{\psi}(p-\lambda+n(\lambda,p)), \tag{7.152}$$

where λ ∈ R, *s* ∈ R, θ ∈ [0,2π), *p* ∈ (0,1), and *n*(λ, *p*) is defined as in (7.131).
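That (7.147) really defines a representation of *E*(2) can be checked pointwise: composing two operators should agree with the operator of the product (λ, *a*)(λ′, *a*′) = (λ + λ′ mod 1, *a* + *R*(2πλ)*a*′). The following sketch tests this on a sample function; the function `psi`, the parameter values and the helper names are illustrative assumptions.

```python
import cmath
import math

def u_r(r, lam, a, psi):
    """(7.147): lam in [0, 1) is a rotation by 2*pi*lam, a is in R^2, psi lives on [0, 1)."""
    def out(p):
        angle = 2 * math.pi * p
        phase = cmath.exp(1j * r * (a[0] * math.cos(angle) + a[1] * math.sin(angle)))
        return phase * psi((p - lam) % 1.0)
    return out

def mul(g, h):
    """Group law of E(2) in the parametrization (lam, a) used in (7.147)."""
    (lam, a), (mu, b) = g, h
    c, s = math.cos(2 * math.pi * lam), math.sin(2 * math.pi * lam)
    return ((lam + mu) % 1.0, (a[0] + c * b[0] - s * b[1], a[1] + s * b[0] + c * b[1]))

def psi(p):
    """An arbitrary sample function on [0, 1)."""
    return cmath.exp(2j * math.pi * p) + 0.3 * p * p

r = 1.7
g = (0.3, (0.5, -1.0))
h = (0.45, (2.0, 0.25))

def defect(p):
    """|u(g) u(h) psi - u(gh) psi| evaluated at the point p."""
    lhs = u_r(r, g[0], g[1], u_r(r, h[0], h[1], psi))(p)
    gh = mul(g, h)
    rhs = u_r(r, gh[0], gh[1], psi)(p)
    return abs(lhs - rhs)

print(defect(0.1), defect(0.9))   # both vanish up to rounding
```

The same check goes through for (7.152) once the phase *e*<sup>*in*(λ,*p*)θ</sup> is included and λ is allowed to range over all of R.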

• *G* = *E*(3) = *SO*(3) ⋉ R<sup>3</sup>, as before with the defining action of *SO*(3). The *SO*(3)-orbits in (R<sup>3</sup>)<sup>∗</sup> = R<sup>3</sup> are spheres *S*<sup>2</sup><sub>*r*</sub> ≅ *SO*(3)/*SO*(2) with radius *r* > 0, as well as the origin (*r* = 0) with stabilizer *SO*(3), so that for the Hilbert spaces we obtain

$$
\tilde{H}^{(0,j)} = \mathbb{C}^{2j+1};\tag{7.153}
$$

$$
\tilde{H}^{(r,n)} = L^2(S^2);\tag{7.154}
$$

where *j* = 0, 1, ... labels the unitary irreducible representations of *SO*(3) on *H<sub>j</sub>* = C<sup>2*j*+1</sup>, whereas *n* ∈ Z labels the irreducible representations of *SO*(2) on C (we write *S*<sup>2</sup> ≡ *S*<sup>2</sup><sub>1</sub>). In the second case, the representation *ũ*<sup>(*r*,*n*)</sup> of *SO*(3) ⊂ *E*(3) depends explicitly on *n* through the Wigner cocycle; for *n* = 0 we simply obtain

$$
\tilde{u}^{(r,0)}(R,a)\Psi(p) = e^{irp\cdot a}\Psi(R^{-1}p).\tag{7.155}
$$

For *n* ≠ 0 we just give a formula for *ũ*<sup>(*r*,*n*)</sup>(*R*, *a*) in the case that *R* is a rotation around the *z*-axis and *a* = 0; this is enough to make the point. To this end we parametrize *SO*(3) by the well-known Euler angles, i.e., in terms of the matrices *J<sub>i</sub>*, cf. (3.66),

$$R(\phi, \theta, \alpha) = e^{\phi J\_3} e^{\theta J\_2} e^{\alpha J\_3},\tag{7.156}$$

and write *q* ∈ *S*<sup>2</sup> as *q* = (φ, θ) = *R*(φ, θ, 0)*e*<sub>3</sub> with *e*<sub>3</sub> = (0, 0, 1) (the spherical coordinates of *q* are (φ − ½π, θ)). This also provides *S*<sup>2</sup> with an *SO*(3)-invariant measure *d*ν(φ, θ) = *d*φ *d*θ sin θ. A convenient choice of *s* : *S*<sup>2</sup> → *SO*(3) is

$$s(\phi, \theta) = R(\phi, \theta, -\phi),\tag{7.157}$$

in which case we simply obtain, writing *R<sub>z</sub>*(α) = *R*(α, 0, 0),

$$
\tilde{u}^{(r,n)}(R_z(\alpha),0)\tilde{\psi}(\phi,\theta) = e^{in\alpha}\tilde{\psi}(\phi-\alpha,\theta). \tag{7.158}
$$

The universal covering group of *E*(3) is

$$\tilde{E}(3) = SU(2) \ltimes \mathbb{R}^3,\tag{7.159}$$

where *SU*(2), the universal covering group of *SO*(3), acts on R<sup>3</sup> through its covering projection *π̃* onto *SO*(3), as in the previous case. By Theorem 5.59 and Lemma 7.10, the projective unitary irreducible representations of *E*(3) are given by the unitary irreducible representations of *SU*(2) ⋉ R<sup>3</sup>. This obviously leads to additional half-integral values for *j* in (7.153), since this number now labels the unitary irreducible representations of *SU*(2). As to *n* in (7.154), the subgroup *H* ⊂ *SU*(2) that stabilizes (0, 0, *r*) ∈ *S*<sup>2</sup><sub>*r*</sub> consists of all matrices *u<sub>z</sub>* = diag(*z*, *z̄*), where *z* ∈ T, so *H* ≅ T and hence *Ĥ* = Z under *u<sub>z</sub>* ↦ *z<sup>m</sup>*, *m* ∈ Z. We now recall from the proof of Proposition 5.5 that

$$u = \cos(\theta/2) \cdot 1_2 + i \sin(\theta/2)\, \mathbf{u} \cdot \boldsymbol{\sigma} \in SU(2),\tag{7.160}$$

where **u** is a unit vector in R<sup>3</sup>, projects to *π̃*(*u*) = *R*<sub>θ</sub>(**u**) ∈ *SO*(3), i.e., the rotation around **u** by an angle θ. Parametrizing *z* = cos(α/2) + *i* sin(α/2), α ∈ [0, 4π), therefore gives *π̃*(*u<sub>z</sub>*) = exp(α*J*<sub>3</sub>). Besides (7.157), we now also need a cross-section *s* : *S*<sup>2</sup><sub>*r*</sub> → *SU*(2), for which the above analysis suggests we take

$$s(\phi, \theta) = u^{(3)}(\phi)u^{(2)}(\theta)u^{(3)}(-\phi);\tag{7.161}$$

$$u^{(2)}(\theta) \equiv \cos(\tfrac{1}{2}\theta) \cdot 1_2 + i \sin(\tfrac{1}{2}\theta) \cdot \sigma_2;\tag{7.162}$$

$$u^{(3)}(\phi) \equiv \cos(\phi/2) \cdot 1_2 + i \sin(\phi/2) \cdot \sigma_3;\tag{7.163}$$

note that *u<sub>z</sub>* = *u*<sup>(3)</sup>(α). A calculation similar to the one leading to (7.158) gives

$$
\tilde{u}^{(r,m)}(u_z,0)\tilde{\psi}(\phi,\theta) = e^{im\alpha/2}\tilde{\psi}(\phi-\alpha,\theta). \tag{7.164}
$$

Comparing (7.158) and (7.164), we see that if *m* is even, then *n* = *m*/2 (of course, by convention we may replace *m*/2 in (7.164) by *n* on the understanding that *n* may now be half-integral). If *m* is odd, choosing α = 2π we famously obtain

$$
\tilde{u}^{(r,m)}(-1_2,0)\tilde{\psi} = -\tilde{\psi}.\tag{7.165}
$$

More generally, if we take a closed path *t* ↦ *R*<sub>2π*t*</sub>(**u**), *t* ∈ [0,1], in *SO*(3), which starts and ends at 1<sub>3</sub>, and lift it (with respect to the covering projection *π̃* : *SU*(2) → *SO*(3)) to a path *t* ↦ *u*(*t*) ≡ cos(π*t*) + *i* sin(π*t*) **u** · σ in *SU*(2), which now starts at 1<sub>2</sub> and ends at −1<sub>2</sub>, then the corresponding representation *ũ*<sup>(*r*,*m*)</sup>(*u*(*t*), 0) takes the wave-function *ψ̃* to itself if *m* is even, whereas it takes *ψ̃* to −*ψ̃* whenever *m* is odd (this is an embryonic version of the connection between spin and statistics, fully realized only in quantum field theory).
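The claim behind (7.158), namely that with the cross-section (7.157) the Wigner cocycle of a rotation around the *z*-axis is that same rotation, can be verified numerically, as can the sign in (7.165). In the sketch below it is assumed that exp(*tJ*<sub>3</sub>) and exp(*tJ*<sub>2</sub>) are the standard rotations about the *z*- and *y*-axes; the book's convention (3.66) may differ by signs or an axis relabelling, but the cocycle identity does not depend on that choice. All function names are hypothetical.

```python
import cmath
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def rz(t):
    """Assumed form of exp(t J_3): rotation by t about the z-axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def ry(t):
    """Assumed form of exp(t J_2): rotation by t about the y-axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def euler(phi, theta, alpha):
    """Euler-angle parametrization (7.156)."""
    return matmul(matmul(rz(phi), ry(theta)), rz(alpha))

def section(phi, theta):
    """Cross-section (7.157): s(phi, theta) = R(phi, theta, -phi)."""
    return euler(phi, theta, -phi)

def transpose(A):
    return [list(col) for col in zip(*A)]

def cocycle(alpha, phi, theta):
    """s(q)^{-1} R_z(alpha) s(R_z(alpha)^{-1} q) for q = (phi, theta)."""
    return matmul(matmul(transpose(section(phi, theta)), rz(alpha)),
                  section(phi - alpha, theta))

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(len(A)) for j in range(len(A[0])))

def u3(alpha):
    """(7.163) in closed form: diag(z, conj(z)) with z = exp(i alpha / 2)."""
    z = cmath.exp(1j * alpha / 2)
    return [[z, 0], [0, z.conjugate()]]

def character(m, alpha):
    """The phase exp(i m alpha / 2) appearing in (7.164)."""
    return cmath.exp(1j * m * alpha / 2)

print(max_diff(cocycle(0.4, 1.1, 0.7), rz(0.4)))   # zero up to rounding
print(u3(2 * math.pi)[0][0])                        # close to -1, i.e. u_z = -1_2
print(character(1, 2 * math.pi), character(2, 2 * math.pi))
```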

• *G* = *L* ⋉ R<sup>3+1</sup>, the *Poincaré group*, where the *Lorentz group L* = *O*(3,1) consists of all real 4×4 matrices that leave the indefinite quadratic form

$$x^2 = x_0^2 - x_1^2 - x_2^2 - x_3^2 \tag{7.166}$$

invariant; in this context the standard coordinates on R<sup>4</sup> are labeled as (*x*<sub>0</sub>, *x*<sub>1</sub>, *x*<sub>2</sub>, *x*<sub>3</sub>). The Lorentz group has four connected components, which may be identified by the (independent) conditions det(λ) = ±1 and ±λ<sub>00</sub> ≥ 1. For simplicity we restrict ourselves to the connected component *L*<sub>+</sub><sup>↑</sup> of the identity, in which det(λ) = 1 and λ<sub>00</sub> ≥ 1. This group is called the *proper orthochronous Lorentz group*, which in turn defines the *proper orthochronous Poincaré group P*<sub>+</sub><sup>↑</sup> = *L*<sub>+</sub><sup>↑</sup> ⋉ R<sup>4</sup>. Writing *p*<sup>2</sup> = *p*<sub>0</sub><sup>2</sup> − *p*<sub>1</sub><sup>2</sup> − *p*<sub>2</sub><sup>2</sup> − *p*<sub>3</sub><sup>2</sup>, the *L*<sub>+</sub><sup>↑</sup>-orbits in (R<sup>4</sup>)<sup>∗</sup> = R<sup>4</sup> are seen to be:


Here the stabilizers *L*<sub>0</sub> are found by taking the reference points (±*m*, 0, 0, 0) in case 2, (±1, 0, 0, −1) in case 3, and (0, 0, 0, *m*) in case 4. The physically relevant cases are probably O<sup>+</sup><sub>*m*<sup>2</sup></sub> and O<sup>+</sup><sub>0</sub>. We pass straight to the universal covering group

$$\tilde{P}_{+}^{\uparrow} = SL(2, \mathbb{C}) \ltimes \mathbb{R}^{4},\tag{7.167}$$

where the covering projection *π̃* : *SL*(2, C) → *L*<sub>+</sub><sup>↑</sup> is given analogously to the case (5.46). We again start from the four matrices (σ<sub>0</sub>, σ<sub>1</sub>, σ<sub>2</sub>, σ<sub>3</sub>) in (5.42), and note:


Taking *a* = ∑<sub>μ</sub> *x*<sub>μ</sub>σ<sub>μ</sub>, it follows that for *λ̃* ∈ *SL*(2, C) and *x* ∈ R<sup>4</sup> there must be λ ∈ *O*(3,1) such that *λ̃* (∑<sub>μ</sub> *x*<sub>μ</sub>σ<sub>μ</sub>) *λ̃*<sup>∗</sup> = ∑<sub>μ</sub> (λ · *x*)<sub>μ</sub>σ<sub>μ</sub>. By continuity and the fact that *SL*(2, C) is connected it follows that in fact λ ∈ *L*<sub>+</sub><sup>↑</sup>, so we put *π̃*(*λ̃*) = λ. As for (5.46), the kernel is ker(*π̃*) = Z<sub>2</sub> = {±1<sub>2</sub>}. This enlarges the stabilizers:


On the one hand, this classification is a triumph of mathematical physics, but on the other hand, it fails to single out which cases actually occur in nature: as far as we know, these are spin *j* = 0 and *j* = ½ and helicity *n* = ±1 and *n* = ±2.

• *G* = *E*(3) ⋉ R<sup>4</sup>, the Galilei group, defined via the following *E*(3)-action on R<sup>4</sup>:

$$(R, \mathbf{v}) : (a_0, \mathbf{a}) \mapsto (a_0, R\mathbf{a} + a_0\mathbf{v}). \tag{7.168}$$

Note that **v** is physically interpreted as a velocity, whereas earlier **a** ∈ R<sup>3</sup> ⊂ *E*(3) was a position variable. This is clear from the defining *G*-action on R<sup>4</sup>, given by

$$(R, \mathbf{v}, a\_0, \mathbf{a}) : (t, \mathbf{x}) \mapsto (t + a\_0, R\mathbf{x} + \mathbf{a} + t\mathbf{v}),\tag{7.169}$$

which in fact determines the action (7.168). Either way, we obtain the group law

$$(\mathbf{R}, \mathbf{v}, a\_0, \mathbf{a}) \cdot (\mathbf{R}', \mathbf{v}', a\_0', \mathbf{a}') = (R\mathbf{R}', \mathbf{v} + R\mathbf{v}', a\_0 + a\_0', \mathbf{a} + R\mathbf{a}' + a\_0'\mathbf{v}). \tag{7.170}$$

We therefore see that the role of the Lorentz group *SO*(3,1) is now played by the Euclidean group *E*(3). Since from (7.170) the inverse is found to be

$$(\mathbf{R}, \mathbf{v}, a\_0, \mathbf{a})^{-1} = (\mathbf{R}^{-1}, -\mathbf{R}^{-1}\mathbf{v}, -a\_0, -\mathbf{R}^{-1}(\mathbf{a} - a\_0\mathbf{v})),\tag{7.171}$$

the dual *E*(3)-action on (R<sup>4</sup>)<sup>∗</sup> ≅ R<sup>4</sup> is given (in non-relativistic notation) by

$$(R, \mathbf{v}) : (E, \mathbf{p}) \mapsto (E - \langle \mathbf{v}, R\mathbf{p} \rangle, R\mathbf{p}). \tag{7.172}$$

Hence the dual *E*(3)-orbits in R<sup>4</sup> are labeled by *E* ∈ R and *r* > 0, as follows:

$$\mathcal{O}\_E = \{(E, \mathbf{0})\};\tag{7.173}$$

$$\mathcal{O}\_{(r)} = \{(E, \mathbf{p}), E \in \mathbb{R}, ||\mathbf{p}|| = r\}. \tag{7.174}$$

The representations of *G* corresponding to the first type are basically the representations of *E*(3), whereas in the second case the stability group of say (0, 0, 0, *r*) is isomorphic to *E*(2). None of the ensuing induced representations of *G* reproduces some recognizable version of non-relativistic quantum mechanics, for which we need to pass to projective representations of *G*. These may be found from Theorem 5.62, which here applies in full glory, since *H*<sup>2</sup>(𝔤, R) ≠ 0. A (lengthy) computation shows that *H*<sup>2</sup>(𝔤, R) has a single generator

$$\Phi((M, \mathbf{v}, a_0, \mathbf{a}), (M', \mathbf{v}', a_0', \mathbf{a}')) = \langle \mathbf{v}, \mathbf{a}' \rangle - \langle \mathbf{v}', \mathbf{a} \rangle,\tag{7.175}$$

where *M* ∈ 𝔰𝔬(3), and (**v**, *a*<sub>0</sub>, **a**) ∈ R<sup>3</sup> × R<sup>4</sup> ⊂ 𝔤 = 𝔰𝔬(3) ⊕ R<sup>3</sup> ⊕ R<sup>4</sup> are identified with the corresponding Lie group elements. Following the procedure culminating in Theorem 5.62, the central extension *Ǧ* is found to be (cf. (7.159) and (5.46))

$$
\check{G} = \tilde{E}(3) \ltimes \mathbb{R}^{5}, \tag{7.176}
$$

where, writing *π̃*(*u*) ≡ *R*(*u*), the covering group *Ẽ*(3) acts on R<sup>5</sup> through

$$(u, \mathbf{v}) : (a_0, \mathbf{a}, c) \mapsto (a_0, R(u)\mathbf{a} + a_0 \mathbf{v}, c + \tfrac{1}{2}a_0 \|\mathbf{v}\|^2 + \langle \mathbf{v}, R(u)\mathbf{a} \rangle). \tag{7.177}$$

Consequently, writing *x̃* = (*R*, **v**, *a*<sub>0</sub>, **a**), for the group law in *Ǧ* we obtain

$$(\tilde{x}, c) \cdot (\tilde{x}', c') = (\tilde{x} \cdot \tilde{x}', c + c' + \langle \mathbf{v}, R(u) \mathbf{a}' \rangle + \tfrac{1}{2} a_0' \|\mathbf{v}\|^2). \tag{7.178}$$

Eq. (7.177) implies the following dual *Ẽ*(3)-action on (R<sup>5</sup>)<sup>∗</sup> = R<sup>5</sup>:

$$(u, \mathbf{v}) : (E, \mathbf{p}, m) \mapsto (E - \langle \mathbf{v}, R(u)\mathbf{p} \rangle + \tfrac{1}{2}m\|\mathbf{v}\|^2, R(u)\mathbf{p} - m\mathbf{v}, m). \tag{7.179}$$

This time, the *Ẽ*(3)-orbits in R<sup>5</sup> are:


3. O<sub>*U*,*m*</sub> = {(*E*, **p**, *m*) | *E* − *E*<sub>**p**</sub> = *U*} (*m* ∈ R\\{0}, *U* ∈ R), with stabilizer *SU*(2).

Here *Ẽ*(2) ⊂ *Ẽ*(3) is a double cover of *E*(2), like the subgroup of *SL*(2, C) stabilizing the point (1, 0, 0, 1) ∈ R<sup>4</sup> in the theory of the Poincaré group. This time we take any point (*E*, 0, 0, *r*, 0) ∈ R<sup>5</sup>, which is stabilized by pairs (*u*, **v**) ∈ *Ẽ*(3) for which *R*(*u*) is a rotation around the *z*-axis and **v** = (*v*<sub>1</sub>, *v*<sub>2</sub>, 0); the image of these pairs in *E*(3) is *E*(2) = *SO*(2) ⋉ R<sup>2</sup>, where *SO*(2) ⊂ *SO*(3) consists of rotations around the *z*-axis and R<sup>2</sup> is the *x*-*y* plane. In the third case we write *E*<sub>**p**</sub> = **p**<sup>2</sup>/2*m* and take (*U*, 0, *m*), whose stabilizer in *E*(3) is evidently *SO*(3).
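The algebra of the Galilei group in (7.169) - (7.171) and the dual action (7.179) lends itself to a quick numerical sanity check. The sketch below represents group elements as tuples (*R*, **v**, *a*<sub>0</sub>, **a**) with *R* a rotation matrix (so the covering *SU*(2) → *SO*(3) is ignored), and verifies the inverse, the compatibility of the group law with the space-time action, and the invariance of *U* = *E* − **p**<sup>2</sup>/2*m*. The helper names and sample values are assumptions made for illustration.

```python
import math

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def mat(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def vec(A, x):
    return [sum(A[i][k] * x[k] for k in range(3)) for i in range(3)]

def lin(x, y, c=1.0):
    """Return x + c * y for vectors x, y in R^3."""
    return [x[i] + c * y[i] for i in range(3)]

def mul(g, h):
    """Group law (7.170) on tuples (R, v, a0, a)."""
    R, v, a0, a = g
    S, w, b0, b = h
    return (mat(R, S), lin(v, vec(R, w)), a0 + b0, lin(lin(a, vec(R, b)), v, b0))

def inv(g):
    """Inverse (7.171)."""
    R, v, a0, a = g
    Rt = [list(col) for col in zip(*R)]          # R^{-1} = R^T for rotations
    return (Rt, [-x for x in vec(Rt, v)], -a0, [-x for x in vec(Rt, lin(a, v, -a0))])

def act(g, event):
    """Action (7.169) on a space-time point (t, x)."""
    R, v, a0, a = g
    t, x = event
    return (t + a0, lin(lin(vec(R, x), a), v, t))

def dual(R, v, E, p, m):
    """Dual action (7.179); for m = 0 it reduces to (7.172)."""
    Rp = vec(R, p)
    dot = sum(v[i] * Rp[i] for i in range(3))
    return (E - dot + 0.5 * m * sum(x * x for x in v), lin(Rp, v, -m), m)

def flat(obj):
    """Flatten nested tuples/lists of numbers into one list, for comparisons."""
    if isinstance(obj, (list, tuple)):
        return [x for part in obj for x in flat(part)]
    return [obj]

def close(x, y, tol=1e-12):
    return all(abs(s - t) < tol for s, t in zip(flat(x), flat(y)))

g = (mat(rot_z(0.7), rot_x(-0.4)), [0.3, -1.2, 0.5], 2.0, [1.0, 0.0, -2.0])
h = (rot_x(1.1), [-0.6, 0.2, 0.9], -0.5, [0.4, 0.4, 0.1])
identity = (rot_z(0.0), [0.0] * 3, 0.0, [0.0] * 3)
event = (0.25, [1.0, 2.0, 3.0])

print(close(mul(g, inv(g)), identity))
print(close(act(mul(g, h), event), act(g, act(h, event))))
```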

Thus we have massless as well as massive particles both in relativistic and in nonrelativistic quantum physics. The simplest case of all is formed by massive nonrelativistic particles, which correspond to the orbits O<sub>*U*,*m*</sub> above, supplemented with a spin *j* labelling the underlying irreducible representation *D<sub>j</sub>* of *SU*(2). Such orbits are diffeomorphic to R<sup>3</sup> under the identification (*U* + *E*<sub>**p**</sub>, **p**, *m*) ↔ **p**, and a convenient choice of the cross-section *s* : O<sub>*U*,*m*</sub> → *Ẽ*(3) is *s*(**p**) = (1<sub>2</sub>, −**p**/*m*), since in that case the Wigner cocycle simply becomes *s*(**p**)<sup>−1</sup>(*u*, **v**)*s*((*u*, **v**)<sup>−1</sup>**p**) = *u*. Since different values of *U* turn out to give equivalent representations of *Ǧ* (in the sense explained at the end of §5.10), we take *U* = 0, and eqs. (7.135) - (7.136) become

$$
\tilde{H}^{m,j} = L^2(\mathbb{R}^3) \otimes H_j;\tag{7.180}
$$

$$\tilde{u}^{m,j}(u, \mathbf{v}, a_0, \mathbf{a})\tilde{\psi}(\mathbf{p}) = e^{i(a_0 E_{\mathbf{p}} + \langle \mathbf{a}, \mathbf{p} \rangle)} D_j(u) \tilde{\psi}(R(u)^{-1}(\mathbf{p} + m\mathbf{v})). \tag{7.181}$$

Here *L*<sup>2</sup>(R<sup>3</sup>) simply carries Lebesgue measure *d*<sup>3</sup>**p**, which is *Ẽ*(3)-invariant.
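The statement that the cross-section *s*(**p**) = (1<sub>2</sub>, −**p**/*m*) reduces the Wigner cocycle to *u* can be checked directly. In the following sketch, rotation matrices stand in for elements of *SU*(2) (an assumption that only loses the sign information of the double cover), elements of the group are pairs (*R*, **v**) with product (*R*, **v**)(*S*, **w**) = (*RS*, **v** + *R***w**), and the action on momenta is the middle component of (7.179). Function names and sample values are hypothetical.

```python
import math

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def mat(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def vec(A, x):
    return [sum(A[i][k] * x[k] for k in range(3)) for i in range(3)]

def mul(g, h):
    """Group law of SO(3) ⋉ R^3: (R, v)(S, w) = (RS, v + R w)."""
    (R, v), (S, w) = g, h
    Rw = vec(R, w)
    return (mat(R, S), [v[i] + Rw[i] for i in range(3)])

def inv(g):
    R, v = g
    Rt = [list(col) for col in zip(*R)]
    return (Rt, [-x for x in vec(Rt, v)])

def act(g, p, m):
    """Momentum part of the dual action (7.179): p -> R p - m v."""
    R, v = g
    Rp = vec(R, p)
    return [Rp[i] - m * v[i] for i in range(3)]

def section(p, m):
    """Cross-section s(p) = (1, -p/m); it maps the reference point p = 0 to p."""
    return (IDENTITY, [-x / m for x in p])

def cocycle(g, p, m):
    """Wigner cocycle s(p)^{-1} g s(g^{-1} p)."""
    return mul(mul(inv(section(p, m)), g), section(act(inv(g), p, m), m))

m = 1.5
g = (mat(rot_z(0.7), rot_x(-0.4)), [0.3, -1.2, 0.5])
p = [0.5, 2.0, -1.0]
R_part, v_part = cocycle(g, p, m)
print(v_part)   # the velocity part vanishes up to rounding; R_part equals g[0]
```

The argument of the wave-function in (7.181) is recovered as `act(inv(g), p, m)`, which equals *R*<sup>−1</sup>(**p** + *m***v**).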

The massive relativistic case is slightly more involved: we again have O<sup>+</sup><sub>*m*</sub> ≅ R<sup>3</sup> under (ω<sub>**p**</sub>, **p**) ↔ **p**, where ω<sub>**p**</sub> = √(**p**<sup>2</sup> + *m*<sup>2</sup>), but the Lorentz-invariant measure on O<sup>+</sup><sub>*m*</sub> is *d*<sup>3</sup>**p**/ω<sub>**p**</sub>. For each **p** ∈ R<sup>3</sup> there is a unique boost *b*<sub>**p**</sub> ∈ *L*<sub>+</sub><sup>↑</sup> that maps (*m*, 0, 0, 0) to (ω<sub>**p**</sub>, **p**), with pre-image *b̃*<sub>**p**</sub> in *SL*(2, C), so we take *s*(**p**) = *b̃*<sub>**p**</sub>. The Hilbert space is (*mutatis mutandis*) still given by (7.180), but instead of (7.181) we now obtain

$$\tilde{u}^{m,j}(\tilde{\lambda}, a)\tilde{\psi}(\mathbf{p}) = e^{i(a_0 \omega_{\mathbf{p}} - \langle \mathbf{a}, \mathbf{p} \rangle)} D_j(\tilde{b}_{\mathbf{p}}^{-1} \tilde{\lambda} \tilde{b}_{\lambda^{-1} \mathbf{p}}) \tilde{\psi}(\lambda^{-1} \mathbf{p}), \tag{7.182}$$

where *a* = (*a*<sub>0</sub>, **a**), *λ̃* ∈ *SL*(2, C), and λ ∈ *L*<sub>+</sub><sup>↑</sup> is the image of *λ̃* under the covering projection. We leave the corresponding formulae for the massless case to the reader.
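For concreteness, the boost *b*<sub>**p**</sub> used as cross-section in the massive relativistic case can be written down as a 4×4 matrix. The sketch below uses the standard closed form of a pure boost (an assumption in the sense that the text does not spell it out) and checks that it sends (*m*, 0, 0, 0) to (ω<sub>**p**</sub>, **p**) and preserves the quadratic form (7.166). Names and sample values are illustrative.

```python
import math

def boost(p, m):
    """4x4 matrix of the pure boost sending (m, 0, 0, 0) to (omega_p, p)."""
    omega = math.sqrt(sum(x * x for x in p) + m * m)
    b = [[0.0] * 4 for _ in range(4)]
    b[0][0] = omega / m
    for i in range(3):
        b[0][i + 1] = b[i + 1][0] = p[i] / m
        for j in range(3):
            b[i + 1][j + 1] = (1.0 if i == j else 0.0) + p[i] * p[j] / (m * (omega + m))
    return b

def apply(b, x):
    return [sum(b[i][k] * x[k] for k in range(4)) for i in range(4)]

def minkowski(x, y):
    """Bilinear form belonging to (7.166)."""
    return x[0] * y[0] - x[1] * y[1] - x[2] * y[2] - x[3] * y[3]

m, p = 2.0, [0.3, -0.4, 1.2]
omega = math.sqrt(sum(x * x for x in p) + m * m)
b = boost(p, m)

image = apply(b, [m, 0.0, 0.0, 0.0])
print(image, [omega] + p)

x, y = [1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 0.2, 2.0]
print(minkowski(apply(b, x), apply(b, y)), minkowski(x, y))
```

Since the entry `b[0][0]` equals ω<sub>**p**</sub>/*m* ≥ 1, the matrix is orthochronous, in line with *b*<sub>**p**</sub> ∈ *L*<sub>+</sub><sup>↑</sup>.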

#### 7.7 Quantization and permutation symmetry

Another interesting application of the quantization theory developed in this chapter is to *indistinguishable particles*. Since all elementary particles come in families of indistinguishable sorts (such as electrons, photons, . . . ), this topic is obviously of fundamental importance to physics. It is also puzzling, since (as we shall see) mathematically one expects more possibilities than those realized in Nature (namely bosons and fermions). This topic is also interesting philosophically, because it appears to be a testing ground for Leibniz's *Principle of the Identity of Indiscernibles* (PII), which states that two different objects cannot have exactly the same properties (in other words, two objects that have exactly the same properties must be identical).

After a period of confusion but growing insight, involving some of the greatest physicists such as Planck, Einstein, Ehrenfest, Fermi, and especially Heisenberg, the modern point of view on quantum statistics was introduced by Dirac.

Using modern notation, and abstracting from his specific example (which involved electronic wave-functions), Dirac's argument is as follows. Let *H* be the Hilbert space of a single quantum system, called a *particle* in what follows. The two-fold tensor product *<sup>H</sup>*<sup>2</sup> <sup>≡</sup> *<sup>H</sup>* <sup>⊗</sup> *<sup>H</sup>* then describes two distinguishable copies of this particle. The permutation group <sup>S</sup><sup>2</sup> on two objects, with nontrivial element (12), acts on the state space *<sup>H</sup>*<sup>2</sup> by linear extension of *<sup>u</sup>*(12)ψ<sup>1</sup> <sup>⊗</sup> <sup>ψ</sup><sup>2</sup> <sup>=</sup> <sup>ψ</sup><sup>2</sup> <sup>⊗</sup> <sup>ψ</sup>1. Praising Heisenberg's emphasis on defining everything in terms of observable quantities only, Dirac then declares the two particles to be indistinguishable if *u*(12)*au*(12)∗ = *a* for any two-particle observable *a*; by unitarity, this is to say that *a* commutes with *u*(12). Dirac notes that such operators map symmetrized vectors (i.e. those ψ ∈ *H* ⊗*H* for which *u*(12)ψ = ψ) into symmetrized vectors, and likewise map anti-symmetrized vectors (i.e. those ψ ∈ *H* ⊗*H* for which *u*(12)ψ = −ψ) into anti-symmetrized vectors, and these are the only possibilities; we would now say that under the action of the S2-invariant (bounded) operators one has

$$H^2 \cong H\_+^2 \oplus H\_-^2;\tag{7.183}$$

$$H\_+^2 = \{ \Psi \in H^2 \mid \mu(12)\Psi = \Psi \};\tag{7.184}$$

$$H\_{-}^{2} = \{ \Psi \in H^{2} \mid \mu(12)\Psi = -\Psi \}. \tag{7.185}$$

Arguing that in order to avoid double counting (in that ψ and *u*(12)ψ should not both occur as independent states) one has to pick one of these two possibilities, Dirac concludes that state vectors of a system of two indistinguishable particles must be either symmetric or anti-symmetric. He then generalizes this to *N* identical particles: if (*i j*) is the element of the permutation group <sup>S</sup>*<sup>N</sup>* on *<sup>N</sup>* objects that permutes *<sup>i</sup>* and *<sup>j</sup>* (*i*, *<sup>j</sup>* <sup>=</sup> <sup>1</sup>,...,*N*), then according to Dirac, <sup>ψ</sup> <sup>∈</sup> *<sup>H</sup><sup>N</sup>* <sup>≡</sup> *<sup>H</sup>*⊗*<sup>N</sup>* should satisfy either *<sup>u</sup>*(*i j*)<sup>ψ</sup> <sup>=</sup> <sup>ψ</sup>, in which case <sup>ψ</sup> <sup>∈</sup> *<sup>H</sup>*<sup>2</sup> <sup>+</sup>, or *<sup>u</sup>*(*i j*)<sup>ψ</sup> <sup>=</sup> <sup>−</sup>ψ, in which case <sup>ψ</sup> <sup>∈</sup> *<sup>H</sup>*<sup>2</sup> <sup>−</sup>, where *<sup>u</sup>* is the natural unitary representation of <sup>S</sup>*<sup>N</sup>* on *<sup>H</sup>N*, given, on *<sup>p</sup>* <sup>∈</sup> <sup>S</sup>*N*, by linear (and if necessary continuous) extension of

$$
u(p)\Psi\_1 \otimes \cdots \otimes \Psi\_N = \Psi\_{p(1)} \otimes \cdots \otimes \Psi\_{p(N)}.\tag{7.186}
$$

Equivalently, ψ ∈ *H*<sup>*N*</sup><sub>+</sub> if it is invariant under all permutations, and ψ ∈ *H*<sup>*N*</sup><sub>−</sub> if it is invariant under even permutations and picks up a minus sign under odd permutations.

A slightly more sophisticated version of this argument, which one often finds, runs as follows:

'Since, in the case of indistinguishable particles, ψ ∈ *H*<sup>*N*</sup> and *u*(*p*)ψ must represent the same state for any *p* ∈ 𝔖<sub>*N*</sub>, and since two unit vectors represent the same state iff they differ by a phase factor, by unitarity it must be that *u*(*p*)ψ = *c*(*p*)ψ, for some *c*(*p*) ∈ ℂ satisfying |*c*(*p*)| = 1. The group property *u*(*pp*′) = *u*(*p*)*u*(*p*′) then implies that *c*(*p*) = 1 for even permutations and *c*(*p*) = ±1 for odd permutations. The choice +1 in the latter leads to bosons, whereas −1 leads to fermions, so these are the only possibilities.'

Alas, where Dirac's argument is incomplete, this one is even inconsistent: the claim that two unit vectors represent the same state iff they differ by a phase factor presumes that the particles are distinguishable! Indeed, the only physical argument to the effect that two unit vectors ψ and ψ′ are equivalent iff ψ′ = *z*ψ with |*z*| = 1, is that it guarantees that expectation values coincide, i.e., that

$$
\langle \Psi , a \Psi \rangle = \langle \Psi' , a \Psi' \rangle , \tag{7.187}
$$

for *all* (bounded) operators *a*, i.e., not merely for the permutation-invariant operators (in which case (7.187) does not follow). But, following Heisenberg and Dirac, the whole point of having indistinguishable particles is that an operator *a* represents a physical observable iff it is invariant under all permutations (acting by conjugation)!

Although the above arguments therefore seem feeble at best, their conclusion that only bosons and fermions can exist seems validated by Nature, despite the mathematical fact that the orthogonal complement of *H*<sup>*N*</sup><sub>+</sub> ⊕ *H*<sup>*N*</sup><sub>−</sub> in *H*<sup>*N*</sup> (describing particles with *parastatistics*) is non-zero as soon as *N* > 2. This should be a source of concern, and indeed, much research on indistinguishable particles (in *d* > 2) has had the goal of *explaining away parastatistics*. Distinguished by the different actions of 𝔖<sub>*N*</sub> they depart from, these explanations have traditionally been based on:

• *Quantum observables.* 𝔖<sub>*N*</sub> acts on the C\*-algebra *B*(*H*<sup>*N*</sup>) of bounded operators on *H*<sup>*N*</sup> by conjugation with the unitary representation *u*(𝔖<sub>*N*</sub>) on *H*<sup>*N*</sup>, cf. (7.186). One implements permutation invariance by postulating that the physical observables of the *N*-particle system under consideration be the 𝔖<sub>*N*</sub>-invariant operators: with *u* given by (7.186), the algebra of observables is therefore taken to be

$$M\_N = B(H^N)^{\mathfrak{S}\_N} \equiv \{ a \in B(H^N) \mid [a, u(p)] = 0 \, (p \in \mathfrak{S}\_N) \}.\tag{7.188}$$

• *Quantum states.* By restriction, 𝔖<sub>*N*</sub> then also acts on the (normal) state space

$$S\_n(B(H^N)) \cong \mathcal{D}(H^N) \subset B(H^N),\tag{7.189}$$

from which it is postulated that the physical state space is 𝒟(*H*<sup>*N*</sup>)<sup>𝔖<sub>*N*</sub></sup>.

• *Classical states.* 𝔖<sub>*N*</sub> acts on *M*<sup>*N*</sup>, the *N*-fold cartesian product of the classical one-particle phase space *M*, by permutation. If *M* = *T*<sup>∗</sup>*Q* for some configuration space *Q*, we might as well start from the natural action of 𝔖<sub>*N*</sub> on *Q*<sup>*N*</sup> (pulled back to *M*<sup>*N*</sup>), and this is indeed what we shall do, often further simplifying to *Q* = ℝ<sup>*d*</sup>.

Unsurprisingly, the first two approaches are equivalent. Define a linear map

$$E\_N: \mathcal{B}(H^N) \to \mathcal{B}(H^N)^{\mathfrak{S}\_N};\tag{7.190}$$

$$a \mapsto \frac{1}{N!} \sum\_{p \in \mathfrak{S}\_N} u(p) a u(p)^\*; \tag{7.191}$$

this is a (normal) *conditional expectation* from the von Neumann algebra *B*(*H*<sup>*N*</sup>) to the von Neumann algebra *B*(*H*<sup>*N*</sup>)<sup>𝔖<sub>*N*</sub></sup>, i.e., *E*<sub>*N*</sub>(*a*<sup>∗</sup>) = *E*<sub>*N*</sub>(*a*)<sup>∗</sup> for all *a* ∈ *B*(*H*<sup>*N*</sup>), *E*<sub>*N*</sub><sup>2</sup> = *E*<sub>*N*</sub>, and ‖*E*<sub>*N*</sub>‖ = 1. Moreover, *E*<sub>*N*</sub> preserves positivity as well as the trace, so that it also maps the state space 𝒟(*H*<sup>*N*</sup>) onto the invariant states 𝒟(*H*<sup>*N*</sup>)<sup>𝔖<sub>*N*</sub></sup> ⊂ *B*(*H*<sup>*N*</sup>). Simple computations also establish the properties

$$\operatorname{Tr}\left(\rho a\right) = \operatorname{Tr}\left(E\_N(\rho)a\right) \quad (\rho \in \mathcal{D}(H^N),\ a \in B(H^N)^{\mathfrak{S}\_N}); \tag{7.192}$$

$$\operatorname{Tr}\left(\rho a\right) = \operatorname{Tr}\left(\rho E\_N(a)\right) \quad (\rho \in \mathcal{D}(H^N)^{\mathfrak{S}\_N},\ a \in B(H^N)). \tag{7.193}$$

Finally, the reduction of *H*<sup>*N*</sup> under *u*(𝔖<sub>*N*</sub>) described below may equally well be described in terms of the state space, since a subspace *eH*<sup>*N*</sup> ⊂ *H*<sup>*N*</sup> (where *e* ∈ 𝒫(*H*<sup>*N*</sup>) is a projection) is stable under *u* iff *e* ∈ 𝒫(*H*<sup>*N*</sup>)<sup>𝔖<sub>*N*</sub></sup>, in which case it may be described in terms of the associated density operator ρ = *e*/Tr(*e*) ∈ 𝒟(*H*<sup>*N*</sup>)<sup>𝔖<sub>*N*</sub></sup>. With some more effort, it can even be shown that ρ ∈ ∂<sub>*e*</sub>(𝒟(*H*<sup>*N*</sup>)<sup>𝔖<sub>*N*</sub></sup>) iff *eH*<sup>*N*</sup> is irreducible.
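The properties of the averaging map (7.190) - (7.191) can be checked by hand in the smallest nontrivial case. The following is a minimal sketch, assuming *N* = 2 and a toy value dim(*H*) = 2; the helper names (`matmul`, `trace`, `expectation`) and the sample operator and density matrix are hypothetical choices made for illustration. Exact rational arithmetic is used so that the identities can be tested with plain equality.

```python
from fractions import Fraction
from itertools import product

d = 2                                        # toy value of dim(H)
basis = list(product(range(d), repeat=2))    # labels (i, j) of e_i ⊗ e_j
dim = len(basis)


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(dim)) for c in range(dim)]
            for r in range(dim)]


def trace(a):
    return sum(a[k][k] for k in range(dim))


# u(12): e_i ⊗ e_j -> e_j ⊗ e_i, a real symmetric permutation matrix,
# so that u(12)^* = u(12).
swap = [[Fraction(int(basis[r] == basis[c][::-1])) for c in range(dim)]
        for r in range(dim)]


def expectation(a):
    """E_2(a) = (a + u(12) a u(12)^*) / 2, i.e. (7.191) for N = 2."""
    conj = matmul(matmul(swap, a), swap)
    return [[(a[r][c] + conj[r][c]) / 2 for c in range(dim)]
            for r in range(dim)]


# A generic (non-invariant) operator and a non-invariant density matrix.
a = [[Fraction(r * dim + c + 1) for c in range(dim)] for r in range(dim)]
weights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
rho = [[weights[r] if r == c else Fraction(0) for c in range(dim)]
       for r in range(dim)]

Ea, Erho = expectation(a), expectation(rho)
assert expectation(Ea) == Ea                    # E_N is idempotent
assert matmul(swap, Ea) == matmul(Ea, swap)     # E_N(a) commutes with u(12)
assert trace(Erho) == trace(rho) == 1           # the trace is preserved
# Tr(E_N(rho) a) = Tr(rho E_N(a)) for arbitrary rho and a.
assert trace(matmul(Erho, a)) == trace(matmul(rho, Ea))
```

The last assertion contains (7.192) and (7.193) as the special cases in which *a*, respectively ρ, is already invariant.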

We may therefore focus on the first and the third approaches, starting with the first, based on (7.188). Note that the C\*-algebra of invariant *compact* operators, i.e.,

$$A\_N = B\_0(H^N)^{\mathfrak{S}\_N} \equiv \{ a \in B\_0(H^N) \mid [a, u(p)] = 0 \, (p \in \mathfrak{S}\_N) \},\tag{7.194}$$

induces the same decomposition of *H*<sup>*N*</sup> as *M*<sub>*N*</sub> does (since *M*<sub>*N*</sub> = *A*<sub>*N*</sub>′′), so if *H* is infinite-dimensional one may use *A*<sub>*N*</sub> rather than *M*<sub>*N*</sub> as the algebra of quantum observables; this is convenient for comparison with the classical state space approach.

As long as dim(*H*) > 1 and *N* > 1, the algebras *MN* and *AN* act reducibly on *HN*. The reduction of *H<sup>N</sup>* under *MN* (and hence of *AN* and of *u*(*H*)*N*) is traditionally carried out by *Schur duality*. This rests on the following concepts.

Definition 7.11. • *A* partition λ *of N is a way of writing*

$$N = n\_1 + \dots + n\_k, \ n\_1 \ge \dots \ge n\_k > 0, \ k = 1, \dots, N. \tag{7.195}$$


The set Par(*N*) of all partitions λ of *N* parametrizes the conjugacy classes of 𝔖<sub>*N*</sub> and hence also the (unitary) dual of 𝔖<sub>*N*</sub>; in other words, up to (unitary) equivalence each (unitary) irreducible representation *u*<sub>λ</sub> of 𝔖<sub>*N*</sub> bijectively corresponds to some partition λ of *N*; the dimension of any vector space *V*<sub>λ</sub> carrying *u*<sub>λ</sub> is *N*<sub>λ</sub> = |𝒯<sup>*S*</sup><sub>λ</sub>|, that is, the number of different standard Young tableaux on the frame *F*<sub>λ</sub>.
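As a small illustration of the count *N*<sub>λ</sub>, the following sketch enumerates the partitions (7.195) and evaluates the number of standard Young tableaux for each frame. It assumes the hook length formula, a standard combinatorial result that is not stated in the text; the function names are hypothetical.

```python
from math import factorial


def partitions(n, max_part=None):
    """Yield the partitions of n as weakly decreasing tuples, cf. (7.195)."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def num_standard_tableaux(shape):
    """Count standard Young tableaux on the frame `shape`.

    Assumption: uses the hook length formula N! / (product of hook lengths).
    """
    if not shape:
        return 1
    n = sum(shape)
    # Column lengths of the frame: how many rows reach column j.
    cols = [sum(1 for row_len in shape if row_len > j)
            for j in range(shape[0])]
    hook_product = 1
    for i, row_len in enumerate(shape):
        for j in range(row_len):
            arm = row_len - j - 1   # boxes to the right in the same row
            leg = cols[j] - i - 1   # boxes below in the same column
            hook_product *= arm + leg + 1
    return factorial(n) // hook_product


for N in (2, 3, 4):
    dims = {lam: num_standard_tableaux(lam) for lam in partitions(N)}
    # Consistency check: the squared dimensions of the irreducible
    # representations of a finite group add up to its order, here N!.
    assert sum(k * k for k in dims.values()) == factorial(N)
    print(N, dims)
```

For *N* = 3 this reproduces the three partitions 3, 2 + 1 and 1 + 1 + 1, the middle one being the only frame with more than one standard tableau.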

Returning to (7.186), to each λ ∈ Par(*N*) and each Young tableau *T* ∈ 𝒯<sub>λ</sub> we associate an operator *e*<sub>*T*</sub> on *H*<sup>*N*</sup> by the formula

$$e\_T = \frac{N\_\lambda}{N!} \sum\_{p \in \text{Col}(T)} \text{sgn}(p)u(p) \sum\_{p' \in \text{Row}(T)} u(p'),\tag{7.196}$$

which happens to be a projection. Its image *e*<sub>*T*</sub>*H*<sup>*N*</sup> ⊂ *H*<sup>*N*</sup> is denoted by *H*<sup>*N*</sup><sub>*T*</sub>, and the restriction of *M*<sub>*N*</sub> to *H*<sup>*N*</sup><sub>*T*</sub> is called *M*<sub>*N*</sub>(*T*). One may now write the decomposition of *H*<sup>*N*</sup> under the action of *M*<sub>*N*</sub> (up to unitary equivalence) as

$$H^{N} \cong \bigoplus\_{\lambda \in \text{Par}(N)} H^{N}\_{T\_{\lambda}} \otimes V\_{\lambda},\tag{7.197}$$

$$M\_N \cong \bigoplus\_{\lambda \in \text{Par}(N)} M\_N(T\_\lambda) \otimes 1\_{V\_\lambda},\tag{7.198}$$

$$u(\mathfrak{S}\_N) \cong \bigoplus\_{\lambda \in \text{Par}(N)} 1\_{H\_{T\_{\lambda}}^N} \otimes u\_{\lambda},\tag{7.199}$$

where the labeling is by the partitions λ of *N*, the multiplicity spaces *V*<sub>λ</sub> are irreducible 𝔖<sub>*N*</sub>-modules, and *T*<sub>λ</sub> is an arbitrary choice of a Young tableau defined on *F*<sub>λ</sub>. For simplicity we here assume that dim(*H*) ≥ *N*; if dim(*H*) < *N*, then only partitions (7.195) with *k* ≤ dim(*H*) occur. For example, the partitions (7.195) of *N* = 2 are 2 = 2 and 2 = 1 + 1, each of which admits only one standard Young tableau, which we denote by *S* and *A*, respectively. With *N*<sub>2</sub> = *N*<sub>1+1</sub> = 1 and hence *V*<sub>2</sub> ≅ *V*<sub>1+1</sub> ≅ ℂ as vector spaces, this recovers (7.183); the corresponding projections *e*<sub>+</sub> and *e*<sub>−</sub>, respectively, are given by *e*<sub>+</sub> = ½(1 + *u*(12)) and *e*<sub>−</sub> = ½(1 − *u*(12)). The bosonic states ψ<sub>+</sub>, i.e., the solutions of ψ<sub>+</sub> ∈ *H*<sup>2</sup><sub>+</sub>, or *e*<sub>+</sub>ψ<sub>+</sub> = ψ<sub>+</sub>, are just the symmetric vectors, whereas the fermionic states ψ<sub>−</sub> ∈ *H*<sup>2</sup><sub>−</sub> are the antisymmetric ones. These sectors exist for all *N* > 1 and they always occur with multiplicity one.
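For finite-dimensional *H* the sizes of the bosonic and fermionic sectors can be compared with the size of *H*<sup>*N*</sup> directly. The sketch below assumes the standard dimension formulas for symmetric and antisymmetric tensors (binomial coefficients), which are not spelled out in the text; `sector_dimensions` is a hypothetical helper name.

```python
from math import comb


def sector_dimensions(d, n):
    """Return (dim H^N_+, dim H^N_-, remaining dimension) for dim(H) = d, N = n."""
    bosonic = comb(d + n - 1, n)    # symmetric tensors
    fermionic = comb(d, n)          # antisymmetric tensors (zero if n > d)
    return bosonic, fermionic, d ** n - bosonic - fermionic


for d, n in [(2, 2), (3, 2), (2, 3), (3, 3)]:
    print(d, n, sector_dimensions(d, n))
```

In these examples the remainder vanishes for *N* = 2, in line with (7.183), and is non-zero for *N* = 3.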

However, and this is the bite of the topic, for *N* ≥ 3 additional irreducible representations of *M*<sub>*N*</sub> appear, always with multiplicity greater than one; states in such sectors are said to describe *paraparticles* and/or are said to have *parastatistics*. For example, for *N* = 3 one new partition 3 = 2 + 1 occurs, with *N*<sub>2+1</sub> = 2, and hence

$$H^3 \cong H\_+^3 \oplus H\_-^3 \oplus H\_P^3 \oplus H\_{P'}^3,\tag{7.200}$$

where *H*<sup>3</sup><sub>*P*</sub> and *H*<sup>3</sup><sub>*P*′</sub> are the images of the projections *e*<sub>*P*</sub> = ⅓(1 − *u*(13))(1 + *u*(12)) and *e*<sub>*P*′</sub> = ⅓(1 − *u*(12))(1 + *u*(13)), respectively. The corresponding two classes of *parastates* (i.e. states carrying parastatistics) ψ<sub>*P*</sub> and ψ<sub>*P*′</sub> then by definition satisfy *e*<sub>*P*</sub>ψ<sub>*P*</sub> = ψ<sub>*P*</sub> and *e*<sub>*P*′</sub>ψ<sub>*P*′</sub> = ψ<sub>*P*′</sub>, respectively. In other words, the Hilbert spaces carrying each of the four sectors are the following closed linear spans:

$$H\_+^3 = \text{span}^- \{ \Psi\_{123} + \Psi\_{213} + \Psi\_{321} + \Psi\_{312} + \Psi\_{132} + \Psi\_{231} \};\qquad(7.201)$$

$$H\_{-}^{3} = \text{span}^{-}\{\Psi\_{123} - \Psi\_{213} - \Psi\_{321} + \Psi\_{312} - \Psi\_{132} + \Psi\_{231}\};\qquad(7.202)$$

$$H\_P^3 = \text{span}^-\{\Psi\_{123} + \Psi\_{213} - \Psi\_{321} - \Psi\_{312}\};\tag{7.203}$$

$$H\_{P'}^3 = \text{span}^- \{ \Psi\_{123} + \Psi\_{321} - \Psi\_{213} - \Psi\_{231} \},\tag{7.204}$$

where ψ*ijk* ≡ ψ*<sup>i</sup>* ⊗ψ*<sup>j</sup>* ⊗ψ*<sup>k</sup>* and the ψ*<sup>i</sup>* vary over *H* (and span<sup>−</sup> is closed linear span).
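The four operators behind (7.200) can be written out as explicit matrices in a small example. The following sketch assumes *N* = 3 and a toy value dim(*H*) = 2, builds *u*(*p*) from (7.186) on the product basis, and checks that each of the four operators is idempotent and that they add up to the identity; the helper names (`u`, `combine`, `sign`) are hypothetical, and exact fractions are used so that equality tests are meaningful.

```python
from fractions import Fraction
from itertools import permutations, product

d, n = 2, 3                                   # toy choice: dim(H) = 2, N = 3
basis = list(product(range(d), repeat=n))     # labels of e_i ⊗ e_j ⊗ e_k
index = {b: k for k, b in enumerate(basis)}
dim = len(basis)


def u(p):
    """Matrix of u(p) from (7.186); p is a 0-based tuple (p(1), ..., p(N))."""
    m = [[Fraction(0)] * dim for _ in range(dim)]
    for col, b in enumerate(basis):
        m[index[tuple(b[k] for k in p)]][col] = Fraction(1)
    return m


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(dim)) for c in range(dim)]
            for r in range(dim)]


def combine(terms):
    """Linear combination of matrices, given as (coefficient, matrix) pairs."""
    return [[sum(c * m[r][col] for c, m in terms) for col in range(dim)]
            for r in range(dim)]


def sign(p):
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return -1 if inversions % 2 else 1


def trace(a):
    return sum(a[k][k] for k in range(dim))


one = u((0, 1, 2))
u12, u13 = u((1, 0, 2)), u((2, 1, 0))
third, sixth = Fraction(1, 3), Fraction(1, 6)

e_plus = combine([(sixth, u(p)) for p in permutations(range(n))])
e_minus = combine([(sixth * sign(p), u(p)) for p in permutations(range(n))])
e_P = matmul(combine([(third, one), (-third, u13)]),
             combine([(1, one), (1, u12)]))
e_Pp = matmul(combine([(third, one), (-third, u12)]),
              combine([(1, one), (1, u13)]))

sectors = [e_plus, e_minus, e_P, e_Pp]
assert all(matmul(e, e) == e for e in sectors)       # each e is idempotent
assert combine([(1, e) for e in sectors]) == one     # they sum to the identity
# The trace of an idempotent equals the dimension of its image.
print([int(trace(e)) for e in sectors])
```

Since dim(*H*) = 2 < *N* in this toy case, the antisymmetric sector is empty, which illustrates the remark that only partitions with *k* ≤ dim(*H*) occur.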

For any *N* > 2, let us note that instead of the decomposition (7.197) - (7.198), which is defined up to unitary equivalence, one may alternatively decompose *H<sup>N</sup>* as

$$H^N = \bigoplus\_{T \in \mathcal{T}\_\lambda^S, \lambda \in \text{Par}(N)} H\_T^N;\tag{7.205}$$

$$M\_N = \bigoplus\_{T \in \mathcal{T}\_\lambda^S, \lambda \in \text{Par}(N)} M\_N(T),\tag{7.206}$$

which has the advantage over (7.197) - (7.198) that the *H*<sup>*N*</sup><sub>*T*</sub> are subspaces of *H*<sup>*N*</sup>. The disadvantage is that *M*<sub>*N*</sub>(*T*) is unitarily equivalent to *M*<sub>*N*</sub>(*T*′) iff *T* and *T*′ both lie in 𝒯<sup>*S*</sup><sub>λ</sub> (i.e., for the same λ), so that unlike (7.197) - (7.198), the decomposition (7.205) - (7.206) is non-unique (for example, Young tableaux different from standard ones might have been chosen in the parametrization). The analogue of the third line (7.199) in the earlier decomposition would therefore be a mess. Indeed, although 𝔖<sub>*N*</sub> maps each of the subspaces *H*<sup>*N*</sup><sub>+</sub> and *H*<sup>*N*</sup><sub>−</sub> into itself (the former is even pointwise invariant under 𝔖<sub>*N*</sub>, whereas elements of the latter at most pick up a minus sign), this is no longer the case for parastatistics. For example, for *N* = 3 some permutations map *H*<sup>3</sup><sub>*P*</sub> into *H*<sup>3</sup><sub>*P*′</sub>, and *vice versa*. This is clear from (7.205) - (7.206): for λ = *P*, one has dim(*V*<sub>*P*</sub>) = 2, and choosing a basis (υ<sub>1</sub>, υ<sub>2</sub>) of *V*<sub>*P*</sub> one may identify *H*<sup>⊗3</sup><sub>*P*</sub> and *H*<sup>⊗3</sup><sub>*P*′</sub> in (7.205) with (say) *H*<sup>⊗3</sup><sub>*P*</sub> ⊗ υ<sub>1</sub> and *H*<sup>⊗3</sup><sub>*P*</sub> ⊗ υ<sub>2</sub> in (7.197), respectively. And analogously for *N* > 3, where dim(*V*<sub>λ</sub>) > 1 for all λ ≠ *S*, *A*.

A (or perhaps *the*) competing approach to permutation invariance in quantum mechanics starts from classical (rather than quantal) data. Let *Q* be the classical single-particle configuration space, e.g., *Q* = ℝ<sup>*d*</sup>; to avoid irrelevant complications, we assume that *Q* is a connected and simply connected manifold. The associated configuration space of *N* identical but distinguishable particles is *Q*<sup>*N*</sup>. Depending on the assumption of (im)penetrability of the particles, we may define one of

$$
\check{\mathcal{Q}}\_N = \mathcal{Q}^N / \mathfrak{S}\_N; \tag{7.207}
$$

$$\mathcal{Q}\_N = (\mathcal{Q}^N \backslash \Delta\_N) / \mathfrak{S}\_N,\tag{7.208}$$

as the configuration space of *N* indistinguishable particles, where Δ<sub>*N*</sub> is the extended diagonal in *Q*<sup>*N*</sup>, i.e., the set of points (*q*<sub>1</sub>,...,*q*<sub>*N*</sub>) ∈ *Q*<sup>*N*</sup> where *q*<sub>*i*</sub> = *q*<sub>*j*</sub> for at least one pair (*i*, *j*), *i* ≠ *j* (so that for *Q* = ℝ and *N* = 2 this is the usual diagonal in ℝ<sup>2</sup>). At first sight, these two choices should lead to exactly the same quantum theory, based on the Hilbert space *L*<sup>2</sup>(*Q̆*<sub>*N*</sub>) = *L*<sup>2</sup>(*Q*<sub>*N*</sub>), since Δ<sub>*N*</sub> is a subset of measure zero for any measure used to define *L*<sup>2</sup> that is locally equivalent to Lebesgue measure.

However, the effect of Δ<sub>*N*</sub> is noticeable as soon as one represents physical observables as operators on *L*<sup>2</sup> through any serious quantization procedure, which should be sensitive to both the topological and the smooth structure of the underlying configuration space. In the case at hand, *Q*<sub>*N*</sub> is multiply connected as a topological space, but as a manifold it is smooth and has no singularities. In contrast, *Q̆*<sub>*N*</sub> is simply connected as a topological space, but in the smooth setting it is a so-called *orbifold*. This leads to interesting complications, but following tradition (i.e., in the configuration space approach to indistinguishable particles) we continue with *Q*<sub>*N*</sub>.

To quantize *Q*<sub>*N*</sub> we use the language of Lie groupoids and their C\*-algebras, cf. §§C.16–C.17. Let *Q* be any (possibly) multiply connected manifold, with universal covering space *Q̃*. In particular, the first homotopy group π<sub>1</sub>(*Q*) acts (say from the right) on *Q̃* in such a way that *Q* = *Q̃*/π<sub>1</sub>(*Q*). We denote the canonical projection by π : *Q̃* → *Q*. One may have the example *Q* = 𝕋, *Q̃* = ℝ, π<sub>1</sub>(*Q*) = ℤ in mind here.

As a variation on the pair groupoid *G* = *Q*×*Q*, we now consider the Lie groupoid

$$
\tilde{G}\_{\mathcal{Q}} = \tilde{\mathcal{Q}} \times\_{\pi\_1(\mathcal{Q})} \tilde{\mathcal{Q}},
\tag{7.209}
$$

whose elements are equivalence classes [*q̃*<sub>1</sub>, *q̃*<sub>2</sub>] in *Q̃* × *Q̃* under the equivalence relation ∼ defined by (*q̃*<sub>1</sub>, *q̃*<sub>2</sub>) ∼ (*q̃*′<sub>1</sub>, *q̃*′<sub>2</sub>) iff *q̃*′<sub>1</sub> = *q̃*<sub>1</sub>*x* and *q̃*′<sub>2</sub> = *q̃*<sub>2</sub>*x* for some *x* ∈ π<sub>1</sub>(*Q*); the source and target projections are *s*([*q̃*<sub>1</sub>, *q̃*<sub>2</sub>]) = π(*q̃*<sub>2</sub>) and *t*([*q̃*<sub>1</sub>, *q̃*<sub>2</sub>]) = π(*q̃*<sub>1</sub>), respectively, the inverse is [*q̃*<sub>1</sub>, *q̃*<sub>2</sub>]<sup>−1</sup> = [*q̃*<sub>2</sub>, *q̃*<sub>1</sub>], and multiplication is the obvious one borrowed from the pair groupoid *Q̃* × *Q̃* over *Q̃* (which is well defined on *G̃*<sub>*Q*</sub>). The tangent groupoid *G̃*<sup>*T*</sup><sub>*Q*</sub> of *G̃*<sub>*Q*</sub> (cf. Proposition C.117) has the following fiber at ħ = 0:

$$(\tilde{G}\_{\mathcal{Q}})\_0^T = T\mathcal{Q},\tag{7.210}$$

to be contrasted with the corresponding fiber *G*<sup>*T*</sup><sub>0</sub> = *TQ̃* of the pair groupoid on the covering space *Q̃*. In particular, for our configuration space *Q* = *Q*<sub>*N*</sub> we have

$$
\tilde{G}\_{\mathcal{Q}\_N} = \tilde{\mathcal{Q}}\_N \times\_{\pi\_1(\mathcal{Q}\_N)} \tilde{\mathcal{Q}}\_N; \tag{7.211}
$$

$$(\tilde{G}\_{\mathcal{Q}\_N})\_0^T = T\mathcal{Q}\_{N},\tag{7.212}$$

which gives the fibers of the corresponding continuous bundle of C\*-algebras as

$$A\_0 = \mathcal{C}\_0(T^\*Q\_N) \ (\hbar = 0);\tag{7.213}$$

$$A\_{\hbar} = C^\*(\tilde{G}\_{\mathcal{Q}\_N}) \quad (0 < \hbar \le 1), \tag{7.214}$$

cf. §C.19. This gives a generalization of the fibers (7.17) - (7.18) for *Q* = ℝ<sup>*n*</sup>, and also now we have an example of Definition 7.1: the fibers (7.213) - (7.214) combine to form a continuous bundle of C\*-algebras with total C\*-algebra *A* = *C*<sup>∗</sup>(*G̃*<sup>*T*</sup><sub>*Q*</sub>), yielding a deformation quantization of the Poisson manifold *T*<sup>∗</sup>*Q*<sub>*N*</sub> (i.e., the usual phase space defined by the configuration space *Q*<sub>*N*</sub>). We now define the *inequivalent quantizations* of *Q*<sub>*N*</sub> as the inequivalent irreducible representations of the corresponding C\*-algebra of quantum observables *C*<sup>∗</sup>(*G̃*<sub>*Q*<sub>*N*</sub></sub>), as follows.


$$H^{\lambda} = L^2(\mathcal{Q}) \otimes H\_{\lambda},\tag{7.215}$$

*where H*<sub>λ</sub> *is a specific carrier space for the representation u*<sub>λ</sub>*. More fancifully, one may use the Hilbert space L*<sup>2</sup>(*Q*, *E*<sub>λ</sub>) *of L*<sup>2</sup>*-sections of the vector bundle*

$$E\_{\lambda} = \tilde{\mathcal{Q}} \times\_{\pi\_1(\mathcal{Q})} H\_{\lambda} \tag{7.216}$$

*associated to the principal bundle* π : *Q̃* → *Q by the representation u*<sub>λ</sub>*.*

Provided one accepts (7.208), this theorem in principle gives a complete solution to the problem of quantizing multiply connected configuration spaces, and hence, taking *Q* = *QN*, of the problem of quantizing systems of indistinguishable particles.

*Proof.* We just prove Theorem 7.12 in the case we need, where π1(*Q*) is finite. Then

$$\mathcal{C}^\*(\tilde{\mathcal{Q}} \times\_{\pi\_1(\mathcal{Q})} \tilde{\mathcal{Q}}) \cong B\_0(L^2(\tilde{\mathcal{Q}}))^{\pi\_1(\mathcal{Q})};\tag{7.217}$$

$$B\_0(L^2(\tilde{Q}))^{\pi\_1(Q)} \cong B\_0(L^2(Q)) \otimes C^\*(\pi\_1(Q)),\tag{7.218}$$

where (in our usual notation) *B*<sub>0</sub>(*L*<sup>2</sup>(*Q̃*))<sup>π<sub>1</sub>(*Q*)</sup> is the C\*-algebra of π<sub>1</sub>(*Q*)-invariant compact operators on *L*<sup>2</sup>(*Q̃*), and *C*<sup>∗</sup>(π<sub>1</sub>(*Q*)) is the group C\*-algebra of π<sub>1</sub>(*Q*) (which is finite-dimensional and hence nuclear, given the assumption that π<sub>1</sub>(*Q*) is finite, so that the choice of the C\*-algebraic tensor product does not matter).

To prove (7.217), we first exploit finiteness of π<sub>1</sub>(*Q*) in order to identify functions *ã* ∈ *C*<sup>∞</sup><sub>*c*</sub>(*G̃*<sub>*Q*</sub>) with constrained *C*<sup>∞</sup><sub>*c*</sub> functions *a* on *Q̃* × *Q̃* that satisfy

$$a(\tilde{q}h, \tilde{q}'h) = a(\tilde{q}, \tilde{q}') \ (h \in \pi\_1(Q)).\tag{7.219}$$

This identification is explicitly given by

$$a(\tilde{q}, \tilde{q}') = \tilde{a}([\tilde{q}, \tilde{q}']),\tag{7.220}$$

where [*q̃*, *q̃*′] denotes the equivalence class of (*q̃*, *q̃*′) ∈ *Q̃* × *Q̃* under the diagonal action of π<sub>1</sub>(*Q*). This makes the space *C*<sup>∞</sup><sub>*c*</sub>(*G̃*<sub>*Q*</sub>) a dense subset of *C*<sup>∗</sup>(*G̃*<sub>*Q*</sub>). We write *a* ∈ *C*<sup>∞</sup><sub>*c*</sub>(*Q̃* × *Q̃*)<sup>π<sub>1</sub>(*Q*)</sup>; for (7.208) this just means that *a* is a permutation-invariant kernel. Second, we equip *Q̃* with some measure *dq̃* that is locally equivalent to the Lebesgue measure, and in addition is π<sub>1</sub>(*Q*)-invariant under the regular action *R* of π<sub>1</sub>(*Q*) on functions on *Q̃*, given, as usual, by *R*<sub>*h*</sub>ψ̃(*q̃*) = ψ̃(*q̃h*). In that case, one also has a measure *dq* on *Q* that is locally equivalent to the Lebesgue measure, so that the measures *dq̃* and *dq* on *Q̃* and *Q*, respectively, are related by

$$\int\_{\tilde{Q}} d\tilde{q} \, f(\tilde{q}) = \frac{1}{|\pi\_1(Q)|} \sum\_{h \in \pi\_1(Q)} \int\_{\mathcal{Q}} dq \, f(s(q)h). \tag{7.221}$$

Here *f* ∈ *C*<sub>*c*</sub>(*Q̃*), |π<sub>1</sub>(*Q*)| is the number of elements of π<sub>1</sub>(*Q*), and *s* : *Q* → *Q̃* is any (measurable) cross-section of τ : *Q̃* → *Q*. We may then define a Hilbert space *L*<sup>2</sup>(*Q̃*) with respect to *dq̃*, on which elements *a* of *C*<sup>∞</sup><sub>*c*</sub>(*Q̃* × *Q̃*)<sup>π<sub>1</sub>(*Q*)</sup> act faithfully by

$$a\Psi(\tilde{q}) = \int\_{\tilde{\mathcal{Q}}} d\tilde{q}' a(\tilde{q}, \tilde{q}') \Psi(\tilde{q}'). \tag{7.222}$$

The product of two such operators is given by the multiplication of the kernels on *Q*˜, and involution is defined as expected, too, namely by hermitian conjugation:

$$a^\*(\tilde{q}, \tilde{q}') = \overline{a(\tilde{q}', \tilde{q})}.\tag{7.223}$$

The norm-closure of *C*<sup>∞</sup><sub>*c*</sub>(*Q̃* × *Q̃*)<sup>π<sub>1</sub>(*Q*)</sup>, represented as operators on *L*<sup>2</sup>(*Q̃*) by (7.222), is then given by *B*<sub>0</sub>(*L*<sup>2</sup>(*Q̃*))<sup>π<sub>1</sub>(*Q*)</sup>. This proves (7.217).

Eq. (7.218) is a special case of the following: let *X* be a manifold carrying a *free* action of a *compact* group *G*. If *L*2(*X*) is defined by some *G*-invariant "locally Lebesgue" measure on *X*, as in the construction above, then one has an isomorphism

$$B\_0(L^2(X))^G \cong B\_0(L^2(X/G)) \otimes C^\*(G). \tag{7.224}$$

This is proved in a similar way, realizing *B*<sub>0</sub>(*H*) as the norm-completion of the Hilbert–Schmidt operators *B*<sub>2</sub>(*H*) (for general *H*), and, in the *L*<sup>2</sup>-case at hand, identifying *B*<sub>2</sub>(*L*<sup>2</sup>(*X*)) with the algebra of operators with kernels in *L*<sup>2</sup>(*X* × *X*).

Part 2 of the theorem now follows from the fact that for any Hilbert space *H* the C\*-algebra *B*<sub>0</sub>(*H*) of compact operators on *H* has exactly one irreducible representation (up to unitary equivalence), i.e. the defining one (this can be proved in many ways, e.g. from Rieffel's theory of Morita equivalence of C\*-algebras), combined with the bijective correspondence between continuous unitary representations *u* of any locally compact group *G* and non-degenerate representations of its associated group C\*-algebra *C*<sup>∗</sup>(*G*); see §C.18, Definition C.119 etc. □

As mentioned in Theorem 7.12, there are two ways of realizing the Hilbert space *H*<sup>λ</sup>, where λ labels some irreducible representation of π<sub>1</sub>(*Q*). This is very similar to the discussion in §7.5, so we will be relatively brief here. The first realization corresponds to having constrained wave-functions defined on the covering space *Q̃*; for example, the usual description of bosonic or fermionic wave-functions is of this sort. The second realization uses unconstrained wave-functions on the actual configuration space *Q* (*bad hombres* confusingly call such functions "multi-valued").

1. The space *C*<sup>∞</sup>(*Q*, *E*<sub>λ</sub>) of *smooth* cross-sections of *E*<sub>λ</sub> may be given by the smooth maps ψ̃ : *Q̃* → *H*<sub>λ</sub> satisfying the equivariance condition ("constraint")

$$
\tilde{\Psi}(\tilde{q}h) = u\_{\lambda}(h^{-1})\tilde{\Psi}(\tilde{q}),
\tag{7.225}
$$

for all *h* ∈ π<sub>1</sub>(*Q*), *q̃* ∈ *Q̃*. The Hilbert space

$$H^{\lambda} = L^2(\tilde{\mathcal{Q}}, H\_{\lambda})^{\pi\_1(\mathcal{Q})},\tag{7.226}$$

then, is defined as the usual *L*<sup>2</sup>-completion of the space of all ψ̃ ∈ Γ(*Q*, *E*<sub>λ</sub>) for which ⟨ψ̃, ψ̃⟩ < ∞. The irreducible representation π<sup>λ</sup>(*C*<sup>∗</sup>(*G̃*<sub>*Q*</sub>)) is then given on elements *ã* of the dense subspace *C*<sup>∞</sup><sub>*c*</sub>(*G̃*<sub>*Q*</sub>) of *C*<sup>∗</sup>(*G̃*<sub>*Q*</sub>) by the expression

$$
\pi^{\lambda}(\tilde{a})\psi(\tilde{q}) = \int\_{\tilde{Q}} d\tilde{q}' \,\tilde{a}([\tilde{q}, \tilde{q}'])\,\psi(\tilde{q}');\tag{7.227}
$$

any π1(*Q*)-invariant operator on *L*2(*Q*˜) acts on *H*<sup>λ</sup> in this way (ignoring *H*<sup>λ</sup> ). If π1(*Q*) is finite, then two simplifications occur. Firstly, *H*<sup>λ</sup> is finite-dimensional, and secondly each Hilbert space *H*<sup>λ</sup> may be regarded as a subspace of *L*2(*Q*˜); the above action of *C*∗(*GQ*) on *H*<sup>λ</sup> is then simply given by restriction of its action on *L*2(*Q*˜). In that case one may equivalently realize this irreducible representation in terms of the right-hand side of (7.217), in which case the action of π<sup>λ</sup> (*a*) on *H*<sup>λ</sup> as defined in (7.226) is given by

$$
\pi^{\lambda}(a)\psi(\tilde{q}) = \int\_{\tilde{Q}} d\tilde{q}' a(\tilde{q}, \tilde{q}')\psi(\tilde{q}').\tag{7.228}
$$

This is true as it stands if *a* ∈ *C*<sup>∞</sup><sub>*c*</sub>(*Q̃* × *Q̃*)<sup>π<sub>1</sub>(*Q*)</sup>, cf. (7.219), and may be extended to general π<sub>1</sub>(*Q*)-invariant compact operators *a* ∈ *B*<sub>0</sub>(*L*<sup>2</sup>(*Q̃*))<sup>π<sub>1</sub>(*Q*)</sup> by norm continuity, and, furthermore, even to *B*(*L*<sup>2</sup>(*Q̃*))<sup>π<sub>1</sub>(*Q*)</sup> by strong or weak continuity.

2. Elements of the Hilbert space *L*<sup>2</sup>(*Q̃*, *H*<sub>λ</sub>)<sup>π<sub>1</sub>(*Q*)</sup> are typically (equivalence classes of) *discontinuous* cross-sections of *E*<sub>λ</sub>. Possibly discontinuous cross-sections may simply be given directly as functions ψ : *Q* → *H*<sub>λ</sub>, with inner product

$$
\langle \Psi , \Phi \rangle = \int\_{\mathcal{Q}} dq \, \langle \Psi(q), \Phi(q) \rangle\_{H\_{\lambda}}.\tag{7.229}
$$

This specific realization of *L*<sup>2</sup>(*Q*, *E*<sub>λ</sub>) will be denoted by *L*<sup>2</sup>(*Q*) ⊗ *H*<sub>λ</sub>. If *H*<sub>λ</sub> = ℂ,

$$L^2(\mathcal{Q}) \otimes H\_{\lambda} \cong L^2(\mathcal{Q}).\tag{7.230}$$

These equivalent descriptions of π<sup>λ</sup> may be related once a (typically discontinuous) cross-section σ : *Q* → *Q̃* of the projection τ : *Q̃* → *Q* has been chosen (i.e., τ ∘ σ = id<sub>*Q*</sub>), in which case ψ(*q*) = ψ̃(σ(*q*)). We formalize this in terms of a unitary

$$u: L^2(\tilde{\mathcal{Q}}, H\_{\lambda})^{\pi\_1(\mathcal{Q})} \to L^2(\mathcal{Q}) \otimes H\_{\lambda}; \tag{7.231}$$

$$
u \tilde{\Psi}(q) = \tilde{\Psi}(\sigma(q));\tag{7.232}
$$

$$
u^{-1}\Psi(\tilde{q}) = u\_{\lambda}(h)\Psi(q),\tag{7.233}
$$

where *q* = τ(*q̃*), and *h* is the unique element of π<sub>1</sub>(*Q*) for which *q̃h* = σ(*q*). The action π<sup>λ</sup><sub>σ</sub>(*a*) = *u*π<sup>λ</sup>(*a*)*u*<sup>−1</sup> on *L*<sup>2</sup>(*Q*) ⊗ *H*<sub>λ</sub> now follows from (7.228) - (7.233): if *a* is a π<sub>1</sub>(*Q*)-invariant kernel on *L*<sup>2</sup>(*Q̃*), then using (7.221) we obtain

$$\pi\_{\sigma}^{\lambda}(a)\Psi(q) = \sum\_{h \in \pi\_1(\mathcal{Q})} \int\_{\mathcal{Q}} dq' a(\sigma(q), \sigma(q')h)u\_{\lambda}(h)\Psi(q'). \tag{7.234}$$

We now apply this formalism to *N* indistinguishable particles moving on the (single-particle) configuration space ℝ<sup>3</sup>. Eq. (7.208) then gives the *N*-particle space

$$\mathcal{Q}\_N = ( (\mathbb{R}^3)^N - \Delta\_N ) / \mathfrak{S}\_N. \tag{7.235}$$

The universal covering space of this multiply connected space is

$$\tilde{\mathcal{Q}}\_N = \mathring{\mathbb{R}}^{3N} \equiv (\mathbb{R}^3)^N - \Delta\_N,\tag{7.236}$$

which (unlike its counterpart in *d* = 2) is connected and simply connected, so that

$$
\pi\_1(Q\_N) = \mathfrak{S}\_N. \tag{7.237}
$$

It follows from (7.217) and (7.237) that the algebra of observables is given by

$$C^\*(\tilde{G}\_{\mathcal{Q}\_N}) = B\_0(L^2(\mathbb{R}^3)^{\otimes N})^{\mathfrak{S}\_N}.\tag{7.238}$$

Comparing (7.238) with (7.194), we obtain a complete equivalence between the "quantum observables" approach and the deformation quantization approach based on Theorem 7.12, in that the configuration space approach through the representation theory of the groupoid C\*-algebra *C*<sup>∗</sup>(*G̃*<sub>*Q*<sub>*N*</sub></sub>) leads to the same classification as the "quantum observables" approach based on (7.188) above, cf. (7.197) - (7.199).

We discuss a few interesting special cases.

N = 1. Here *Q̃*<sub>1</sub> = *Q*<sub>1</sub> = ℝ<sup>3</sup> and π<sub>1</sub>(*Q*<sub>1</sub>) = {*e*}, so the algebra of observables is

$$C^\*(\tilde{G}\_{\mathcal{Q}\_1}) = B\_0(L^2(\mathbb{R}^3)),\tag{7.239}$$

which has a unique irreducible representation on *L*<sup>2</sup>(ℝ<sup>3</sup>).

N = 2. This time, the pertinent homotopy group is

$$
\pi\_1(Q\_2) = \mathfrak{S}\_2 = \mathbb{Z}\_2 = \{e, (12)\},
\tag{7.240}
$$

which has two irreducible representations: firstly, *u*<sub>*B*</sub>(*p*) = 1 for both *p* ∈ 𝔖<sub>2</sub>, and secondly, *u*<sub>*F*</sub>(*e*) = 1, *u*<sub>*F*</sub>(12) = −1, each realized on *H*<sub>λ</sub> = ℂ. Hence with *q* = (*x*, *y*, *z*) ∈ ℝ<sup>3</sup>, eq. (7.225) yields

$$H\_B^2 = \{ \Psi \in L^2(\mathbb{R}^3)^2 \mid \Psi(q\_2, q\_1) = \Psi(q\_1, q\_2) \};\tag{7.241}$$

$$H\_F^2 = \{ \Psi \in L^2(\mathbb{R}^3)^2 \mid \Psi(q\_2, q\_1) = -\Psi(q\_1, q\_2) \}. \tag{7.242}$$

Here *L*<sup>2</sup>(ℝ<sup>3</sup>)<sup>2</sup> ≡ *L*<sup>2</sup>(ℝ<sup>3</sup>) ⊗ *L*<sup>2</sup>(ℝ<sup>3</sup>) ≅ *L*<sup>2</sup>(ℝ<sup>6</sup>). The C\*-algebra

$$C^\*(\tilde{G}\_{\mathcal{Q}\_2}) = B\_0(L^2(\mathbb{R}^3) \otimes L^2(\mathbb{R}^3))^{\mathfrak{S}\_2} \cong B\_0(L^2(\mathbb{R}^3 \times \mathbb{R}^3))^{\mathfrak{S}\_2} \tag{7.243}$$

consists of all 𝔖<sub>2</sub>-invariant compact operators on *L*<sup>2</sup>(ℝ<sup>3</sup> × ℝ<sup>3</sup>), acting on *H*<sup>2</sup><sub>*B*</sub> or *H*<sup>2</sup><sub>*F*</sub> in the same way as they do on *L*<sup>2</sup>(ℝ<sup>6</sup>); cf. (7.228), noting that the constraints in (7.241) and (7.242) are preserved due to the 𝔖<sub>2</sub>-invariance of *a* ∈ *C*<sup>∗</sup>(*G̃*<sub>*Q*<sub>2</sub></sub>). This recovers Dirac's description of statistics given earlier in this section.
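The passage between constrained wave-functions on the covering space and unconstrained ones on the configuration space, cf. (7.231) - (7.233), can be made concrete in a discrete toy model of the case *N* = 2. The sketch below is only an illustration under loud assumptions: ℝ<sup>3</sup> is replaced by a hypothetical four-point set, the covering space by the ordered pairs of distinct points, the configuration space by the unordered pairs, and the cross-section σ picks the increasing ordering. The names `lift`, `restrict` and `U12` are hypothetical.

```python
from itertools import combinations, permutations

sites = range(4)                            # hypothetical finite stand-in for R^3
covering = list(permutations(sites, 2))     # ordered pairs (q1, q2), q1 != q2
base = list(combinations(sites, 2))         # unordered pairs, stored with q1 < q2

U12 = {"boson": 1, "fermion": -1}           # u_B(12) and u_F(12)


def sigma(q):
    """Cross-section from the base to the covering space: increasing ordering."""
    return q


def lift(psi, statistics):
    """u^{-1} of (7.233): extend psi on the base to a constrained function."""
    out = {}
    for q1, q2 in covering:
        if q1 < q2:                          # already sigma(q), so h = e
            out[(q1, q2)] = psi[(q1, q2)]
        else:                                # swapping gives sigma(q), so h = (12)
            out[(q1, q2)] = U12[statistics] * psi[(q2, q1)]
    return out


def restrict(psi_tilde):
    """u of (7.232): psi(q) = psi_tilde(sigma(q))."""
    return {q: psi_tilde[sigma(q)] for q in base}


psi = {q: q[0] + 10 * q[1] for q in base}   # arbitrary test function on the base
for statistics, s in U12.items():
    psi_tilde = lift(psi, statistics)
    # Constraint (7.241) resp. (7.242) on the covering space.
    assert all(psi_tilde[(q2, q1)] == s * psi_tilde[(q1, q2)]
               for q1, q2 in covering)
    # Restricting the lifted function returns the original one.
    assert restrict(psi_tilde) == psi
```

The same unconstrained function on the base thus lifts to either a symmetric or an antisymmetric function upstairs, depending only on the chosen representation of 𝔖<sub>2</sub>.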

N = 3. Here we have a non-abelian homotopy group

$$
\pi\_1(Q\_3) = \mathfrak{S}\_3,\tag{7.244}
$$

which, besides the irreducible boson and fermion representations on ℂ, has an irreducible *parafermionic* representation *u*<sub>*P*</sub> on *H*<sub>*P*</sub> = ℂ<sup>2</sup>. This representation is most easily obtained explicitly by reducing the natural action of 𝔖<sub>3</sub> on ℂ<sup>3</sup>. Define an orthonormal basis of the latter by

$$e\_0 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix};\ e\_1 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix};\ e\_2 = \frac{1}{\sqrt{6}} \begin{pmatrix} -2 \\ 1 \\ 1 \end{pmatrix}. \tag{7.245}$$

It follows that ℂ · *e*<sub>0</sub> carries the trivial representation of 𝔖<sub>3</sub>, whereas the linear span of *e*<sub>1</sub> and *e*<sub>2</sub> carries a two-dimensional irreducible representation *u*<sub>*P*</sub>, given on the generators (12), (13), and (23) of 𝔖<sub>3</sub> by

$$
\mu\_P(12) = \frac{1}{2} \begin{pmatrix} 1 & -\sqrt{3} \\ -\sqrt{3} & -1 \end{pmatrix}; \mu\_P(13) = \frac{1}{2} \begin{pmatrix} 1 & \sqrt{3} \\ \sqrt{3} & -1 \end{pmatrix}; \mu\_P(23) = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{7.246}
$$
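The matrices in (7.246) can be checked numerically by restricting the coordinate permutations of $\mathbb{C}^3$ to the span of $e\_1$ and $e\_2$. The following Python sketch is not part of the original text; it assumes NumPy, and the helper names `perm_matrix` and `restrict` are hypothetical.

```python
import numpy as np

# Orthonormal basis (7.245) of C^3.
e0 = np.array([1, 1, 1]) / np.sqrt(3)
e1 = np.array([0, 1, -1]) / np.sqrt(2)
e2 = np.array([-2, 1, 1]) / np.sqrt(6)

def perm_matrix(i, j):
    """3x3 matrix of the transposition (i j) acting on coordinates (1-based)."""
    m = np.eye(3)
    m[[i - 1, j - 1]] = m[[j - 1, i - 1]]  # swap rows i and j of the identity
    return m

def restrict(m):
    """Matrix of m on span(e1, e2), with entries <e_a, m e_b>."""
    basis = np.column_stack([e1, e2])
    return basis.T @ m @ basis

s3 = np.sqrt(3)
# The three matrices as printed in (7.246).
expected = {
    (1, 2): 0.5 * np.array([[1, -s3], [-s3, -1]]),
    (1, 3): 0.5 * np.array([[1, s3], [s3, -1]]),
    (2, 3): np.array([[-1.0, 0.0], [0.0, 1.0]]),
}
u_P = {t: restrict(perm_matrix(*t)) for t in expected}

for t in expected:
    # e0 is fixed by every transposition (trivial representation) ...
    fixes_e0 = np.allclose(perm_matrix(*t) @ e0, e0)
    # ... and the restriction to span(e1, e2) reproduces (7.246).
    print(t, fixes_e0, np.allclose(u_P[t], expected[t]))
```

Each transposition squares to the identity, and $u\_P(12)u\_P(23) \neq u\_P(23)u\_P(12)$, which reflects the fact that $\mathfrak{S}\_3$ is non-abelian.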

We already gave realizations of the Hilbert space $H\_P^3$ of three parafermions in (7.203) and (7.204), where it emerged as a subspace of $L^2(\mathbb{R}^3) \otimes L^2(\mathbb{R}^3) \otimes L^2(\mathbb{R}^3) \cong L^2(\mathbb{R}^3 \times \mathbb{R}^3 \times \mathbb{R}^3)$. An equivalent realization $H^P \equiv \tilde{H}\_P^3$ may be given on the basis of (7.225), according to which $H^P$ is the subspace of $L^2(\mathbb{R}^3)^3 \otimes \mathbb{C}^2 \cong L^2(\mathbb{R}^9) \otimes \mathbb{C}^2$ that consists of doublet wave-functions $\Psi\_i$ ($i = 1,2$) that satisfy

$$
\Psi\_i(q\_{p(1)}, q\_{p(2)}, q\_{p(3)}) = \sum\_{j=1}^2 u\_{ij}(p)\Psi\_j(q\_1, q\_2, q\_3),\tag{7.247}
$$

for any permutation $p \in \mathfrak{S}\_3$, where $u \equiv u\_P$, cf. (7.246). I.e., the parafermionic wave-functions in this realization of $H\_P^3$ are constrained by the conditions

$$
\Psi\_1(q\_2, q\_1, q\_3) = \frac{1}{2} \Psi\_1(q\_1, q\_2, q\_3) - \frac{1}{2} \sqrt{3} \,\Psi\_2(q\_1, q\_2, q\_3);\tag{7.248}
$$

$$
\Psi\_2(q\_2, q\_1, q\_3) = -\frac{1}{2}\sqrt{3}\,\Psi\_1(q\_1, q\_2, q\_3) - \frac{1}{2}\Psi\_2(q\_1, q\_2, q\_3);\tag{7.249}
$$

$$
\Psi\_1(q\_3, q\_2, q\_1) = \frac{1}{2}\Psi\_1(q\_1, q\_2, q\_3) + \frac{1}{2}\sqrt{3}\,\Psi\_2(q\_1, q\_2, q\_3);\tag{7.250}
$$

$$
\Psi\_2(q\_3, q\_2, q\_1) = \frac{1}{2}\sqrt{3}\,\Psi\_1(q\_1, q\_2, q\_3) - \frac{1}{2}\Psi\_2(q\_1, q\_2, q\_3);\tag{7.251}
$$

$$
\Psi\_1(q\_1, q\_3, q\_2) = -\Psi\_1(q\_1, q\_2, q\_3);\tag{7.252}
$$

$$
\Psi\_2(q\_1, q\_3, q\_2) = \Psi\_2(q\_1, q\_2, q\_3). \tag{7.253}
$$

The algebra of observables $C^\*(\tilde{G}\_{Q\_3})$ of three indistinguishable particles without internal degrees of freedom then acts on $H^P \subset L^2(\mathbb{R}^3)^3 \otimes \mathbb{C}^2$ as in (7.234), identifying $a \in C^\*(\tilde{G}\_{Q\_3})$ with $a \otimes 1\_2$ (so that $a$ ignores the internal degree of freedom $\mathbb{C}^2$). This representation $\pi^P$ is irreducible by Theorem 7.12.

N > 3. The above construction may be generalized to any $N > 3$. There will now be many parafermionic representations $u\_\lambda$ of $\mathfrak{S}\_N$ (given by Young tableaux), each of which induces an irreducible representation of the C\*-algebra (7.238).

The question now arises whether parastatistics is to be found in Nature, or, indeed, whether this question is even well defined! As a warm-up to the case $N = 3$, where the question first plays a role, let us give an alternative realization of $\pi^F(C^\*(\tilde{G}\_{Q\_2}))$, cf. Theorem 7.12. Take two isospin doublet bosons (which by definition transform under the defining spin-$\frac{1}{2}$ representation $D\_{1/2}$ of $SU(2)$ on $\mathbb{C}^2$). With

$$H^{(2)} = (L^2(\mathbb{R}^3) \otimes \mathbb{C}^2)^{\otimes 2},\tag{7.254}$$

and using indices $a\_1, a\_2 = 1,2$, the Hilbert space of these bosons is

$$H\_B^{(2)} = \{ \Psi \in H^{(2)} \mid (\Psi\_{a\_2 a\_1}(q\_2, q\_1) = \Psi\_{a\_1 a\_2}(q\_1, q\_2)) \},\tag{7.255}$$

with corresponding projection $e\_B^{(2)} : H^{(2)} \to H\_B^{(2)}$ given by

$$e\_B^{(2)}\Psi\_{a\_1a\_2}(q\_1,q\_2) = \frac{1}{2}(\Psi\_{a\_2a\_1}(q\_2,q\_1) + \Psi\_{a\_1a\_2}(q\_1,q\_2)).\tag{7.256}$$

Subsequently, define a partial isometry $w : H^{(2)} \to L^2(\mathbb{R}^3)^{\otimes 2}$ by

$$w\Psi(q\_1, q\_2) \equiv \Psi\_0(q\_1, q\_2) = \frac{1}{\sqrt{2}}(\Psi\_{12}(q\_1, q\_2) - \Psi\_{21}(q\_1, q\_2)).\tag{7.257}$$

Physically, this singles out an isospin singlet Hilbert subspace $H^{(0)} = e\_0 H^{(2)}$ within $H^{(2)}$, where $e\_0 = w^\*w$ (which is a projection). This singlet subspace may be constrained to the bosonic sector by passing to

$$H\_B^{(0)} = e\_0 e\_B^{(2)} H^{(2)};\tag{7.258}$$

note that $e\_0$ and $e\_B^{(2)}$ commute. Now extend the defining representation of $C^\*(\tilde{G}\_{Q\_2})$ on $L^2(\mathbb{R}^3)^{\otimes 2}$ to $H^{(2)}$ by ignoring the indices $a\_1, a\_2$ (i.e., isospin is deemed unobservable). This extended representation commutes with $e\_0$ and with $e\_B^{(2)}$, and hence is well defined on $H\_B^{(0)} \subset H^{(2)}$. Let us denote this representation of $\tilde{G}\_{Q\_2}$ by $\pi\_B^{(0)}$. It is then immediate from the property $\Psi\_0(q\_2,q\_1) = -\Psi\_0(q\_1,q\_2)$ that:

Proposition 7.13. *The representations* $\pi\_B^{(0)}(C^\*(\tilde{G}\_{Q\_2}))$ *on* $H\_B^{(0)}$ *and* $\pi^F(C^\*(\tilde{G}\_{Q\_2}))$ *on* $H^F$ *are unitarily equivalent.*

In other words, two *fermions* without internal degrees of freedom are equivalent to the singlet state of two *bosons* with an isospin degree of freedom, at least if the observables are isospin-blind. Similarly, two *bosons* without internal degrees of freedom are equivalent to the singlet state of two *fermions* with isospin, and two fermions without internal degrees of freedom are equivalent to the isospin triplet state of two fermions (this corresponds to the Schur decomposition of $(\mathbb{C}^2)^{\otimes 2}$ under the commuting actions of $\mathfrak{S}\_2$ and $SU(2)$).
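The mechanism behind Proposition 7.13 can be illustrated in a finite-dimensional toy model. The sketch below is not from the original text: it replaces $L^2(\mathbb{R}^3)$ by $\mathbb{C}^d$ (an assumption made purely for illustration), stores $\Psi\_{a\_1a\_2}(q\_1,q\_2)$ as an array `psi[a1, a2, q1, q2]`, and uses hypothetical function names for the projections (7.256) and the partial isometry (7.257). It checks that $e\_0$ and $e\_B^{(2)}$ commute and that the singlet component of a bosonic vector is antisymmetric in $(q\_1, q\_2)$.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # toy dimension standing in for L^2(R^3); an assumption of this sketch

def e_B(psi):
    """Bose projection (7.256): symmetrize under (a1,q1) <-> (a2,q2)."""
    return 0.5 * (psi.transpose(1, 0, 3, 2) + psi)

def w(psi):
    """Partial isometry (7.257) onto the isospin singlet component."""
    return (psi[0, 1] - psi[1, 0]) / np.sqrt(2)

def w_adj(phi):
    """Adjoint of w: embed a spatial wave-function as an isospin singlet."""
    out = np.zeros((2, 2) + phi.shape, dtype=phi.dtype)
    out[0, 1] = phi / np.sqrt(2)
    out[1, 0] = -phi / np.sqrt(2)
    return out

def e_0(psi):
    """Singlet projection e_0 = w* w."""
    return w_adj(w(psi))

# A random vector in the toy version of H^(2).
psi = rng.normal(size=(2, 2, d, d)) + 1j * rng.normal(size=(2, 2, d, d))

# e_0 and e_B commute.
print(np.allclose(e_0(e_B(psi)), e_B(e_0(psi))))

# The singlet part of a bosonic vector is antisymmetric in (q1, q2).
psi0 = w(e_B(psi))
print(np.allclose(psi0.T, -psi0))
```

Since the isospin-blind observables act only on the spatial indices, the antisymmetric function `psi0` is exactly what a fermionic wave-function without internal degrees of freedom looks like in this toy setting.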


For $N = 3$ we may carry out a similar trick as for $N = 2$, and replace parafermions without (further) degrees of freedom by either bosons or fermions. We discuss the former and leave the explicit form of the various alternative descriptions to the reader. We proceed as for $N = 2$, *mutatis mutandis*. We have a Hilbert space

$$H^{(3)} = (L^2(\mathbb{R}^3) \otimes \mathbb{C}^2)^{\otimes 3},\tag{7.259}$$

of three distinguishable isospin doublets, containing the Hilbert space $H\_B^{(3)}$ of three bosonic isospin doublets as a subspace, that is,

$$H\_B^{(3)} = \{ \Psi \in H^{(3)} \mid \Psi\_{a\_{p(1)}a\_{p(2)}a\_{p(3)}}(q\_{p(1)}, q\_{p(2)}, q\_{p(3)}) = \Psi\_{a\_1a\_2a\_3}(q\_1, q\_2, q\_3) \,(p \in \mathfrak{S}\_3) \}. \tag{7.260}$$

The corresponding projection, denoted by $e\_B^{(3)} : H^{(3)} \to H\_B^{(3)}$, will not be written down explicitly. Define an $SU(2)$ doublet $(\Psi\_1, \Psi\_2)$ within the space $H^{(3)}$ through a partial isometry

$$w : H^{(3)} \to L^2(\mathbb{R}^3)^{\otimes 3} \otimes \mathbb{C}^2;\tag{7.261}$$

$$w\Psi\_1(q\_1, q\_2, q\_3) = \frac{1}{\sqrt{2}}(\Psi\_{121}(q\_1, q\_2, q\_3) - \Psi\_{112}(q\_1, q\_2, q\_3));\tag{7.262}$$

$$
w\Psi\_2(q\_1, q\_2, q\_3) = \frac{1}{\sqrt{6}}(-2\Psi\_{211}(q\_1, q\_2, q\_3) + \Psi\_{121}(q\_1, q\_2, q\_3) + \Psi\_{112}(q\_1, q\_2, q\_3)).\tag{7.263}
$$

Defining a projection $e\_2 = w^\*w$ on $H^{(3)}$, the Hilbert space $H^{(3)}$ contains a closed subspace $H\_B^{(2)} = e\_2 e\_B^{(3)} H^{(3)}$, which is stable under the natural representation of $C^\*(\tilde{G}\_{Q\_3})$ (since $e\_2$ and $e\_B^{(3)}$ commute). We call this representation $\pi\_B^{(2)}$. An easy calculation then establishes:

Proposition 7.14. *The representations* $\pi\_B^{(2)}(C^\*(\tilde{G}\_{Q\_3}))$ *on* $H\_B^{(2)}$ *and* $\pi^P(C^\*(\tilde{G}\_{Q\_3}))$ *on* $H^P$ *(as defined by Theorem 7.12) are unitarily equivalent.*

In other words, three *parafermions* without internal degrees of freedom are equivalent to an isospin doublet formed by three identical *bosonic* isospin doublets (corresponding to the Schur decomposition of $(\mathbb{C}^2)^{\otimes 3}$ under the commuting actions of $\mathfrak{S}\_3$ and $SU(2)$; in this decomposition, the spin-$\frac{3}{2}$ representation of $SU(2)$ couples to the bosonic representation of $\mathfrak{S}\_3$, whilst the spin-$\frac{1}{2}$ representation of $SU(2)$ couples to the parafermionic representation of $\mathfrak{S}\_3$), at least if the observables of the latter are isospin-blind. Many other realizations of parafermions in terms of fermions or bosons with an internal degree of freedom can be constructed in a similar way.
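The Schur decomposition invoked here can be made concrete by counting how often each irreducible representation of $\mathfrak{S}\_3$ occurs in $(\mathbb{C}^n)^{\otimes 3}$, using the standard character formula for multiplicities. The following sketch is an illustration only (the function names are hypothetical and the character table of $\mathfrak{S}\_3$ is hard-coded). For $n = 2$ the bosonic representation occurs 4 times (the dimension of the spin-$\frac{3}{2}$ multiplet), the parafermionic one twice (the dimension of the spin-$\frac{1}{2}$ multiplet), and the fermionic one not at all.

```python
import itertools
import numpy as np

def permutation_operator(p, n=2):
    """Matrix on (C^n)^{(x)N} permuting the N tensor factors according to p."""
    N = len(p)
    shape = (n,) * N
    op = np.zeros((n ** N, n ** N))
    for idx in itertools.product(range(n), repeat=N):
        permuted = tuple(idx[k] for k in p)
        row = np.ravel_multi_index(permuted, shape)
        col = np.ravel_multi_index(idx, shape)
        op[row, col] = 1.0
    return op

# Characters of the irreducible representations of S_3, indexed by the number
# of fixed points of p: 3 (identity), 1 (transposition), 0 (3-cycle).
CHARACTERS = {
    "boson": {3: 1, 1: 1, 0: 1},
    "fermion": {3: 1, 1: -1, 0: 1},
    "parafermion": {3: 2, 1: 0, 0: -1},
}

def multiplicities(n=2):
    """Multiplicity of each S_3 irrep in (C^n)^{(x)3}, via <chi_irrep, chi>."""
    perms = list(itertools.permutations(range(3)))
    result = {}
    for name, chi in CHARACTERS.items():
        total = 0.0
        for p in perms:
            fixed = sum(1 for k in range(3) if p[k] == k)
            total += chi[fixed] * np.trace(permutation_operator(p, n))
        result[name] = int(round(total / len(perms)))
    return result

print(multiplicities(2))  # {'boson': 4, 'fermion': 0, 'parafermion': 2}
```

The dimension count $4 \cdot 1 + 0 \cdot 1 + 2 \cdot 2 = 8$ recovers $\dim (\mathbb{C}^2)^{\otimes 3}$.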

For $N > 3$ we similarly find that the representation of the C\*-algebra (7.238) induced by some parafermionic representation $u\_\chi$ of $\mathfrak{S}\_N$ is unitarily equivalent to a representation on some $SU(n)$ multiplet of bosons with an internal degree of freedom; the appropriate multiplet is the one coupled to $u\_\chi$ in the Schur reduction of $(\mathbb{C}^n)^{\otimes N}$ with respect to the natural and commuting actions of $\mathfrak{S}\_N$ and $SU(n)$.

The moral of this story is that one cannot tell from glancing at some Hilbert space whether the world consists of fermions or bosons or parafermions; what matters is the Hilbert space *as a carrier of some (irreducible) representation of the algebra of observables*. From that perspective we already see for $N = 2$ that being bosonic or fermionic is not an invariant property of such representations, since one may freely choose between fermions/bosons *without* internal degrees of freedom and bosons/fermions *with* internal degrees of freedom. In a more systematic discussion using superselection theory one may impose some physical selection criterion in order to restrict attention to "physically interesting" sectors. Such criteria (which, for example, would have the goal of excluding parastatistics) should be formulated with reference to some algebra of observables. Such issues cannot be settled at the level of quantum mechanics and instead require quantum field theory, where parastatistics can always be removed in favour of either Bose or Fermi statistics, in a somewhat similar vein to our discussion. For (nonlocal) charges in gauge theories there are no rigorous results, but historically a similar goal played a role in the road to quantum chromodynamics (QCD), which is one of the ingredients of the Standard Model.

A different argument against parastatistics arises from the state space approach based on the compact convex set $\mathcal{D}(H^N)^{\mathfrak{S}\_N}$ studied at the beginning of this section. The extreme boundary $\partial\_e \mathcal{D}(H^N)^{\mathfrak{S}\_N}$ consists of one part that is contained in $\partial\_e \mathcal{D}(H^N) = \mathcal{P}\_1(H^N)$, and one that is not. The first part consists of those one-dimensional invariant projections $e \in \mathcal{P}\_1(H^N)^{\mathfrak{S}\_N}$ whose image $eH^N$ belongs to either the bosonic subspace $H\_+^N$ (in which case $u(p)e = e$ for each $p \in \mathfrak{S}\_N$) or the fermionic subspace $H\_-^N$ of $H^N$ (in which case $u(p)e = \mathrm{sgn}(p)e$ for each $p \in \mathfrak{S}\_N$); in other words, pure bosonic or fermionic states on $B(H^N)^{\mathfrak{S}\_N}$ are also pure on $B(H^N)$. The second part, then, consists of parastatistical pure states on $B(H^N)^{\mathfrak{S}\_N}$, which are therefore mixed on $B(H^N)$. Furthermore, pure bosonic or fermionic states on $B(H^N)^{\mathfrak{S}\_N}$ both extend and restrict to pure bosonic or fermionic states on $B(H^{N+1})^{\mathfrak{S}\_{N+1}}$ and $B(H^{N-1})^{\mathfrak{S}\_{N-1}}$, respectively, whereas parastatistical pure states turn out to have neither property and hence are "isolated" at the given value of $N$.

Finally, in $d = 2$ the equivalence between the operator and configuration space approaches breaks down, because $\mathfrak{S}\_N \neq \pi\_1(Q\_N) = B\_N$, i.e., the braid group on $N$ strings. Even defining the operator quantum theory on $H\_N = L^2(\tilde{Q}\_N)$, with algebra of observables $M\_N = B(L^2(\tilde{Q}\_N))^{B\_N}$, fails to rescue the equivalence, because the decomposition of $H\_N$ under $M\_N$ by no means contains all irreducible representations of $B\_N$. In this case deformation quantization gives many more sectors than the improved operator approach (which already gave more sectors than the approach using 'multi-valued' scalar wave-functions).


## Notes

The quotations in the preamble are from Dirac (1947), p. 87. Similarly, the *Dreimännerarbeit* (Born, Heisenberg, & Jordan, 1926) bluntly states (in Ch. 1, §1) that:

'one can see from eq. (5) [i.e., $pq - qp = -i\hbar \cdot 1\_H$, cf. our eq. (7.5)] that in the limit $\hbar = 0$, the new theory would converge to the classical theory, as is physically required.'

## §7.1. Deformation quantization

In the wake of Dirac's famous insight on the analogy between the Poisson bracket and the commutator in quantum mechanics, the idea of deformation quantization (in the form of what we now call *star products*) may be traced back to Groenewold (1946) and Moyal (1949). The mathematical (physics) literature on the subject started with Berezin (1975) and Bayen et al. (1978), who introduced what we now call *formal deformation quantization*, in which $\hbar$ is not a real number but a formal parameter occurring in formal power series. The C\*-algebraic setting for deformation quantization we use was introduced by Rieffel (1989, 1994); see also Landsman (1998a), Chapter 2, for a detailed treatment.

## §7.2. Quantization and internal symmetry

This section is based on Rieffel (1990) and Landsman (1998a), Chapter 3.


The action Poisson bracket (7.58) was introduced by Krishnaprasad & Marsden (1987); see also Marsden & Ratiu (1994).

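Systems of imprimitivity and their applications to representation theory, semidirect products, and quantum mechanics are due to Mackey (1958, 1968), who was inspired by Weyl (1927, 1928), von Neumann (1932), and Wigner (1939). As Mackey (1978, 1992) describes, he saw his work as the development of what he calls *Weyl's Program*. Weyl (1927) posed two questions in quantum mechanics:

1. How does one arrive at the matrix, i.e., the Hermitian form,<sup>1</sup> that represents a given quantity in a physical system of known constitution?<sup>2</sup>
2. Once this Hermitian form has been obtained, what is its physical meaning, i.e., which physical statements can be derived from it?<sup>3</sup>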


<sup>1</sup> Like Hilbert himself, Weyl at the time still thought of operators in terms of matrices or Hermitian forms, rather than abstractly, like von Neumann. Also cf. our Introduction.

<sup>2</sup> 'Wie komme ich zu der Matrix, der Hermiteschen Form, welche eine gegebene Größe in einem seiner Konstitution nach bekannten physikalischen System repräsentiert?' (Weyl, 1927, p. 1)

<sup>3</sup> 'Wenn einmal die Hermitesche Form gewonnen ist, was ist ihre physikalische Bedeutung, was für physikalische Aussagen kann ich ihr entnehmen?' (*ibid.*)

Weyl considered the second question to have been resolved by von Neumann's recent work, and so he concentrated on the first, which he tried to answer using group theory. The main achievement of Weyl (1927), elaborated in his subsequent book Weyl (1928), was a reformulation of the canonical commutation relations $i[p,q] = \hbar \cdot 1\_H$ in terms of projective unitary representations of the additive group $\mathbb{R}^2$ (or, equivalently, of unitary representations of the associated Heisenberg group). He also introduced the formula (7.21) in an equivalent form where the (classical) Fourier expansion of $f$, i.e.,

$$f(p,q) = \int\_{\mathbb{R}^2} da\, db \, e^{iap + ibq} \hat{f}(a, b),\tag{7.264}$$

is "quantized" by the operator in which $\exp(iap + ibq)$ in the above formula is replaced by the (projective) unitary representative $u(a,b)$ of $(a,b) \in \mathbb{R}^2$ just mentioned, i.e., the real numbers $p$ and $q$ are replaced by the corresponding operators $\hat{p}$ and $\hat{q}$, as in (7.3) - (7.4). In particular, Weyl treated $p$ and $q$ symmetrically.

In his development of Weyl's Program, Mackey broke the symmetry between $p$ and $q$, in that he saw the momentum operator $\hat{p}$ as the ("infinitesimal") generator of a unitary representation of the additive group $\mathbb{R}$, whereas the position operator $\hat{q}$ was replaced by a projection-valued measure on the real line; this is equivalent to a nondegenerate representation of the commutative C\*-algebra $C\_0(Q)$, as in our discussion in §7.3. This way of tearing $p$ and $q$ apart was the key to the general case of quantizing group actions on configuration space discussed in §7.3.

In their independent elaboration of Weyl's ideas, Groenewold (1946) and Moyal (1949) emphasized the deformation aspect of quantization (including the classical limit) rather than its group-theoretical underpinning; the former aspect is completely absent in Mackey's work. "The Big Picture" (Landsman, 1998a, Ch. 3; Landsman & Ramazan, 2001; Landsman, 2007) is an attempt to have the best of both worlds, in that the role of Lie groupoids delivers the symmetry aspect of quantization, whereas our (i.e. Rieffel's) very definition of quantization puts the deformation aspect in the front seat. The underlying theory of Lie groupoids and Lie algebroids may be found in Moerdijk & Mrčun (2003) or Mackenzie (2005); see also Landsman (1998a).

A comprehensive study of the Mackey–Glimm dichotomy may be found in Williams (2007), which contains a wealth of information on crossed product C\*-algebras and induced representations in general.

The representation theory of the Poincaré group was first studied (using somewhat heuristic methods) by Wigner (1939) using induced representations. The entire subject was subsequently taken up and finished by Mackey. For treatments in the spirit of (mathematical) physics see e.g. Simms (1968), Niederer & O'Raifeartaigh (1974), and Barut & Rączka (1977). Lemma 7.10 is proved by Bargmann (1954).

Among the known elementary particles, the case $j = 0$ (and $m > 0$) corresponds to the Higgs boson, whereas $j = \frac{1}{2}$ gives all known fermionic particles (i.e., electrons, quarks, neutrinos, and their antiparticles). If one counts the gauge bosons $W^\pm$ and $Z^0$ as massive, they provide the case $j = 1$, but in the fundamental Lagrangian they are massless and correspond to helicity $n = \pm 1$, like the photon. Helicity $\pm 2$ gives the graviton. We discard particles predicted by supersymmetry, which evidently does not exist in nature (this evidence seems lost on string theorists).

#### §7.7. Quantization and permutation symmetry

This section is based on Landsman (2016a). The literature on indistinguishable particles is enormous, initiated by Heisenberg (1926) and Dirac (1926). What we call the "quantum observables" approach goes back to Messiah & Greenberg (1964); see also Drühl, Haag, & Roberts (1970). Key papers in the configuration space approach are Souriau (1967), Laidlaw & DeWitt-Morette (1971) and Leinaas & Myrheim (1977). More generally, for the quantization of multiply connected spaces see Dowker (1972), Schulman (1981), Isham (1984), Horvathy, Morandi, & Sudarshan (1989), Morchio & Strocchi (2007), and Morandi (1992). The state space approach to indistinguishable particles was proposed by Bach (1997), who proves (7.192) - (7.193), as well as the claim following these equations to the effect that $\rho \in \partial\_e \mathcal{D}(H^N)^{\mathfrak{S}\_N}$ iff $eH$ is irreducible. The state space arguments against parastatistics given near the end of this section are also due to Bach (1997).

The representation theory used in this section may be found in many books, such as Weyl (1928), Fulton (1997), or Goodman & Wallach (2000).

The groupoid (7.209) is a special case of the so-called *gauge groupoid* defined by a principal $H$-bundle $P \xrightarrow{\pi} Q$, where $G\_1 = P \times\_H P$ (which stands for $(P \times P)/H$ with respect to the diagonal $H$-action on $P \times P$), $G\_0 = Q$, and the operations are

$$s([p,q]) = \pi(q), \; t([p,q]) = \pi(p), \; [p,q]^{-1} = [q,p], \; [p,q][q,r] = [p,r];$$

here $[p,q][q',r]$ is defined whenever $\pi(q) = \pi(q')$, but to write down the product one picks some element $q \in \pi^{-1}(q')$.

Recent philosophical literature on indistinguishable particles includes French & Krause (2006), Earman (2010), Caulton & Butterfield (2012), Saunders (2013), and Baker, Halvorson, & Swanson (2015). This philosophical literature still needs to be integrated with the mathematical approach launched in this section, and it was indeed the goal of Earman, Halvorson, & Landsman (2013ish) to do so. Alas!

## Chapter 8 Limits: large *N*

Besides the limit $\hbar \to 0$, we consider the limit $N \to \infty$, where $N$ could be the principal quantum number labeling orbits in atomic physics (as in Bohr's Correspondence Principle), or the number of particles or lattice sites, or the number of identical experiments in a long run measuring the relative frequencies of possible outcomes.

The case of large quantum numbers will be dealt with first: as our toy model of a classical orbit we take a *coadjoint orbit* in the dual $\mathfrak{g}^\*$ of the Lie algebra $\mathfrak{g}$ of a compact connected Lie group $G$, see §5.9; for $G = SU(2)$ or $SO(3)$ these are simply two-spheres $S\_r^2$. The corresponding quantum theories are indexed by their spin $j = \frac{1}{2}n$, where $n \in \mathbb{N}$, which we send to infinity in order to recover the classical orbit. This can be done more generally by rescaling the highest weight $\lambda$ of some fixed irreducible representation of $G$ to $n\lambda$ and again letting $n \to \infty$.

The second case, where the limit *N* → ∞ is typically the thermodynamic limit (namely if the density *N*/*V* is kept fixed, where *V* is the volume of the system sent to infinity, too), has been rigorously studied using operator algebras since the 1960s. In such work the system constructed *at* the limit *N* = ∞ is typically quantum statistical mechanics in infinite volume, whose existence (followed by the establishment of e.g. phase transitions) was a major achievement of mathematical physics.

However, our goal in taking the limit $N \to \infty$ is quite different, in that, in the spirit of Bohrification, our limiting system will be *classical*; from the traditional point of view we look at the macroscopic rather than the quasi-local observables. Nonetheless, for each finite value of $N \in \mathbb{N}$ our (quantum) system will be the same as in the usual theory! Like the first case, in which increasingly large matrix algebras converge to an algebra of continuous functions on some compact space, this apparent miracle is described by the theory of continuous bundles of C\*-algebras, as outlined in §C.19. As in the case $\hbar \to 0$ studied in the previous chapter, this theory provides a convenient mathematical machinery for studying the limit $N \to \infty$ also.

We then apply the limit $N \to \infty$ to $N$ repeated experiments, and, applying the doctrine of classical concepts, rederive the Born rule (avoiding the conceptual and mathematical pitfalls of various previous attempts to do so).

Bridging the gap to the next two chapters, we close with an introduction to quantum spin systems (as a later playing ground for spontaneous symmetry breaking).

#### 8.1 Large quantum numbers

As in §5.9, let $G$ be a compact connected Lie group with Lie algebra $\mathfrak{g}$ and dual $\mathfrak{g}^\*$, and let $T \subset G$ be a maximal torus with Lie algebra $\mathfrak{t}$ and dual $\mathfrak{t}^\*$. Let $\mathcal{O}\_\lambda$ be a *regular integral coadjoint orbit* in $\mathfrak{g}^\*$, labeled by a dominant weight $\lambda \in \Lambda\_d$. This means that there is a point $\theta \in \mathcal{O}\_\lambda$ whose stabilizer $G\_\theta$ is $T$, and $\lambda = \theta|\_{\mathfrak{t}}$; conversely, $\lambda \in \mathfrak{t}^\*$ determines $\theta \in \mathfrak{g}^\*$, which vanishes on each generator $E\_\alpha$ of $\mathfrak{g}\_{\mathbb{C}}$ ($\alpha \in \Delta$).

Following Theorems 5.49 and 5.51, we associate a unitary irreducible representation $u\_\lambda : G \to U(H\_\lambda)$ to $\mathcal{O}\_\lambda$ (or rather to $\lambda$), whose underlying Hilbert space $H\_\lambda$ contains a unique highest weight vector $\upsilon\_\lambda$. We then have (5.228). We abbreviate

$$d\_{\lambda} = \dim(H\_{\lambda}).\tag{8.1}$$

For $SU(2)$ we have $\lambda \in \mathbb{N}\_0/2 = \{0, \frac{1}{2}, 1, \ldots\}$, usually called $j$, and the (regular) coadjoint orbits in $\mathfrak{g}^\* \cong \mathbb{R}^3$ are the spheres $S\_j^2$ with radius $j$ (with $j \neq 0$). The corresponding highest weight representation $u\_j$ is carried by $H\_j$ with $d\_j = 2j + 1$, whose highest weight vector $\upsilon\_j$ is an eigenvector of $L\_3 = i u\_j'(S\_3)$ with eigenvalue $j$.

We are going to define a continuous bundle of C\*-algebras over the base space

$$I = (1/\mathbb{N}) \cup \{0\} \equiv 1/\dot{\mathbb{N}},\tag{8.2}$$

where $\mathbb{N} = \{1,2,\ldots\}$ and $\dot{\mathbb{N}} = \mathbb{N} \cup \{\infty\}$; as required, $I$ contains 0 as an accumulation point. One may think of elements of $I$ as "quantized" values of Planck's constant $\hbar = 1/N$, upon which the limit $N \to \infty$ is formally the same as the limit $\hbar \to 0$.

If $\lambda \in \Lambda\_d$, then $n\lambda \in \Lambda\_d$ for all $n \in \mathbb{N}$. We may therefore define the C\*-algebras

$$A\_0 = C(\mathcal{O}\_\lambda);\tag{8.3}$$

$$A\_{1/n} = B(H\_{n\lambda}).\tag{8.4}$$

For each $f \in C(\mathcal{O}\_\lambda)$ we define $f\_\lambda = \pi^\* f$ under the canonical projection $\pi : G \to G/G\_\theta \cong \mathcal{O}\_\lambda$ (i.e., $f\_\lambda(x) = f(\pi(x))$), which enables us to define the operators

$$Q\_{1/n}(f) = d\_{n\lambda} \int\_G dx \, f\_{\lambda}(x) |u\_{n\lambda}(x) \upsilon\_{n\lambda}\rangle \langle u\_{n\lambda}(x) \upsilon\_{n\lambda}| \in A\_{1/n}. \tag{8.5}$$

In fact, the entire integrand in (8.5) is a function on $\mathcal{O}\_\lambda$, because for $z \in T$ we have

$$
u\_{n\lambda}(xz)\upsilon\_{n\lambda} = u\_{n\lambda}(x)u\_{n\lambda}(z)\upsilon\_{n\lambda} = \chi\_{n\lambda}(z)u\_{n\lambda}(x)\upsilon\_{n\lambda},
$$

and $\chi\_{n\lambda}(z) \in \mathbb{T}$ cancels the factor $\overline{\chi\_{n\lambda}(z)}$ from the last term in (8.5). Note that

$$Q\_{1/n}(1\_{\mathcal{O}\_\lambda}) = 1\_{H\_{n\lambda}},\tag{8.6}$$

as follows by taking $\psi\_2 = \psi\_3 = \upsilon\_{n\lambda}$ in Schur's well-known orthogonality relations

$$d\_{n\lambda} \int\_G dx \, \langle \psi\_1, u\_{n\lambda}(x) \psi\_2 \rangle \langle u\_{n\lambda}(x) \psi\_3, \psi\_4 \rangle = \langle \psi\_1, \psi\_4 \rangle \langle \psi\_3, \psi\_2 \rangle \quad (\psi\_i \in H\_{n\lambda}). \tag{8.7}$$
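For $G = SU(2)$ the normalization (8.6) can be checked numerically. The sketch below is not from the original text and rests on several assumptions: the integral over $G$ is reduced to the normalized integral $d\Omega/4\pi$ over the sphere, the vectors $u\_j(x)\upsilon\_j$ are written in one common convention for spin coherent states in the $|j,m\rangle$ basis, and the quadrature (Gauss–Legendre in $\cos\theta$, uniform in $\phi$) is exact only for functions $f$ that are polynomials of low degree in $\cos\theta$. The function names are hypothetical.

```python
import numpy as np
from math import comb

def coherent_states(j2, theta, phi):
    """Spin coherent states for spin j = j2/2; one row per point (theta, phi).

    Component k = j + m (k = 0..2j); theta = 0 gives the highest weight vector.
    """
    k = np.arange(j2 + 1)
    amp = np.sqrt([comb(j2, int(i)) for i in k])
    c = np.cos(theta / 2)[:, None]
    s = np.sin(theta / 2)[:, None]
    return amp * c ** k * s ** (j2 - k) * np.exp(1j * (j2 - k) * phi[:, None])

def berezin_Q(j2, f):
    """Quadrature approximation of (8.5) for SU(2), with d = 2j + 1 = j2 + 1."""
    n_theta, n_phi = j2 + 4, 2 * j2 + 5
    x, wx = np.polynomial.legendre.leggauss(n_theta)   # nodes in cos(theta)
    phi = 2 * np.pi * np.arange(n_phi) / n_phi
    T, P = np.meshgrid(np.arccos(x), phi, indexing="ij")
    weights = np.repeat(wx, n_phi) / (2 * n_phi)       # weights of dOmega/(4 pi)
    v = coherent_states(j2, T.ravel(), P.ravel())
    fw = weights * f(T.ravel(), P.ravel())
    # Sum of f * |v><v| over the quadrature points, times the dimension.
    return (j2 + 1) * np.einsum("p,pa,pb->ab", fw, v, v.conj())

j2 = 3  # spin j = 3/2
Q_one = berezin_Q(j2, lambda t, p: np.ones_like(t))
print(np.allclose(Q_one, np.eye(j2 + 1)))  # checks (8.6)

Q_z = berezin_Q(j2, lambda t, p: np.cos(t))
print(np.round(Q_z.real.diagonal(), 6))    # diagonal entries m / (j + 1)
```

For $f = \cos\theta$ the result is diagonal with entries $m/(j+1)$, i.e., $Q(f)$ is a rescaled $L\_3$ in this convention.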

Other properties of the maps $Q\_{1/n} : C(\mathcal{O}\_\lambda) \to B(H\_{n\lambda})$ (between C\*-algebras) are:


$$Q\_{1/n}(L\_y f) = u\_{n\lambda}(y) Q\_{1/n}(f) u\_{n\lambda}(y)^{\*}.\tag{8.8}$$

Positivity does not follow from self-adjointness, as $Q\_{1/n}$ is not a homomorphism.

Theorem 8.1. *There exists a continuous bundle of C\*-algebras* $A$ *over* $I$ *as defined in* (8.2)*, with fibers* (8.3) *-* (8.4)*, whose continuous sections are given by all sequences* $(a\_{1/n})\_{n \in \dot{\mathbb{N}}} \in \prod\_{n \in \dot{\mathbb{N}}} A\_{1/n}$ *for which* $a\_0 \in C(\mathcal{O}\_\lambda)$ *and* $a\_{1/n} \in B(H\_{n\lambda})$*, and the sequence* $(a\_{1/n})\_{n \in \mathbb{N}}$ *is asymptotically equivalent to* $(Q\_{1/n}(a\_0))\_{n \in \mathbb{N}}$*, in the sense that*

$$\lim\_{n \to \infty} \|a\_{1/n} - \mathcal{Q}\_{1/n}(a\_0)\| = 0. \tag{8.9}$$

In particular, if $f \in C(\mathcal{O}\_\lambda)$, then the cross-section of $\prod\_{n \in \dot{\mathbb{N}}} A\_{1/n}$ defined by

$$a\_0 = f;\tag{8.10}$$

$$a\_{1/n} = \mathcal{Q}\_{1/n}(f),\tag{8.11}$$

is continuous. In fact, we have a deformation quantization of $\mathcal{O}\_\lambda$ in the sense of Definition 7.1, where the Poisson structure of $\mathcal{O}\_\lambda$ is inherited from (minus) the canonical one on the Poisson manifold $\mathfrak{g}^\*$, but we shall merely prove the claim of the theorem.
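Continuity of the section (8.10) - (8.11) requires in particular that $\|Q\_{1/n}(f)\|$ tends to $\|f\|\_\infty$ as $n \to \infty$. For $G = SU(2)$ and a zonal function $f$ (one that depends only on $x = \cos\theta$) this can be observed numerically. The sketch below is an illustration under stated assumptions, not the text's own construction: it uses the fact that for zonal $f$ the operator $Q(f)$ is diagonal in the $|j,m\rangle$ basis in the usual coherent-state convention, and it evaluates the diagonal entries by Gauss–Legendre quadrature, which is exact for polynomial $f$ of low degree. The function name `zonal_Q_diagonal` is hypothetical.

```python
import numpy as np
from math import comb

def zonal_Q_diagonal(j2, f, n_nodes=120):
    """Diagonal of Q(f) for spin j = j2/2 and zonal f = f(cos(theta)).

    Entry k = j + m equals (2j+1)/2 * int_{-1}^{1} f(x) C(2j,k) t^k (1-t)^(2j-k) dx
    with t = (1 + x)/2.
    """
    x, wx = np.polynomial.legendre.leggauss(n_nodes)
    t = (1 + x) / 2
    k = np.arange(j2 + 1)[:, None]
    binom = np.array([float(comb(j2, i)) for i in range(j2 + 1)])[:, None]
    density = binom * t ** k * (1 - t) ** (j2 - k)
    return (j2 + 1) / 2 * (density * f(x)) @ wx

# f(x) = x has sup-norm 1 on the sphere; the operator norms approach it from below.
for j2 in (1, 2, 10, 100):
    norm = np.max(np.abs(zonal_Q_diagonal(j2, lambda x: x)))
    print(j2, norm)
```

In this example the norms equal $j/(j+1)$, so they increase monotonically to $\|f\|\_\infty = 1$.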

*Proof.* This will follow from Proposition C.124, in whose notation $\tilde{A}$ (which will actually coincide with $A$) consists of all $\tilde{a} = (\tilde{a}\_\hbar)\_{\hbar \in I}$ where $f$ runs through $C(\mathcal{O}\_\lambda)$ in

$$
\tilde{a}\_0 = f;\tag{8.12}
$$

$$
\tilde{a}\_{1/n} = \mathcal{Q}\_{1/n}(f). \tag{8.13}
$$

To verify the conditions for Proposition C.124 we start with the property that the set $\{\tilde{a}\_\hbar \mid \tilde{a} \in \tilde{A}\}$ be dense in $A\_\hbar$; we will show that it even coincides with $A\_\hbar$. At $\hbar = 0$ this is true by construction. At $\hbar = 1/n$, the required property

$$Q\_{1/n}(C(\mathcal{O}\_\lambda)) = B(H\_{n\lambda})\tag{8.14}$$

can be proved in two steps. For simplicity we set $n = 1$; the proof is the same for any $n \in \mathbb{N}$. The first step is to define a function $L\_a$ on $G$ for each $a \in B(H\_\lambda)$ by

$$L\_a(x) = \text{Tr}(a|u\_\lambda(x)\upsilon\_\lambda\rangle \langle u\_\lambda(x)\upsilon\_\lambda|) = \langle \upsilon\_\lambda, u\_\lambda(x)^\* a u\_\lambda(x)\upsilon\_\lambda \rangle. \tag{8.15}$$

This function is continuous and is right-invariant under $T$, so that $L\_a$ is really an element of $C(\mathcal{O}\_\lambda)$. Thus we have a map $L : B(H\_\lambda) \to C(\mathcal{O}\_\lambda)$, $a \mapsto L\_a$. Furthermore,

$$
\langle a, Q\_1(f) \rangle\_{HS} = \langle L\_a, f \rangle\_2,\tag{8.16}
$$

where the Hilbert–Schmidt inner product on the left-hand side is $\langle a, b \rangle\_{HS} = \text{Tr}(a^\*b)$, cf. (B.495), which is well defined since $H\_\lambda$ is finite-dimensional, and the right-hand side is the inner product on $L^2(\mathcal{O}\_\lambda)$ with respect to the measure induced by the subspace of $L^2(G, d\_\lambda \cdot dx)$ consisting of $T$-invariant functions. Now $Q\_{1/n}(C(\mathcal{O}\_\lambda))$ is a (necessarily closed) linear subspace of $B(H\_\lambda)$, which coincides with $B(H\_\lambda)$ iff its orthogonal complement in the Hilbert–Schmidt inner product is zero.

Hence (8.14) is equivalent to the implication: $a \in (Q\_{1/n}(C(\mathcal{O}\_\lambda)))^\perp \Rightarrow a = 0$. By (8.16), the antecedent holds iff $\langle L\_a, f \rangle\_2 = 0$ for each $f \in C(\mathcal{O}\_\lambda)$, which, because $C(\mathcal{O}\_\lambda)$ is dense in $L^2(\mathcal{O}\_\lambda)$, holds iff $L\_a = 0$. Hence the above implication is equivalent to: $L\_a = 0 \Rightarrow a = 0$, i.e., $\ker L = \{0\}$. We must therefore prove the latter.

If $L\_a(x) = 0$ for all $x \in G$, then, taking $x = \exp(t\_1A\_1)\cdots\exp(t\_nA\_n)$, where each $A\_i \in \mathfrak{g}$, and applying (5.156) for each $t\_i$ to the right-hand side of (8.15), we obtain

$$\langle \upsilon\_\lambda, [u\_\lambda'(A\_n), \ldots, [u\_\lambda'(A\_2), [u\_\lambda'(A\_1), a]] \cdots ] \upsilon\_\lambda \rangle = 0. \tag{8.17}$$

This equality extends to $\mathfrak{g}\_{\mathbb{C}}$, so we may take $A\_i = E\_{\alpha\_i}$ for some positive root $\alpha\_i \in \Delta^+$. Since $u\_\lambda'(E\_\alpha)\upsilon\_\lambda = 0$ for $\alpha \in \Delta^+$, of each commutator $[u\_\lambda'(E\_{\alpha\_i}), a]$ only the term $u\_\lambda'(E\_{\alpha\_i})a$ contributes. Moving the $u\_\lambda'(E\_{\alpha\_i})$ to act as $u\_\lambda'(E\_{\alpha\_i})^\* = u\_\lambda'(E\_{-\alpha\_i})$ on the vector on the left in the inner product in (8.17) gives all other eigenvectors of $\mathfrak{t}$, so that (8.17) implies $\langle \psi, a\upsilon\_\lambda \rangle = 0$ for each $\psi \in H\_\lambda$, and hence $a\upsilon\_\lambda = 0$. Now it is clear from (8.15) that $L\_{u\_\lambda(y)^\* a u\_\lambda(y)}(x) = L\_a(yx)$, so if $L\_a(x) = 0$ for all $x \in G$, then also $L\_{u\_\lambda(y)^\* a u\_\lambda(y)}(x) = 0$ for all $x \in G$. Hence we may replace $a$ by $u\_\lambda(y)^\* a u\_\lambda(y)$ in the above argument, finding $u\_\lambda(y)^\* a u\_\lambda(y)\upsilon\_\lambda = 0$ and hence $a u\_\lambda(y)\upsilon\_\lambda = 0$ for each $y \in G$. Since $u\_\lambda$ is irreducible, this implies $a\psi = 0$ for any $\psi \in H\_\lambda$, and hence $a = 0$.

This completes the proof of (8.14). Proposition C.124 furthermore requires

$$\lim\_{n \to \infty} \|Q\_{1/n}(f)\| = \|f\|\_{\infty}.\tag{8.18}$$

This follows from the following key property (to be proved at the end):

$$\lim\_{n \to \infty} \langle u\_{n\lambda}(\mathbf{y}) \mathfrak{v}\_{n\lambda}, \mathcal{Q}\_{1/n}(f) u\_{n\lambda}(\mathbf{y}) \mathfrak{v}\_{n\lambda} \rangle = f\_{\lambda}(\mathbf{y}), \tag{8.19}$$

for any *y* ∈ *G* and *f* ∈ *C*(O<sup>λ</sup> ). Indeed, for any *y* ∈ *G* we obviously have

$$\|Q\_{1/n}(f)\| \ge |\langle u\_{n\lambda}(\mathbf{y})\mathfrak{v}\_{n\lambda}, Q\_{1/n}(f)u\_{n\lambda}(\mathbf{y})\mathfrak{v}\_{n\lambda} \rangle|. \tag{8.20}$$

Since *G* and hence 𝒪<sub>λ</sub> is compact, by Weierstrass's Theorem there is a *y* ∈ *G* such that |*f*<sub>λ</sub>(*y*)| = ‖*f*‖<sub>∞</sub>. Using this *y* in (8.19) and (8.20), we obtain

$$\lim\inf\_{n\to\infty} \|Q\_{1/n}(f)\| \ge \|f\|\_{\infty}.\tag{8.21}$$

Conversely, for any unit vector ψ ∈ *Hn*<sup>λ</sup> , eqs. (8.5) and (8.7) imply

$$|\langle \psi, Q\_{1/n}(f)\psi \rangle| \le \|f\|\_{\infty}.\tag{8.22}$$

If *f* is real-valued, then *Q*1/*n*(*f*)<sup>∗</sup> = *Q*1/*n*(*f* <sup>∗</sup>) = *Q*1/*n*(*f*). In that case, (8.22) implies

$$\|Q\_{1/n}(f)\| \leq \|f\|\_{\infty}.\tag{8.23}$$

By the C\*-identity ‖*a*<sup>∗</sup>*a*‖ = ‖*a*‖<sup>2</sup>, this is true for any *f* ∈ *C*(𝒪<sub>λ</sub>). Therefore,

$$\limsup\_{n \to \infty} \|\mathcal{Q}\_{1/n}(f)\| \le \|f\|\_{\infty}.\tag{8.24}$$

Eqs. (8.21) and (8.24) yield (8.18). It remains to prove (8.19), i.e.,

$$\lim\_{n \to \infty} d\_{n\lambda} \int\_G dx \, f\_{\lambda}(\mathbf{x}) |\langle u\_{n\lambda}(\mathbf{y}) \boldsymbol{\upsilon}\_{n\lambda}, u\_{n\lambda}(\mathbf{x}) \boldsymbol{\upsilon}\_{n\lambda} \rangle|^2 = f\_{\lambda}(\mathbf{y}).\tag{8.25}$$

The key to the proof is the fact that if λ and μ are dominant weights, with associated highest weight representations *u*<sub>λ</sub> and *u*<sub>μ</sub>, respectively, for any *x* ∈ *G* one has

$$
\langle \mathfrak{v}\_{\lambda}, u\_{\lambda}(\mathbf{x}) \mathfrak{v}\_{\lambda} \rangle \cdot \langle \mathfrak{v}\_{\mu}, u\_{\mu}(\mathbf{x}) \mathfrak{v}\_{\mu} \rangle = \langle \mathfrak{v}\_{\lambda + \mu}, u\_{\lambda + \mu}(\mathbf{x}) \mathfrak{v}\_{\lambda + \mu} \rangle. \tag{8.26}
$$

Namely, because the exponential map is surjective for compact connected Lie groups, eq. (8.26) is equivalent to the property

$$
\langle \mathfrak{v}\_{\lambda}, u'\_{\lambda}(A)\mathfrak{v}\_{\lambda} \rangle + \langle \mathfrak{v}\_{\mu}, u'\_{\mu}(A)\mathfrak{v}\_{\mu} \rangle = \langle \mathfrak{v}\_{\lambda + \mu}, u'\_{\lambda + \mu}(A)\mathfrak{v}\_{\lambda + \mu} \rangle,\tag{8.27}
$$

for any *A* ∈ 𝔤. For *A* ∈ 𝔱 this amounts to λ + μ = λ + μ, cf. (5.228), whereas for *A* = *E*<sub>α</sub> for some root α ∈ Δ we have 0 = 0, so that (8.27) is true for all *A* ∈ 𝔤. This also proves (8.26), of which we need the special (and iterated) case

$$
\langle \mathfrak{v}\_{n\lambda}, u\_{n\lambda}(\mathbf{x}) \mathfrak{v}\_{n\lambda} \rangle = \langle \mathfrak{v}\_{\lambda}, u\_{\lambda}(\mathbf{x}) \mathfrak{v}\_{\lambda} \rangle^{n}. \tag{8.28}
$$

This motivates us to introduce a sequence (μ*n*) of probability measures on *G* by

$$d\mu\_n(\mathbf{x}) = d\_{n\lambda} \cdot d\mathbf{x} \left| \langle \mathfrak{v}\_{\lambda}, u\_{\lambda}(\mathbf{x}) \mathfrak{v}\_{\lambda} \rangle \right|^{2n}, \tag{8.29}$$

so that, after a change *x* → *yx* of the integration variable, eq. (8.25) reads

$$\lim\_{n \to \infty} \int\_G d\mu\_n(\mathbf{x}) f\_\lambda(\mathbf{y}\mathbf{x}) = f\_\lambda(\mathbf{y}), \tag{8.30}$$

for any *f* ∈ *C*(𝒪<sub>λ</sub>); the factor *d*<sub>*n*λ</sub> is already contained in *d*μ<sub>*n*</sub>, cf. (8.29). Now *F*(*x*) = |⟨υ<sub>λ</sub>, *u*<sub>λ</sub>(*x*)υ<sub>λ</sub>⟩| takes values in [0,1] and hence the measure (8.29) is *d*μ<sub>*n*</sub>(*x*) ∼ exp(−*nS*(*x*)) for *S*(*x*) = −ln(*F*(*x*)) ∈ [0,∞], with *S*(*x*) = 0 iff *x* ∈ *G*<sub>θ<sub>λ</sub></sub> = *T* (using regularity of the orbit). In that case, i.e., if *z* ∈ *T*, then *f*<sub>λ</sub>(*yz*) = *f*(π(*yz*)) = *f*(π(*y*)) = *f*<sub>λ</sub>(*y*). The method of steepest descent shows that any part of *G* (of positive Haar measure) where *S*(*x*) > 0 makes no contribution as *n* → ∞, so that we may replace *f*<sub>λ</sub>(*yx*) in (8.30) by *f*<sub>λ</sub>(*y*), obtaining

$$\lim\_{n \to \infty} \int\_G d\mu\_n(\mathbf{x}) \, f\_\lambda(\mathbf{y}\mathbf{x}) = f\_\lambda(\mathbf{y}) \lim\_{n \to \infty} \int\_G d\mu\_n(\mathbf{x}) = f\_\lambda(\mathbf{y}) \lim\_{n \to \infty} 1 = f\_\lambda(\mathbf{y}). \tag{8.31}$$

We have now verified conditions 1 and 2 in Proposition C.124, and no. 3 is trivially satisfied since in condition 1 we have equality with *A*<sub>ħ</sub>, as shown above. □
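As a numerical sanity check of (8.28) and of the concentration of the measures (8.29), the following sketch treats only the case *G* = *SU*(2) with λ the spin-½ weight, so that *u*<sub>*n*λ</sub> is the spin-*n*/2 representation with *d*<sub>*n*λ</sub> = *n* + 1. It assumes the standard identification of 𝒪<sub>λ</sub> with the two-sphere and the standard formula |⟨υ<sub>λ</sub>, *u*<sub>λ</sub>(*x*)υ<sub>λ</sub>⟩|<sup>2</sup> = cos<sup>2</sup>(β/2), with β the polar angle. These identifications and all function names are illustrative assumptions, not part of the proof above.

```python
import numpy as np

def spin_matrices(s):
    """Spin-s matrices (Jx, Jy, Jz) in the basis m = s, s-1, ..., -s."""
    dim = int(round(2 * s)) + 1
    m = s - np.arange(dim)
    jz = np.diag(m).astype(complex)
    jplus = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):
        # J_+ |s, m> = sqrt(s(s+1) - m(m+1)) |s, m+1>
        jplus[k - 1, k] = np.sqrt(s * (s + 1) - m[k] * (m[k] + 1))
    jx = (jplus + jplus.conj().T) / 2
    jy = (jplus - jplus.conj().T) / (2 * 1j)
    return jx, jy, jz

def highest_weight_amplitude(s, axis, angle):
    """<v, u(x) v> for u(x) = exp(-i * angle * axis.J) in the spin-s irrep."""
    jx, jy, jz = spin_matrices(s)
    h = axis[0] * jx + axis[1] * jy + axis[2] * jz   # Hermitian generator
    w, v = np.linalg.eigh(h)
    u = v @ np.diag(np.exp(-1j * angle * w)) @ v.conj().T
    return u[0, 0]   # index 0 is the highest weight vector (m = s)

def coherent_average(g, n, num=200001):
    """Integral of g against mu_n for a function g of the polar angle only.

    Assumes y = e and the reduction of (8.29) to the sphere:
    d mu_n = (n + 1)/2 * sin(beta) * cos(beta/2)^(2n) d beta.
    """
    beta = np.linspace(0.0, np.pi, num)
    vals = 0.5 * (n + 1) * np.sin(beta) * np.cos(beta / 2) ** (2 * n) * g(beta)
    step = beta[1] - beta[0]
    return step * (vals.sum() - 0.5 * (vals[0] + vals[-1]))   # trapezoid rule

axis, angle = np.array([0.3, -0.5, 0.8]), 1.1
base = highest_weight_amplitude(0.5, axis, angle)
for n in (2, 3, 5):
    # (8.28): the spin-n/2 amplitude equals the n-th power of the spin-1/2 one
    print(n, abs(highest_weight_amplitude(n / 2, axis, angle) - base ** n))

for n in (1, 10, 100, 1000):
    # the averages approach g(0) = cos(0) = 1 as n grows, cf. (8.31)
    print(n, coherent_average(np.cos, n))
```

For *g* = cos the integral can be done by hand and equals *n*/(*n* + 2), which makes the rate of convergence in this special case explicit.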

#### 8.2 Large systems

We now move from large quantum numbers within a single system to large quantum systems that consist of *N* identical sites, where we eventually study what happens as *N* → ∞ (as is customary in quantum statistical mechanics we change notation from *n* ∈ ℕ to *N* ∈ ℕ). This limit gives rise to two different continuous bundles *A*<sup>(*q*)</sup> and *A*<sup>(*c*)</sup> of C\*-algebras over *I* as given by (8.2), which have exactly the same fibers at 1/*N* but, amazingly, differ dramatically at *N* = ∞, i.e., 1/*N* = 0. This difference reflects two choices one may make for the *N*-particle observables that have a limit as *N* → ∞, namely *local* ones, giving rise to a highly *non-commutative* limit algebra *A*<sup>(*q*)</sup><sub>0</sub> (which is the one usually studied in quantum statistical mechanics of infinite systems), and *macroscopic* ones, which generate a *commutative* algebra *A*<sup>(*c*)</sup><sub>0</sub> of observables of an infinite quantum system (describing classical thermodynamics as a limit of quantum statistical mechanics). It is the latter that we need for Bohrification.

Let *B* be a fixed *unital* C\*-algebra, describing a single quantum system. The case of a two-level system, where *B* = *M*<sub>2</sub>(ℂ), is already fascinating, and many other interesting examples are described by finite-dimensional C\*-algebras. Though irrelevant in finite dimension, we note that the constructions below are generally valid if (for technical reasons to be found in Proposition C.97) we use the *projective* tensor product ⊗̂<sub>max</sub> between C\*-algebras; see §C.13. For any *N* ∈ ℕ we put

$$A\_{1/N}^{(c)} = A\_{1/N}^{(q)} = \mathcal{B}^N,\tag{8.32}$$

i.e., the *N*-fold (projective) tensor product ⊗̂<sup>*N*</sup><sub>max</sub>*B* of *B* with itself. Furthermore,

$$A\_0^{(c)} = \mathcal{C}(\mathcal{S}(B));\tag{8.33}$$

$$A\_0^{(q)} = B^{\infty},\tag{8.34}$$

where *S*(*B*) is the state space of *B*, seen as a compact convex set in the weak<sup>∗</sup> topology, as usual, and *B*<sup>∞</sup> is the infinite (projective) tensor product of *B* with itself as described in §C.14; see especially (C.318) with *C<sub>i</sub>* = *B* for each *i*. For example, the state space of *B* = *M*<sub>2</sub>(ℂ) is affinely homeomorphic to the unit ball in ℝ<sup>3</sup>, whose boundary is the familiar Bloch sphere of qubits; see Proposition 2.9.

We now explain how (8.32) and (8.33)–(8.34) give rise to continuous bundles *A*<sup>(*c*)</sup> and *A*<sup>(*q*)</sup> of C\*-algebras, starting with the former. First, for each *N* ∈ ℕ, let 𝔖<sub>*N*</sub> be the permutation group (i.e. symmetric group) on *N* objects, acting on *B<sup>N</sup>* in the obvious way, i.e., by linear and continuous extension of

$$\alpha\_p^{(N)}(b\_1 \otimes \cdots \otimes b\_N) = b\_{p(1)} \otimes \cdots \otimes b\_{p(N)},\tag{8.35}$$

where *b<sub>i</sub>* ∈ *B*. This yields a *symmetrization operator* *S<sub>N</sub>* : *B<sup>N</sup>* → *B<sup>N</sup>* defined by

$$S\_N = \frac{1}{N!} \sum\_{p \in \mathfrak{S}\_N} \alpha\_p^{(N)}.\tag{8.36}$$

If *B* is infinite-dimensional, these maps can be extended by continuity to the completion *B*<sup>∞</sup> = ⊗̂<sup>∞</sup><sub>max</sub>*B* of the *algebraic* tensor product ⊗<sup>∞</sup>*B*; indeed, passing to any faithful representation of *B* it is easy to see that *S<sub>N</sub>* is even continuous with respect to the minimal cross-norm (cf. §C.13). For *N* ≥ *M* we then define

$$S\_{M,N}: \mathcal{B}^M \to \mathcal{B}^N \tag{8.37}$$

by linear (and if necessary continuous) extension of

$$S\_{M,N}(a\_{1/M}) = S\_N(a\_{1/M} \otimes 1\_B \otimes \cdots \otimes 1\_B) \ (a\_{1/M} \in \mathcal{B}^M),\tag{8.38}$$

with *N* − *M* copies of the unit 1<sub>*B*</sub> ∈ *B* so as to obtain an element of *B<sup>N</sup>*. Clearly, *S*<sub>*N*,*N*</sub> = *S<sub>N</sub>*. In particular, *S*<sub>1,*N*</sub> : *B* → *B<sup>N</sup>* gives the average of *b* over *N* copies of *B*:

$$S\_{1,N}(b) = \frac{1}{N} \sum\_{k=1}^{N} 1\_B \otimes \cdots \otimes b\_{(k)} \otimes 1\_B \otimes \cdots \otimes 1\_B. \tag{8.39}$$
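For matrix algebras the maps (8.35)–(8.38) can be written out explicitly. The sketch below assumes *B* = *M<sub>d</sub>*(ℂ), realizes *B<sup>N</sup>* as matrices on (ℂ<sup>*d*</sup>)<sup>⊗*N*</sup> via Kronecker products, and checks numerically that *S*<sub>1,*N*</sub>(*b*) defined through the symmetrizer agrees with the site average (8.39). The function names are hypothetical and chosen only for this illustration.

```python
import itertools
import numpy as np

def kron_all(factors):
    """Kronecker product b_1 x ... x b_N of a list of square matrices."""
    out = np.eye(1, dtype=complex)
    for f in factors:
        out = np.kron(out, f)
    return out

def permute_sites(op, perm, d):
    """alpha_p of (8.35); perm is the 0-based tuple (p(1)-1, ..., p(N)-1)."""
    n = len(perm)
    tensor = op.reshape((d,) * (2 * n))          # row indices, then column indices
    axes = list(perm) + [n + k for k in perm]    # move factor p(k) into slot k
    return np.transpose(tensor, axes).reshape(d ** n, d ** n)

def symmetrize(op, n, d):
    """S_N of (8.36): the average of alpha_p over all N! permutations."""
    perms = list(itertools.permutations(range(n)))
    return sum(permute_sites(op, p, d) for p in perms) / len(perms)

def s_mn(a, m, n, d):
    """S_{M,N} of (8.38): pad a with N - M units, then symmetrize."""
    padded = kron_all([a] + [np.eye(d)] * (n - m))
    return symmetrize(padded, n, d)

def site_average(b, n):
    """Right-hand side of (8.39): (1/N) sum_k 1 x ... x b_(k) x ... x 1."""
    d = b.shape[0]
    total = np.zeros((d ** n, d ** n), dtype=complex)
    for k in range(n):
        factors = [np.eye(d)] * n
        factors[k] = b
        total += kron_all(factors)
    return total / n

rng = np.random.default_rng(0)
b = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
print(np.allclose(s_mn(b, 1, 3, 2), site_average(b, 3)))
```

Summing over all *N*! permutations is only feasible for small *N*; the site average on the right-hand side of (8.39) needs just *N* terms.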

For example, take *B* = *M<sub>n</sub>*(ℂ) for simplicity, and pick some *a* = *a*<sup>∗</sup> ∈ *B* and λ ∈ σ(*a*), with associated spectral projection *e*<sub>λ</sub>. Putting *b* = *e*<sub>λ</sub> in (8.39) gives

$$f\_N^{(\lambda)} = \mathcal{S}\_{1,N}(e\_{\lambda}).\tag{8.40}$$

This is a *frequency operator*: applied to states of the kind υ<sub>1</sub> ⊗ ··· ⊗ υ<sub>*N*</sub> ∈ (ℂ<sup>*n*</sup>)<sup>*N*</sup>, where each υ<sub>*i*</sub> is an eigenstate of *a*, so that *a*υ<sub>*i*</sub> = λ<sub>*i*</sub>υ<sub>*i*</sub> for some λ<sub>*i*</sub> ∈ σ(*a*), the corresponding operator counts the relative frequency of λ in the list (λ<sub>1</sub>,...,λ<sub>*N*</sub>). The commutative case *B* = *C*(*X*) provides a classical analogue. Eq. (C.271) gives

$$B^N = \mathcal{C}(X)^N \cong \mathcal{C}(X^N),\tag{8.41}$$

so that, identifying elements of *B<sup>N</sup>* with functions on *X<sup>N</sup>*, for *f* ∈ *C*(*X*) we have

$$S\_{1,N}(f)(\mathbf{x}\_1, \dots, \mathbf{x}\_N) = \frac{1}{N} \sum\_{k=1}^N f(\mathbf{x}\_k) = \frac{1}{N} (f(\mathbf{x}\_1) + \dots + f(\mathbf{x}\_N)).\tag{8.42}$$
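To see the frequency operator (8.40) at work, here is a minimal sketch for *B* = *M*<sub>2</sub>(ℂ) and *a* = σ<sub>3</sub>, with λ = +1. The particular list of outcomes and the helper names are assumptions made for illustration. The product state is an eigenvector of the frequency operator, and the eigenvalue is the relative frequency of λ in the list.

```python
import numpy as np

def site_average(b, n):
    """S_{1,N}(b) of (8.39) for a d x d matrix b, as a matrix on (C^d)^{(x) N}."""
    d = b.shape[0]
    total = np.zeros((d ** n, d ** n), dtype=complex)
    for k in range(n):
        term = np.eye(1, dtype=complex)
        for site in range(n):
            term = np.kron(term, b if site == k else np.eye(d))
        total += term
    return total / n

def product_vector(vectors):
    """The vector v_1 x ... x v_N."""
    out = np.array([1.0 + 0j])
    for v in vectors:
        out = np.kron(out, v)
    return out

up = np.array([1, 0], dtype=complex)      # sigma_3 eigenvalue +1
down = np.array([0, 1], dtype=complex)    # sigma_3 eigenvalue -1
e_plus = np.outer(up, up.conj())          # spectral projection e_lambda, lambda = +1

psi = product_vector([up, down, up, up])  # outcomes (+1, -1, +1, +1)
f4 = site_average(e_plus, 4)              # frequency operator (8.40) with N = 4

relative_frequency = np.vdot(psi, f4 @ psi).real
print(relative_frequency)                           # 3 of the 4 entries equal +1
print(np.allclose(f4 @ psi, relative_frequency * psi))
```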

We return to the construction of a continuous bundle of C\*-algebras with fibers (8.32) and (8.33). As in §8.1, we construct this bundle by specifying a preliminary family of continuous cross-sections and then using Proposition C.124 to finish.

Definition 8.2. *We say that a sequence (a<sub>1/N</sub>)<sub>N∈ℕ</sub>, with a<sub>1/N</sub> ∈ B<sup>N</sup>, is* symmetric *when there exist M ∈ ℕ and a<sub>1/M</sub> ∈ B<sup>M</sup> such that for each N ≥ M one has*

$$a\_{1/N} = S\_{M,N}(a\_{1/M}).\tag{8.43}$$

This implies *a*<sub>1/*M*</sub> = *S<sub>M</sub>*(*a*<sub>1/*M*</sub>). Symmetric sequences can start in any finite way they like, but their infinite tails consist of averaged observables. Hence *symmetric sequences asymptotically commute*: if (*a*<sub>1/*N*</sub>) and (*b*<sub>1/*N*</sub>) are symmetric, then

$$\lim\_{N \to \infty} \|a\_{1/N}b\_{1/N} - b\_{1/N}a\_{1/N}\|\_{\mathcal{B}^N} = 0,\tag{8.44}$$

simply because the commutators of single-site operators are nonvanishing only at finitely many positions, upon which the factor 1/*N* in (8.39) guarantees (8.44).

For example, if *B* = *M*<sub>2</sub>(ℂ), and (σ<sub>*i*</sub>) are the Pauli matrices, we have

$$[S\_{1,N}(\tfrac{1}{2}\hbar\sigma\_1), S\_{1,N}(\tfrac{1}{2}\hbar\sigma\_2)] = i\frac{\hbar}{N}S\_{1,N}(\tfrac{1}{2}\hbar\sigma\_3),\tag{8.45}$$

*et cetera*, showing that the averaged spin-½ operators effectively rescale ħ to ħ/*N*.
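The commutation relation (8.45) and the decay (8.44) can be checked numerically. The sketch below sets ħ = 1 (an assumption made only to fix units) and uses hypothetical helper names. By (8.45) the operator norm of the commutator is ħ<sup>2</sup>/(2*N*), since ‖*S*<sub>1,*N*</sub>(½ħσ<sub>3</sub>)‖ = ħ/2.

```python
import numpy as np

HBAR = 1.0  # units chosen for the illustration only

SIGMA1 = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
SIGMA3 = np.array([[1, 0], [0, -1]], dtype=complex)

def site_average(b, n):
    """S_{1,N}(b) of (8.39) as a matrix on (C^2)^{(x) N}."""
    d = b.shape[0]
    total = np.zeros((d ** n, d ** n), dtype=complex)
    for k in range(n):
        term = np.eye(1, dtype=complex)
        for site in range(n):
            term = np.kron(term, b if site == k else np.eye(d))
        total += term
    return total / n

def averaged_spin_commutator(n):
    """Left- and right-hand sides of (8.45) for N = n."""
    s1 = site_average(0.5 * HBAR * SIGMA1, n)
    s2 = site_average(0.5 * HBAR * SIGMA2, n)
    s3 = site_average(0.5 * HBAR * SIGMA3, n)
    lhs = s1 @ s2 - s2 @ s1
    rhs = 1j * (HBAR / n) * s3
    return lhs, rhs

for n in range(1, 6):
    lhs, rhs = averaged_spin_commutator(n)
    # second column: (8.45) holds; third column: operator norm of the commutator
    print(n, np.allclose(lhs, rhs), np.linalg.norm(lhs, 2))
```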

In view of this, it is reasonable to expect that we may be able to assemble the algebras *B<sup>N</sup>* into a continuous bundle whose limit algebra at *N* = ∞ is commutative.

For each symmetric sequence (*a*<sub>1/*N*</sub>) we define a function *a*<sub>0</sub> : *S*(*B*) → ℂ by

$$a\_0(\omega) = \lim\_{N \to \infty} \omega^N(a\_{1/N}),\tag{8.46}$$

where ω ∈ *S*(*B*), and ω<sup>*N*</sup> ∈ *S*(*B<sup>N</sup>*) is defined by linear (and continuous) extension of

$$\omega^N(b\_1 \otimes \cdots \otimes b\_N) = \omega(b\_1) \cdots \omega(b\_N);\tag{8.47}$$

continuity of ω<sup>*N*</sup> on the algebraic tensor product ⊗<sup>*N*</sup>*B* (and hence extendibility to *A*<sub>1/*N*</sub>) is guaranteed by Proposition C.98, although this is not really needed here because *a*<sub>0</sub> only requires the values of ω<sup>*N*</sup> on ⊗<sup>*N*</sup>*B* itself. In any case, the limit exists by definition of a symmetric sequence, from which we also see that *a*<sub>0</sub> ∈ *C*(*S*(*B*)), because it is a finite sum of finite products of the type ω(*b*<sub>1</sub>)···ω(*b<sub>M</sub>*), each of which is continuous in ω by definition of the *w*<sup>∗</sup>-topology on *S*(*B*).

For example, the frequency operators (8.40) define a symmetric sequence (*f*<sup>(λ)</sup><sub>*N*</sub>)<sub>*N*∈ℕ</sub>, whose limit function *f*<sup>(λ)</sup><sub>0</sub> : *S*(*B*) → ℂ in the sense of (C.560) or (8.46) is

$$f\_0^{(\lambda)}(\omega) = \omega(e\_{\lambda}).\tag{8.48}$$

Thus (8.46) gives the Born probability for the outcome *a* = λ in the state ω; see §8.4. Classically, identifying elements of *S*(*C*(*X*)) with probability measures μ on *X*, the limit of the sequence *a*<sub>1/*N*</sub> = *S*<sub>1,*N*</sub>(*f*) for fixed *f* ∈ *C*(*X*), cf. (8.42), is

$$a\_0(\mu) = \int\_X d\mu f.\tag{8.49}$$

This convergence is an example of the strong law of large numbers, see §8.3.
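A small numerical illustration of the limit (8.46) for *B* = *M*<sub>2</sub>(ℂ): in a product state ω<sup>*N*</sup> the averaged observable *S*<sub>1,*N*</sub>(*b*) has mean ω(*b*) for every *N*, while its variance equals (ω(*b*<sup>2</sup>) − ω(*b*)<sup>2</sup>)/*N* and hence vanishes as *N* → ∞. This only exhibits mean-square concentration, not the strong law itself; the density matrix, the observable, and the helper names are arbitrary choices made for this sketch.

```python
import numpy as np

def site_average(b, n):
    """S_{1,N}(b) of (8.39) as a matrix on (C^2)^{(x) N}."""
    d = b.shape[0]
    total = np.zeros((d ** n, d ** n), dtype=complex)
    for k in range(n):
        term = np.eye(1, dtype=complex)
        for site in range(n):
            term = np.kron(term, b if site == k else np.eye(d))
        total += term
    return total / n

def product_state(rho, n):
    """Density matrix of omega^N, cf. (8.47), for omega(b) = Tr(rho b)."""
    out = np.eye(1, dtype=complex)
    for _ in range(n):
        out = np.kron(out, rho)
    return out

def mean_and_variance(rho, b, n):
    """Mean and variance of S_{1,N}(b) in the product state omega^N."""
    rho_n = product_state(rho, n)
    avg = site_average(b, n)
    mean = np.trace(rho_n @ avg).real
    second_moment = np.trace(rho_n @ avg @ avg).real
    return mean, second_moment - mean ** 2

# an arbitrary state (positive, trace one) and a self-adjoint observable
rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
b = np.array([[0.5, 1.0], [1.0, -0.5]], dtype=complex)

single_mean = np.trace(rho @ b).real
single_var = np.trace(rho @ b @ b).real - single_mean ** 2
for n in range(1, 7):
    mean, var = mean_and_variance(rho, b, n)
    print(n, mean - single_mean, var - single_var / n)
```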

We return to the general case.

Definition 8.3. *A sequence (a<sub>1/N</sub>)<sub>N∈ℕ</sub> as above is* quasi-symmetric *if for each N ∈ ℕ one has a<sub>1/N</sub> = S<sub>N</sub>(a<sub>1/N</sub>) and for any ε > 0 there is a symmetric sequence (ã<sub>1/N</sub>) and some M ∈ ℕ such that ‖a<sub>1/N</sub> − ã<sub>1/N</sub>‖ < ε for all N > M.*

For example, if lim<sub>*N*→∞</sub> ‖*a*<sub>1/*N*</sub> − *ã*<sub>1/*N*</sub>‖ = 0 for some fixed symmetric sequence (*ã*<sub>1/*N*</sub>), then (*a*<sub>1/*N*</sub>)<sub>*N*∈ℕ</sub> is obviously quasi-symmetric.

Theorem 8.4. *For any unital C\*-algebra B, the C\*-algebras* (8.32) *and* (8.33)*, i.e.,*

$$A\_0^{(c)} = \mathcal{C}(\mathcal{S}(B));\tag{8.50}$$

$$A\_{1/N}^{(c)} = \mathcal{B}^N,\tag{8.51}$$

*where B<sup>N</sup> is the N-fold projective tensor power ⊗̂<sup>N</sup><sub>max</sub>B, are the fibers of a continuous bundle A<sup>(c)</sup> of C\*-algebras over I* = (1/ℕ) ∪ {0} ≡ 1/ℕ̇ *whose continuous cross-sections are the quasi-symmetric sequences (a<sub>1/N</sub>) with limit a<sub>0</sub> given by* (8.46)*.*

As in Theorem 8.1, also here we have a deformation quantization of *S*(*B*) in the sense of Definition 7.1, where the Poisson bracket on *S*(*B*) may be defined by specifying its value on linear functions *b̂* ∈ *C*(*S*(*B*)), where *b* ∈ *B* and *b̂*(ω) = ω(*b*), by

$$\{\hat{a},\hat{b}\} = \widehat{i[a,b]}.\tag{8.52}$$

Unfortunately, this involves the theory of infinite-dimensional Poisson manifolds, which we prefer to omit. Thus we shall only prove Theorem 8.4 as stated.

The proof relies on Størmer's *quantum De Finetti Theorem* 8.6 below.

Definition 8.5. *Let B be a unital C\*-algebra. A state* ρ *on BN is called:*


*The set of all permutation-invariant states / K-exchangeable states / infinitely exchangeable states on B<sup>N</sup> is denoted by S<sup>𝔖<sub>N</sub></sup>(B<sup>N</sup>) / S<sup>𝔖<sub>N</sub></sup><sub>K</sub>(B<sup>N</sup>) / S<sup>𝔖<sub>N</sub></sup><sub>∞</sub>(B<sup>N</sup>).*

Theorem 8.6. *Let B be a unital C\*-algebra. For any N ∈ ℕ the correspondence ω<sup>N</sup> ↔ ω, where ω ∈ S(B) and ω<sup>N</sup> ∈ S(B<sup>N</sup>), cf.* (8.47)*, gives a bijection*

$$
\partial\_e S^{\mathfrak{S}\_N}\_{\infty}(B^N) \cong S(B). \tag{8.53}
$$

This theorem was originally stated (in the language of infinite tensor products) as Theorem 8.9 in §8.3, where it (and hence Theorem 8.6) will also be proved.

We also need a formula for the norm of any self-adjoint element *a* of any C\*-algebra *A* in terms of the state space *S*(*A*) and the pure state space *P*(*A*), viz.

$$\|a\| = \sup\{ |\omega(a)| : \omega \in S(A) \} = \sup\{ |\omega(a)| : \omega \in P(A) \}.\tag{8.54}$$

This follows from Proposition C.15, the spectral radius formula (B.254), and compactness of σ(*a*), implying that the supremum in (B.254) is reached on σ(*a*).

*Proof.* The proof of Theorem 8.4 is quite similar to the proof of Theorem 8.1, in that we once again rely on Proposition C.124, where the symmetric sequences are going to play the role of *Ã*. To apply Proposition C.124, we should prove that:


$$\lim\_{N \to \infty} ||\tilde{a}\_{1/N}|| = ||\tilde{a}\_0||\_{\infty}.\tag{8.55}$$

To prove the first claim, we first note that the set of limit functions *ã*<sub>0</sub> is the linear span of all finite products ω(*b*<sub>1</sub>)···ω(*b<sub>N</sub>*), where *N* ∈ ℕ and *b*<sub>1</sub>,...,*b<sub>N</sub>* ∈ *B*. Since the complex conjugate of ω(*b*) is ω(*b*<sup>∗</sup>), this is obviously a ∗-algebra. The monomials *b̂*(ω) = ω(*b*) already separate points of *S*(*B*) ⊂ *B*<sup>∗</sup>, since if ω ≠ ω′ then clearly there is some *b* ∈ *B* for which (ω − ω′)(*b*) ≠ 0. Hence claim no. 1 follows from the Stone–Weierstrass Theorem B.51.

For the second, let (*ã*<sub>1/*N*</sub>) be a symmetric sequence. Since there are *M* ∈ ℕ and *ã*<sub>1/*M*</sub> ∈ *B<sup>M</sup>* with *ã*<sub>1/*M*</sub> = *S<sub>M</sub>*(*ã*<sub>1/*M*</sub>) and *ã*<sub>1/(*M*+*K*)</sub> = *S*<sub>*M*,*M*+*K*</sub>(*ã*<sub>1/*M*</sub>) for all *K* ∈ ℕ,

$$\begin{aligned} \|\tilde{a}\_{1/M}\| &= \sup\{ |\rho(\tilde{a}\_{1/M})| : \rho \in S(\mathcal{B}^M) \} = \sup\{ |\rho(\tilde{a}\_{1/M})| : \rho \in S^{\mathfrak{S}\_M}(\mathcal{B}^M) \}; \\ \|\tilde{a}\_{1/(M+K)}\| &= \sup\{ |\rho(S\_{M,M+K}(\tilde{a}\_{1/M}))| : \rho \in S^{\mathfrak{S}\_{M+K}}(\mathcal{B}^{M+K}) \} \\ &= \sup\{ |\rho(\tilde{a}\_{1/M})| : \rho \in S^{\mathfrak{S}\_M}\_K(\mathcal{B}^M) \}, \end{aligned}$$

where we used (8.54) and (8.43). Theorem 8.6 and (8.46) then yield (8.55):

$$\begin{split} \lim\_{N \to \infty} \|\tilde{a}\_{1/N}\| &= \lim\_{K \to \infty} \|\tilde{a}\_{1/(M+K)}\| \\ &= \sup \{ |\rho(\tilde{a}\_{1/M})| : \rho \in S^{\mathfrak{S}\_M}\_{\infty}(\mathcal{B}^M) \} \\ &= \sup \{ |\rho(\tilde{a}\_{1/M})| : \rho \in \partial\_{e} S^{\mathfrak{S}\_M}\_{\infty}(\mathcal{B}^M) \} = \sup \{ |\omega^M(\tilde{a}\_{1/M})| : \omega \in S(B) \} \\ &= \sup \{ |\lim\_{N \to \infty} \omega^N(\tilde{a}\_{1/N})| : \omega \in S(B) \} = \sup \{ |\tilde{a}\_0(\omega)| : \omega \in S(B) \} \\ &= \|\tilde{a}\_0\|\_{\infty}. \end{split}$$

The proof that the sequences (*a*<sub>1/*N*</sub>) for which condition (C.552) in Proposition C.124 holds are precisely the quasi-symmetric sequences is the same as the proof of the equivalence of the two conditions in Lemma C.125, taking ħ<sub>0</sub> = 0.

Finally, it is easy to show that the limit (8.46) exists also for quasi-symmetric observables *a*: take ε > 0 and find *ã* and *M* as in Definition 8.3. For this *ã*, let *M*<sub>0</sub> be such that (8.43) holds (with *M* replaced by *M*<sub>0</sub>). For all *N*, *N*′ greater than both *M* and *M*<sub>0</sub>,

$$\begin{split} |\omega^{N}(a\_{1/N}) - \omega^{N'}(a\_{1/N'})| &\leq |\omega^{N}(a\_{1/N} - \tilde{a}\_{1/N}) - \omega^{N'}(a\_{1/N'} - \tilde{a}\_{1/N'})| \\ &+ |\omega^{N}(\tilde{a}\_{1/N}) - \omega^{N'}(\tilde{a}\_{1/N'})| \\ &\leq \|a\_{1/N} - \tilde{a}\_{1/N}\| + \|a\_{1/N'} - \tilde{a}\_{1/N'}\| + 0 \\ &< 2\varepsilon, \end{split} \tag{8.56}$$

since ‖ω<sup>*N*</sup>‖ = 1. Hence (ω<sup>*N*</sup>(*a*<sub>1/*N*</sub>)) is a Cauchy sequence (in ℂ). □
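The norm convergence (8.55) can be observed in a small example, which also shows that the norms need not be constant in *N*. The sketch assumes *B* = *M*<sub>2</sub>(ℂ), *M* = 2, and the symmetric two-site element *ã*<sub>1/2</sub> = σ<sub>1</sub> ⊗ σ<sub>1</sub> + σ<sub>2</sub> ⊗ σ<sub>2</sub> + σ<sub>3</sub> ⊗ σ<sub>3</sub>; this choice and the helper names are made purely for illustration. Here *ã*<sub>0</sub>(ω) = ω(σ<sub>1</sub>)<sup>2</sup> + ω(σ<sub>2</sub>)<sup>2</sup> + ω(σ<sub>3</sub>)<sup>2</sup> is the squared length of the Bloch vector of ω, so ‖*ã*<sub>0</sub>‖<sub>∞</sub> = 1, whereas ‖*ã*<sub>1/2</sub>‖ = 3 (attained on the singlet state).

```python
import numpy as np

SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def two_site_term(s, k, l, n):
    """The matrix s at sites k and l (k != l), the unit elsewhere, on (C^2)^{(x) n}."""
    out = np.eye(1, dtype=complex)
    for site in range(n):
        out = np.kron(out, s if site in (k, l) else np.eye(2))
    return out

def s_2n(n):
    """S_{2,N} applied to sum_i sigma_i x sigma_i.

    Symmetrizing over all permutations places the two non-trivial factors
    on every ordered pair of distinct sites with equal weight 1/(N(N-1)).
    """
    total = np.zeros((2 ** n, 2 ** n), dtype=complex)
    for k in range(n):
        for l in range(n):
            if k != l:
                for s in SIGMA:
                    total += two_site_term(s, k, l, n)
    return total / (n * (n - 1))

norms = {n: np.linalg.norm(s_2n(n), 2) for n in range(2, 7)}
for n, value in norms.items():
    print(n, value)
```

In this example the norm drops from 3 at *N* = 2 to the limiting value 1 already at *N* = 3, reflecting that the singlet state on two sites is permutation-invariant but not exchangeable.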

Our second continuous bundle of C\*-algebras of interest is described by the following changes in Definitions 8.2 and 8.3.

Definition 8.7. *Let B be a unital C\*-algebra and let a<sub>1/N</sub> ∈ B<sup>N</sup> for each N ∈ ℕ.*

• *A sequence (a<sub>1/N</sub>)<sub>N∈ℕ</sub> is called* local *when there exist M ∈ ℕ and a<sub>1/M</sub> ∈ B<sup>M</sup> such that for each N ≥ M one has*

$$a\_{1/N} = a\_{1/M} \otimes 1\_B \otimes \cdots \otimes 1\_B,\tag{8.57}$$

*with N − M copies of the unit 1<sub>B</sub> ∈ B (so that indeed a<sub>1/N</sub> ∈ B<sup>N</sup>).*

• *A sequence (a<sub>1/N</sub>)<sub>N∈ℕ</sub> is* quasi-local *if for any ε > 0 there is a local sequence (ã<sub>1/N</sub>) and some M ∈ ℕ such that ‖a<sub>1/N</sub> − ã<sub>1/N</sub>‖ < ε for all N > M.*

For the right analogue of Theorem 8.4 we recall the description of the infinite tensor product *B*<sup>∞</sup>; cf. §C.14, especially the explanation preceding (C.315). Accordingly, a dense subspace of *B*<sup>∞</sup> is given by equivalence classes of local sequences (*a*<sub>1/*N*</sub>)<sub>*N*∈ℕ</sub> under the equivalence relation *a* ∼ *a*′ iff lim<sub>*N*→∞</sub> ‖*a*<sub>1/*N*</sub> − *a*′<sub>1/*N*</sub>‖ = 0; the C\*-algebraic operations in *B*<sup>∞</sup> are inherited from the *B<sup>N</sup>*, and if we denote the equivalence class of (*a*<sub>1/*N*</sub>)<sub>*N*</sub> by [*a*<sub>1/*N*</sub>]<sub>*N*</sub>, the norm in *B*<sup>∞</sup> is given by

$$\|[a\_{1/N}]\_N\| = \lim\_{N \to \infty} \|a\_{1/N}\|\,. \tag{8.58}$$

By construction, this number is independent of the representative (*a*<sub>1/*N*</sub>)<sub>*N*</sub> in the class [*a*<sub>1/*N*</sub>]<sub>*N*</sub>. By definition, *B*<sup>∞</sup> is the completion of the space of these equivalence classes in the norm (8.58). As explained after (C.315), for each *M* ∈ ℕ we have an injective (and hence isometric) homomorphism φ<sub>*M*</sub> : *B<sup>M</sup>* → *B*<sup>∞</sup> that maps *a*<sub>1/*M*</sub> ∈ *B<sup>M</sup>* to the equivalence class [*a*<sub>1/*N*</sub>]<sub>*N*</sub> of the sequence (*a*<sub>1/*N*</sub>)<sub>*N*</sub> defined by

$$a\_{1/N} = 0, \ (N < M);\tag{8.59}$$

$$a\_{1/N} = a\_{1/M}, \ (N = M);\tag{8.60}$$

$$a\_{1/(M+K)} = a\_{1/M} \otimes 1\_B \otimes \cdots \otimes 1\_B,\ (K > 0),\tag{8.61}$$

with *K* copies of 1<sub>*B*</sub>. It is easy to verify that one might as well have started from quasi-local sequences and their equivalence classes, for which the limit (8.58) exists by an argument similar to (8.56). In that case the ensuing C\*-algebra is already complete, which leads to a direct description of the elements of *B*<sup>∞</sup> as equivalence classes of quasi-local sequences. This fact also follows from the following analogue of Theorem 8.4, which may be proved in the same way, i.e., from Proposition C.124, where this time the elements of *Ã* are local sequences rather than symmetric ones (in fact, the proof is much easier, since this time we obtain (C.552) for free):

Theorem 8.8. *For any unital C\*-algebra B, the C\*-algebras* (8.32) *and* (8.34)*, i.e.,*

$$A\_0^{(q)} = B^{\infty};\tag{8.62}$$

$$A\_{1/N}^{(q)} = \mathcal{B}^N,\tag{8.63}$$

*are the fibers of a continuous bundle A<sup>(q)</sup> of C\*-algebras over I* = 1/ℕ̇ *whose continuous cross-sections are the quasi-local sequences (a<sub>1/N</sub>) with limit a<sub>0</sub> = [a<sub>1/N</sub>]<sub>N</sub>.*
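The contrast between local and symmetric sequences can be made concrete for *B* = *M*<sub>2</sub>(ℂ). In the sketch below (helper names are hypothetical), the local sequences generated by σ<sub>1</sub> and σ<sub>2</sub> in the sense of (8.57) have a commutator of norm 2 for every *N*, so the limit algebra cannot be commutative, whereas the commutator of the corresponding averaged sequences *S*<sub>1,*N*</sub>(σ<sub>1</sub>) and *S*<sub>1,*N*</sub>(σ<sub>2</sub>) has norm 2/*N*.

```python
import numpy as np

SIGMA1 = np.array([[0, 1], [1, 0]], dtype=complex)
SIGMA2 = np.array([[0, -1j], [1j, 0]], dtype=complex)

def local_embedding(a, n):
    """a x 1 x ... x 1 with n - 1 units, cf. (8.57) with M = 1."""
    return np.kron(a, np.eye(2 ** (n - 1)))

def site_average(b, n):
    """S_{1,N}(b) of (8.39) as a matrix on (C^2)^{(x) N}."""
    total = np.zeros((2 ** n, 2 ** n), dtype=complex)
    for k in range(n):
        term = np.eye(1, dtype=complex)
        for site in range(n):
            term = np.kron(term, b if site == k else np.eye(2))
        total += term
    return total / n

def commutator_norm(x, y):
    """Operator norm of [x, y]."""
    return np.linalg.norm(x @ y - y @ x, 2)

def commutator_norms(n):
    """Commutator norms for the local and the averaged sequences at N = n."""
    local = commutator_norm(local_embedding(SIGMA1, n), local_embedding(SIGMA2, n))
    averaged = commutator_norm(site_average(SIGMA1, n), site_average(SIGMA2, n))
    return local, averaged

for n in range(1, 6):
    print(n, commutator_norms(n))
```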

#### 8.3 Quantum de Finetti Theorem

As an initial step in exploring the connection between the bundles *A*<sup>(*c*)</sup> and *A*<sup>(*q*)</sup> we prove Theorem 8.6, which we first restate in an equivalent form. Let 𝔖<sub>∞</sub> be the group of bijections of ℕ that differ from the identity only on a finite set. Each such finite permutation *p* ∈ 𝔖<sub>∞</sub> defines a map α<sub>*p*</sub> : *B*<sup>∞</sup> → *B*<sup>∞</sup>, as follows. Let *S* ⊂ ℕ be the finite subset of ℕ on which *p* acts nontrivially (if *S* = ∅ we have *p* = id<sub>ℕ</sub>, in which case also α<sub>*p*</sub> = id<sub>*B*<sup>∞</sup></sub>, see below). Take a local sequence (*a*<sub>1/*N*</sub>)<sub>*N*</sub>, so that (8.57) holds, in which we may assume *M* ≥ max *S*; we also redefine *a*<sub>1/*N*</sub> = 0 for each *N* < *M*. For each *N* ≥ *M* ≥ max *S*, the map *p* may be regarded as an element *p<sub>N</sub>* of 𝔖<sub>*N*</sub> by restriction to {1,...,*N*} ⊂ ℕ and hence *p* acts on *B<sup>N</sup>* by permuting the entries in elementary tensor products of operators, cf. (8.35). For each *p* ∈ 𝔖<sub>∞</sub>, define a map

$$
\alpha\_p: B^{\infty} \to B^{\infty};\tag{8.64}
$$

$$\alpha\_p([a\_{1/N}]\_N) = [\alpha\_p^{(N)}(a\_{1/N})]\_N. \tag{8.65}$$

This uses a specific representative of the equivalence class [*a*<sub>1/*N*</sub>]<sub>*N*</sub> ∈ *B*<sup>∞</sup>, but nonetheless the map α<sub>*p*</sub> is well defined. Furthermore, since each α<sup>(*N*)</sup><sub>*p*</sub> : *B<sup>N</sup>* → *B<sup>N</sup>* is an automorphism (i.e., an invertible homomorphism), it is an isometry, so that also α<sub>*p*</sub> is an isometry on its domain and hence extends to an automorphism of *B*<sup>∞</sup>. The ensuing map *p* ↦ α<sub>*p*</sub> from 𝔖<sub>∞</sub> to the group Aut(*B*<sup>∞</sup>) of all automorphisms of *B*<sup>∞</sup> is a homomorphism of groups, and we say that 𝔖<sub>∞</sub> is an *automorphism group* of *B*<sup>∞</sup>.

Writing *S*<sup>𝔖<sub>∞</sub></sup>(*B*<sup>∞</sup>) for the set of all 𝔖<sub>∞</sub>-invariant states on *B*<sup>∞</sup>, i.e., ρ ∈ *S*<sup>𝔖<sub>∞</sub></sup>(*B*<sup>∞</sup>) iff ρ ∘ α<sub>*p*</sub> = ρ for each *p* ∈ 𝔖<sub>∞</sub>, we may now rephrase Theorem 8.6 as follows:

Theorem 8.9. *Let B be a unital C\*-algebra. There is a bijection*

$$
\partial\_{e} S^{\mathfrak{S}\_{\infty}} (B^{\infty}) \cong S(B), \tag{8.66}
$$

*given by ω<sup>∞</sup> ↔ ω, where ω ∈ S(B), and ω<sup>∞</sup> ∈ S(B<sup>∞</sup>) is defined by, cf.* (8.47)*,*

$$\omega^{\infty}([a\_{1/N}]\_N) = \lim\_{N \to \infty} \omega^N(a\_{1/N}).\tag{8.67}$$

This is essentially the same as Theorem 8.6: for any *M* ∈ ℕ, a state on *B<sup>M</sup>* is infinitely exchangeable iff it is the restriction of an element of *S*<sup>𝔖<sub>∞</sub></sup>(*B*<sup>∞</sup>) to *B<sup>M</sup>* ⊂ *B*<sup>∞</sup>, where the inclusion is given by the map φ<sub>*M*</sub> defined below (8.58).

*Proof.* Let *S*(*B*) ⊂ *S*<sup>𝔖<sub>∞</sub></sup>(*B*<sup>∞</sup>) under the map ω ↦ ω<sup>∞</sup>. We first show the inclusion

$$
\partial\_e S^{\mathfrak{S}\_{\infty}}(B^{\infty}) \subseteq S(B) \tag{8.68}
$$

contrapositively, i.e., if ρ ∈ *S*<sup>𝔖<sub>∞</sub></sup>(*B*<sup>∞</sup>) does not lie in *S*(*B*), then ρ has a nontrivial convex decomposition in *S*<sup>𝔖<sub>∞</sub></sup>(*B*<sup>∞</sup>). We identify *B<sup>N</sup>* with φ<sub>*N*</sub>(*B<sup>N</sup>*) ⊂ *B*<sup>∞</sup> and denote the restriction of ρ to *B<sup>N</sup>* by ρ<sub>*N*</sub>. If ρ = ω<sup>∞</sup> for some ω ∈ *S*(*B*), then

$$
\rho\_{M+K}(a'\_{1/M}\otimes a'\_{1/K}) = \rho\_M(a'\_{1/M})\rho\_K(a'\_{1/K}),\tag{8.69}
$$

for each *a*′<sub>1/*M*</sub> ∈ *B<sup>M</sup>* and *a*′<sub>1/*K*</sub> ∈ *B<sup>K</sup>*. If (8.69) holds whenever 0 ≤ *a*′<sub>1/*M*</sub> ≤ 1<sub>*B<sup>M</sup>*</sub>, then by Lemma C.53 and (C.8) it always holds. Adding suitable multiples of the unit and rescaling, it follows that if (8.69) holds whenever

$$\tfrac{1}{3} \cdot 1\_{B^M} \le a\_{1/M}' \le \tfrac{2}{3} \cdot 1\_{B^M},\tag{8.70}$$

then it always holds. Therefore, if (8.69) fails, then it fails for some *a*′<sub>1/*M*</sub> satisfying (8.70) and some *a*′<sub>1/*K*</sub>, in which case 1/3 ≤ ρ<sub>*M*</sub>(*a*′<sub>1/*M*</sub>) ≤ 2/3. However, such a failure implies the existence of a nontrivial convex decomposition

$$
\rho = t\rho' + (1 - t)\rho'',\tag{8.71}
$$

with *t* = ρ<sub>*M*</sub>(*a*′<sub>1/*M*</sub>), where the functionals ρ′ and ρ″ on *B*<sup>∞</sup> are defined by

$$\rho'([a\_{1/N}]\_N) = \lim\_{N \to \infty} \rho\_{M+N}(a\_{1/M}' \otimes a\_{1/N}) / \rho\_M(a\_{1/M}');\tag{8.72}$$

$$\rho''([a\_{1/N}]\_N) = \lim\_{N \to \infty} \rho\_{M+N}((1\_{B^M} - a'\_{1/M}) \otimes a\_{1/N}) / \rho\_M(1\_{B^M} - a'\_{1/M}). \tag{8.73}$$

These limits exist on symmetric sequences (where they stabilize), and hence they exist in general. Furthermore, since ρ<sub>*M*</sub>(1<sub>*B<sup>M</sup>*</sub> − *a*′<sub>1/*M*</sub>) = 1 − *t*, the property (8.71) is obvious. Both ρ′ and ρ″ belong to *S*<sup>𝔖<sub>∞</sub></sup>(*B*<sup>∞</sup>), since each functional ρ<sub>*M*+*N*</sub> is an element of *S*<sup>𝔖<sub>*M*+*N*</sub></sup>(*B*<sup>*M*+*N*</sup>). Finally, (8.71) is nontrivial, since if ρ′ = ρ″, then ρ′<sub>*K*</sub> = ρ″<sub>*K*</sub>, and hence (8.69) would hold (whose violation we assumed). This proves (8.68).
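The factorization property (8.69) is easy to test numerically on the simplest non-extreme example, a mixture of two product states. The sketch below assumes *B* = *M*<sub>2</sub>(ℂ), *M* = *K* = 1, and two arbitrarily chosen diagonal density matrices; the names are hypothetical. For ρ = *t* ω<sub>1</sub><sup>∞</sup> + (1 − *t*) ω<sub>2</sub><sup>∞</sup> and a projection *e* one finds ρ<sub>2</sub>(*e* ⊗ *e*) − ρ<sub>1</sub>(*e*)<sup>2</sup> = *t*(1 − *t*)(ω<sub>1</sub>(*e*) − ω<sub>2</sub>(*e*))<sup>2</sup>, which vanishes only if ω<sub>1</sub>(*e*) = ω<sub>2</sub>(*e*).

```python
import numpy as np

def factorization_gap(rho_a, rho_b, t, e):
    """rho_2(e x e) - rho_1(e)^2 for rho = t * omega_a^infty + (1 - t) * omega_b^infty."""
    one_site = t * rho_a + (1 - t) * rho_b
    two_site = t * np.kron(rho_a, rho_a) + (1 - t) * np.kron(rho_b, rho_b)
    lhs = np.trace(two_site @ np.kron(e, e)).real
    rhs = np.trace(one_site @ e).real ** 2
    return lhs - rhs

# two single-site states (density matrices) and a projection, chosen arbitrarily
rho_a = np.diag([0.9, 0.1]).astype(complex)
rho_b = np.diag([0.2, 0.8]).astype(complex)
e = np.diag([1.0, 0.0]).astype(complex)
t = 0.4

p = np.trace(rho_a @ e).real
q = np.trace(rho_b @ e).real
gap = factorization_gap(rho_a, rho_b, t, e)
print(gap, t * (1 - t) * (p - q) ** 2)       # the two numbers agree
print(factorization_gap(rho_a, rho_a, t, e))  # a product state: no gap
```

The gap is nonnegative, in line with the convexity argument based on the Cauchy–Schwarz inequality, and it is strictly positive here because the two product states assign different probabilities to *e*.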

Though it is always true, for simplicity we prove the converse inclusion

$$S(B) \subseteq \partial\_{e} S^{\mathfrak{S}\_{\infty}}(B^{\infty}) \tag{8.74}$$

just for the case where *B* is generated by projections, as in the case *B* = *Mn*(C), *B* = *B*(*H*), or *B* a von Neumann algebra, or more generally an AW\*-algebra (see §C.24). In that case also each *BN* is generated by its projections.

For each <sup>ρ</sup> <sup>∈</sup> *<sup>S</sup>*S<sup>∞</sup> (*B*∞), each *<sup>N</sup>* <sup>∈</sup> <sup>N</sup>, and each projection *<sup>e</sup>* <sup>∈</sup> *BN*, we have

$$\left(\mathfrak{p}\_{\mathsf{N}}(e)\right)^{2} \leq \mathfrak{p}\_{2\mathsf{N}}(e \odot e),\tag{8.75}$$

see below. Assuming (8.75), suppose <sup>ω</sup> <sup>∈</sup> *<sup>S</sup>*(*B*) and <sup>ω</sup><sup>∞</sup> <sup>=</sup> *<sup>t</sup>*<sup>ρ</sup> + (1−*t*)<sup>ρ</sup> for some *t* ∈ (0,1) and ρ ,<sup>ρ</sup> <sup>∈</sup> *<sup>S</sup>*S<sup>∞</sup> (*B*∞). Since <sup>ω</sup><sup>∞</sup> *<sup>N</sup>* = ω*N*, we then have

$$\begin{split} \omega^{N}(e)^{2} &= \left(t\rho\_{N}^{\prime}(e) + (1-t)\rho\_{N}^{\prime\prime}(e)\right)^{2} = \left\langle \begin{pmatrix} \sqrt{t} \\ \sqrt{1-t} \end{pmatrix}, \begin{pmatrix} \rho\_{N}^{\prime}(e)\sqrt{t} \\ \rho\_{N}^{\prime\prime}(e)\sqrt{1-t} \end{pmatrix} \right\rangle^{2} \\ &\leq \left\langle \begin{pmatrix} \sqrt{t} \\ \sqrt{1-t} \end{pmatrix}, \begin{pmatrix} \sqrt{t} \\ \sqrt{1-t} \end{pmatrix} \right\rangle \cdot \left\langle \begin{pmatrix} \rho\_{N}^{\prime}(e)\sqrt{t} \\ \rho\_{N}^{\prime\prime}(e)\sqrt{1-t} \end{pmatrix}, \begin{pmatrix} \rho\_{N}^{\prime}(e)\sqrt{t} \\ \rho\_{N}^{\prime\prime}(e)\sqrt{1-t} \end{pmatrix} \right\rangle \\ &= t\rho\_{N}^{\prime}(e)^{2} + (1-t)\rho\_{N}^{\prime\prime}(e)^{2} \\ &\leq t\rho\_{2N}^{\prime}(e \otimes e) + (1-t)\rho\_{2N}^{\prime\prime}(e \otimes e) \\ &= \omega^{2N}(e \otimes e) = \omega^{N}(e)^{2}, \end{split}$$

where the inner product in the first line is the usual one in ℝ<sup>2</sup>, and, noting it is positive, we have used the Cauchy–Schwarz inequality for this inner product, as well as (8.75). Hence both inequalities must be equalities, and for the first one this implies ρ′<sub>*N*</sub>(*e*) = ρ″<sub>*N*</sub>(*e*). Since this is true for all *N* and all projections in *B<sup>N</sup>*, this implies ρ′ = ρ″ = ω<sup>∞</sup>, so that ω<sup>∞</sup> ∈ ∂<sub>*e*</sub>*S*<sup>𝔖<sub>∞</sub></sup>(*B*<sup>∞</sup>), and (8.74) has been established, up to the proof of (8.75). To this effect, note that for each *M* ∈ ℕ and *t* ∈ ℝ we have

$$\rho\_{MN}((1\_{B^{N}} \otimes \cdots \otimes 1\_{B^{N}} \otimes e + \cdots + e \otimes 1\_{B^{N}} \otimes \cdots \otimes 1\_{B^{N}} + t \cdot 1\_{B^{MN}})^2) \tag{8.76}$$

$$=M(M-1)\rho\_{2N}(e\otimes e) + M\rho\_N(e) + 2tM\rho\_N(e) + t^2,\tag{8.77}$$

with *M* − 1 copies of 1<sub>*B<sup>N</sup>*</sub> and *e* moving from right to left in the first line, leaving *M* terms before the final one *t* · 1<sub>*B<sup>MN</sup>*</sub> in (8.76). In working out the square in (8.76) and moving to the second line we used *e*<sup>2</sup> = *e* as well as permutation invariance of the state ρ<sub>*MN*</sub>. The point is that (8.76) is positive, so that (8.77) must be positive, too, for all *M* ∈ ℕ and *t* ∈ ℝ. Now a function *f*(*t*) = *t*<sup>2</sup> + 2*bt* + *c* = (*t* + *b*)<sup>2</sup> − *b*<sup>2</sup> + *c* obviously satisfies *f*(*t*) ≥ 0 for each *t* iff *b*<sup>2</sup> ≤ *c*, so that (8.76) is positive for all *t* iff

$$M^2 \rho\_N(e)^2 \le M(M-1)\rho\_{2N}(e \otimes e) + M\rho\_N(e).$$

Dividing by *M*<sup>2</sup> and letting *M* → ∞ gives (8.75). □
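As a sanity check on this argument, the following Python sketch evaluates the inequality preceding (8.75) in two classical toy cases with *N* = 1, where *e* is the indicator of the outcome 1, so that ρ<sub>1</sub>(*e*) and ρ<sub>2</sub>(*e* ⊗ *e*) are one-point and two-point probabilities. All names and numbers are our own illustrative assumptions. The first case is a mixture of two coins, which extends to every *M*. The second is drawing twice without replacement from an urn holding one ball marked 1 and one marked 0, which is permutation-invariant on two draws but admits no permutation-invariant extension to three.

```python
from fractions import Fraction

def finite_m_bound_holds(rho1, rho2, M):
    """Check M^2 rho1^2 <= M(M-1) rho2 + M rho1 (the inequality before (8.75), N = 1)."""
    return M * M * rho1 * rho1 <= M * (M - 1) * rho2 + M * rho1

# Case 1 (hypothetical numbers): mixture of two coins, bias -> weight.
weights = [Fraction(1, 3), Fraction(2, 3)]
biases = [Fraction(1, 4), Fraction(9, 10)]
rho1_mix = sum(w * p for w, p in zip(weights, biases))       # probability of a single 1
rho2_mix = sum(w * p * p for w, p in zip(weights, biases))   # probability of two 1s

# Case 2: urn with one 1 and one 0, two draws without replacement.
rho1_urn = Fraction(1, 2)   # each draw shows 1 with probability 1/2
rho2_urn = Fraction(0)      # two 1s can never occur

print(all(finite_m_bound_holds(rho1_mix, rho2_mix, M) for M in range(1, 50)))
print(finite_m_bound_holds(rho1_urn, rho2_urn, 2),
      finite_m_bound_holds(rho1_urn, rho2_urn, 3))
```

In the urn case the bound holds for *M* = 2 and fails for *M* = 3, and indeed ρ<sub>1</sub>(*e*)<sup>2</sup> = 1/4 exceeds ρ<sub>2</sub>(*e* ⊗ *e*) = 0. This illustrates why (8.75) needs a state on all of *B*<sup>∞</sup> rather than on a single *B<sup>N</sup>*.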

Taking *B* = *C*(*X*) for some compact Hausdorff space *X*, in view of (8.41) the situation may be transferred to the Cartesian product *X<sup>N</sup>*, equipped with the product topology (which is generated by products *A*<sub>1</sub> × ··· × *A<sub>N</sub>* ⊂ *X<sup>N</sup>* with each *A<sub>i</sub>* ⊂ *X* open) and the ensuing Borel σ-algebra (generated by the above products with each *A<sub>i</sub>* Borel). If μ<sub>1</sub>, ..., μ<sub>*N*</sub> are (probability) measures on *X* (in which case we write μ<sub>*i*</sub> ∈ Pr(*X*)), then there is a unique (probability) measure μ<sub>1</sub> × ··· × μ<sub>*N*</sub> whose value on a product as above is equal to μ<sub>1</sub>(*A*<sub>1</sub>) ··· μ<sub>*N*</sub>(*A<sub>N</sub>*). In particular, any probability measure μ ∈ Pr(*X*) on *X* defines a probability measure μ<sup>*N*</sup> on *X<sup>N</sup>*.

The symmetric group 𝔖<sub>*N*</sub> acts on *X<sup>N</sup>* in the obvious way, and hence it acts on the power set 𝒫(*X<sup>N</sup>*). We call the latter action σ<sup>(*N*)</sup>, so that for *p* ∈ 𝔖<sub>*N*</sub> we have

$$
\sigma\_p^{(N)}(A\_1 \times \dots \times A\_N) = A\_{p(1)} \times \dots \times A\_{p(N)}.\tag{8.78}
$$

The Cartesian product *X*<sup>∞</sup> ≡ *X*<sup>ℕ</sup> is well defined both topologically and measure-theoretically (the topology is generated by all products ∏<sub>*i*</sub> *A<sub>i</sub>* with finitely many *A<sub>i</sub>* open and different from *X*, and likewise for the Borel structure), and the infinite symmetric group 𝔖<sub>∞</sub> = ∪<sub>*N*</sub> 𝔖<sub>*N*</sub> acts on it in the obvious way, in that *p* ∈ 𝔖<sub>*N*</sub> ⊂ 𝔖<sub>∞</sub> permutes the first *N* coordinates. Specializing Definition 8.5 to *B* = *C*(*X*), we obtain:

Definition 8.10. *A probability measure* ν<sub>*N*</sub> *on X<sup>N</sup> is called:*


$$\square$$

*A probability measure* ν<sub>∞</sub> *on X*<sup>∞</sup> *is called* permutation-invariant *if* ν<sub>∞</sub> ∘ σ<sub>*p*</sub><sup>(*N*)</sup> = ν<sub>∞</sub> *for any p* ∈ 𝔖<sub>*N*</sub> *and N* ∈ ℕ*, where* σ<sub>*p*</sub><sup>(*N*)</sup> *acts on* ∏<sub>*i*</sub> *A<sub>i</sub> by* (8.78) *on the first N factors A*<sub>1</sub>, ..., *A<sub>N</sub> whilst acting trivially on all remaining A<sub>i</sub>'s.*

The connection between the two parts of this definition is that ν<sub>*N*</sub> is exchangeable iff it is the restriction to *X<sup>N</sup>* of some permutation-invariant measure ν<sub>∞</sub> on *X*<sup>∞</sup>.

From Theorems 8.6 and 8.3 we obtain the *Hewitt–Savage Theorem*:

Corollary 8.11. *Let X be a compact Hausdorff space. For any N* ∈ ℕ*, any infinitely exchangeable probability measure* ν<sub>*N*</sub> *on X<sup>N</sup> takes the form*

$$\nu\_N = \int\_{\text{Pr}(X)} dP(\mu) \,\mu^N \,\tag{8.79}$$

*for some probability measure P on* Pr(*X*) *that is uniquely determined by* ν<sub>*N*</sub>*, and similarly for N* = ∞*, where* ν<sub>∞</sub> *is a permutation-invariant probability measure.*

The two claims in the theorem are equivalent by the remark after Definition 8.10.
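To make the form (8.79) concrete, here is a minimal Python sketch for *X* = {0,1}, assuming a mixing measure *P* supported on two coin biases (the dictionary `MIXING` and the function `nu` are hypothetical names chosen for this illustration). It builds ν<sub>*N*</sub> as a mixture of product measures, using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import permutations, product

# Hypothetical mixing measure P on Pr({0,1}): coin bias p -> weight P({p}).
MIXING = {Fraction(1, 4): Fraction(1, 3), Fraction(9, 10): Fraction(2, 3)}

def nu(xs):
    """nu_N(x_1,...,x_N) = sum_p P(p) p^k (1-p)^(N-k), with k the number of ones; cf. (8.79)."""
    k, n = sum(xs), len(xs)
    return sum(w * p**k * (1 - p)**(n - k) for p, w in MIXING.items())

# Total mass, permutation invariance, and consistency of nu_3 with nu_2.
total = sum(nu(xs) for xs in product((0, 1), repeat=3))
invariant = all(nu(perm) == nu(xs)
                for xs in product((0, 1), repeat=3)
                for perm in permutations(xs))
consistent = all(nu(xs) == nu(xs + (0,)) + nu(xs + (1,))
                 for xs in product((0, 1), repeat=2))
print(total, invariant, consistent)
```

Since `nu((1, 1))` differs from `nu((1,)) ** 2`, this ν<sub>*N*</sub> is exchangeable without being a product measure.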

The probability measure *P* ∈ Pr(Pr(*X*)) has the following interpretation. For *N* ∈ ℕ and (*x*<sub>1</sub>, ..., *x<sub>N</sub>*) ∈ *X<sup>N</sup>*, define the so-called *empirical measure* *E<sub>N</sub>*<sup>(*x*<sub>1</sub>,...,*x<sub>N</sub>*)</sup> on *X* as

$$E\_N^{(x\_1,\ldots,x\_N)} = \frac{1}{N} \sum\_{i=1}^N \delta\_{x\_i},\tag{8.80}$$

where δ<sub>*x*</sub> is the Dirac measure at *x* ∈ *X*. Seen as a map on *C*(*X*), this is the same as

$$\int\_{X} dE\_{N}^{(x\_{1},\ldots,x\_{N})} f = \frac{1}{N} \sum\_{i=1}^{N} f(x\_{i}).\tag{8.81}$$

Given a probability measure ν<sub>*N*</sub> on *X<sup>N</sup>*, these formulae give a random probability measure on *X* depending on a drawing from *X<sup>N</sup>*, i.e., a map

$$E\_N: X^N \to \Pr(X);\tag{8.82}$$

$$(x\_1, \dots, x\_N) \mapsto E\_N^{(x\_1, \dots, x\_N)}.\tag{8.83}$$
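For a finite sample the map (8.82)–(8.83) is elementary to compute. The following Python sketch (function names are our own) represents the empirical measure (8.80) as a dictionary of point masses and evaluates the integral (8.81).

```python
from collections import Counter
from fractions import Fraction

def empirical_measure(sample):
    """Return E_N^{(x_1,...,x_N)} as a dict x -> mass, i.e. (1/N) sum_i delta_{x_i}; cf. (8.80)."""
    n = len(sample)
    return {x: Fraction(count, n) for x, count in Counter(sample).items()}

def integrate(measure, f):
    """Integrate f against a finitely supported measure; for E_N this is (8.81)."""
    return sum(mass * f(x) for x, mass in measure.items())

sample = [1, 2, 2, 5]
E = empirical_measure(sample)
mean_square = integrate(E, lambda x: x * x)             # left-hand side of (8.81)
direct = Fraction(sum(x * x for x in sample), len(sample))  # right-hand side of (8.81)
print(E, mean_square == direct)
```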

Proposition 8.12. *The probability measure P in Corollary 8.11 is given by*

$$\lim\_{N \to \infty} \int\_{\text{Pr}(X)} dP\_N F = \int\_{\text{Pr}(X)} dP F,\tag{8.84}$$

*for each F* ∈ *C*(Pr(*X*)) *(that is, P* = lim<sub>*N*→∞</sub> *P<sub>N</sub> weakly), where P<sub>N</sub>* ∈ Pr(Pr(*X*)) *is the probability measure on* Pr(*X*) *defined by* ν<sub>*N*</sub> ∈ Pr(*X<sup>N</sup>*) *and* (8.82)–(8.83)*, i.e.,*

$$P\_N(A) = \nu\_N(E\_N^{-1}(A)) \quad (A \subset \Pr(X)). \tag{8.85}$$

*Proof.* By the Stone–Weierstrass Theorem it suffices to prove (8.84) for linear combinations of monomials like *F*(μ) = μ(*f*<sub>1</sub>) ··· μ(*f<sub>K</sub>*), where *f*<sub>1</sub>, ..., *f<sub>K</sub>* ∈ *C*(*X*) are arbitrary and μ(*f*) = ∫<sub>*X*</sub> *d*μ *f*. This is a simple computation: using (8.85), we have

$$\begin{aligned} \int\_{\text{Pr}(X)} dP\_N F &= \int\_{X^N} d\nu\_N(x\_1, \dots, x\_N) F(E\_N^{(x\_1, \dots, x\_N)}) \\ &= \int\_{X^N} d\nu\_N(x\_1, \dots, x\_N) \prod\_{j=1}^K \left( \frac{1}{N} \sum\_{i=1}^N f\_j(x\_i) \right) \\ &= \int\_{\text{Pr}(X)} dP(\mu) \int\_{X^N} d\mu^N(x\_1, \dots, x\_N) \prod\_{j=1}^K \left( \frac{1}{N} \sum\_{i=1}^N f\_j(x\_i) \right), \end{aligned}$$

where in the third step we used (8.79). The result follows, since clearly

$$\begin{aligned} \lim\_{N \to \infty} \int\_{\text{Pr}(X)} dP(\mu) \int\_{X^N} d\mu^N(x\_1, \dots, x\_N) \prod\_{j=1}^K \left( \frac{1}{N} \sum\_{i=1}^N f\_j(x\_i) \right) &= \\ \int\_{\text{Pr}(X)} dP(\mu) \int\_X d\mu(x\_1) f\_1(x\_1) \dotsm \int\_X d\mu(x\_K) f\_K(x\_K) &= \int\_{\text{Pr}(X)} dP\, F. \quad \square \end{aligned}$$
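The convergence (8.84) can be followed exactly in a small example. The Python sketch below assumes *X* = {0,1}, a hypothetical two-point mixing measure `MIXING`, and the test function *F*(μ) = μ(*f*)<sup>2</sup> with *f*(*x*) = *x*. For this *F* the integral against *P<sub>N</sub>* is the ν<sub>*N*</sub>-expectation of (*k*/*N*)<sup>2</sup>, where *k* counts the ones, and it differs from the integral against *P* by the *P*-average of *p*(1 − *p*), divided by *N*.

```python
from fractions import Fraction
from math import comb

# Hypothetical mixing measure P on Pr({0,1}): coin bias p -> weight P({p}).
MIXING = {Fraction(1, 4): Fraction(1, 3), Fraction(9, 10): Fraction(2, 3)}

def integral_against_P_N(n):
    """Integral of F(mu) = mu(f)^2 against P_N: the nu_N-expectation of (k/N)^2."""
    return sum(w * comb(n, k) * p**k * (1 - p)**(n - k) * Fraction(k, n)**2
               for p, w in MIXING.items() for k in range(n + 1))

def integral_against_P():
    """Integral of F(mu) = mu(f)^2 against P: the P-average of p^2."""
    return sum(w * p**2 for p, w in MIXING.items())

def gap(n):
    """Predicted difference: the P-average of p(1-p), divided by N."""
    return sum(w * p * (1 - p) for p, w in MIXING.items()) / n

for n in (1, 10, 100):
    print(n, float(integral_against_P_N(n) - integral_against_P()))
```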

We can also say more about the limit of the sum (8.81). So far, we have been dealing with the Borel σ-algebras ℬ<sub>*N*</sub> ⊂ 𝒫(*X<sup>N</sup>*) and ℬ<sub>∞</sub> ⊂ 𝒫(*X*<sup>∞</sup>) generated by the topology (i.e., by the open sets). On top of this, consider 𝒮<sub>*N*</sub> ⊂ ℬ<sub>*N*</sub>, defined as the σ-algebra generated by the permutation-invariant Borel subsets of *X<sup>N</sup>*, or, equivalently, as the smallest σ-algebra for which the permutation-invariant Borel measurable functions on *X<sup>N</sup>* are measurable. Likewise, 𝒮<sub>∞</sub> ⊂ ℬ<sub>∞</sub>; regarding *A* ⊂ *X<sup>N</sup>* as a subset *A* × ∏<sub>*K*>*N*</sub> *X* of *X*<sup>∞</sup>, we have 𝒮<sub>∞</sub> = ∩<sub>*N*∈ℕ</sub> 𝒮<sub>*N*</sub>. For any permutation-invariant probability measure ν<sub>*N*</sub> on *X<sup>N</sup>*, the Hilbert space *L*<sup>2</sup>(*X<sup>N</sup>*, 𝒮<sub>*N*</sub>, ν<sub>*N*</sub>) is a closed subspace of *L*<sup>2</sup>(*X<sup>N</sup>*, ℬ<sub>*N*</sub>, ν<sub>*N*</sub>), and the associated conditional expectation

$$E\_{(\mathcal{S}\_{N}, \nu\_{N})}: L^2(X^N, \mathcal{B}\_{N}, \nu\_{N}) \to L^2(X^N, \mathcal{S}\_{N}, \nu\_{N}) \tag{8.86}$$

is defined as the corresponding orthogonal projection. Since *C*(*X<sup>N</sup>*) ⊂ *L*<sup>2</sup>(*X<sup>N</sup>*), this map restricts to *C*(*X<sup>N</sup>*). Similarly for *N* = ∞. For each *N* ∈ ℕ, and also for *N* = ∞, we may regard *f* ∈ *C*(*X*) as a function *f<sub>K</sub>* on *X<sup>N</sup>* through

$$f\_{K}(x\_1, \dots, x\_N) = f(x\_K) \quad (K = 1, \dots, N). \tag{8.87}$$

Proposition 8.13. *Let* ν<sub>∞</sub> *be a permutation-invariant probability measure on X*<sup>∞</sup>*, with restriction* ν<sub>*N*</sub> *to X<sup>N</sup>. Recall* (8.42)*. For any f* ∈ *C*(*X*) *we have* pointwise*:*

$$S\_{1,N}(f) = E\_{(\mathcal{S}\_{N},\nu\_{N})}(f\_1), \quad \nu\_{N}\text{-almost surely},\tag{8.88}$$

$$\lim\_{N \to \infty} S\_{1, N}(f) = E\_{(\mathcal{S}\_{\infty}, \nu\_{\infty})}(f\_{1}), \quad \nu\_{\infty}\text{-almost surely},\tag{8.89}$$

*where the left-hand sides of* (8.88) *and* (8.89) *are functions on X<sup>N</sup> and X*<sup>∞</sup>*, respectively. Furthermore, if* ν<sub>∞</sub> = μ<sup>∞</sup> *for some* μ ∈ Pr(*X*)*, then pointwise on X*<sup>∞</sup>*,*

$$\lim\_{N \to \infty} S\_{1,N}(f) = \int\_X d\mu \, f, \quad \mu^{\infty}\text{-almost surely } (f \in C(X)). \tag{8.90}$$

Equivalently, if *L*<sub>μ</sub> ⊂ *X*<sup>∞</sup> is the set of infinite sequences (*x*<sub>1</sub>, *x*<sub>2</sub>, ...) in *X*<sup>∞</sup> for which the limit in (8.90) exists for each *f* ∈ *C*(*X*) and equals ∫<sub>*X*</sub> *d*μ *f*, then

$$
\mu^{\infty}(L\_{\mu}) = 1.\tag{8.91}
$$

*Proof.* Eq. (8.88) is almost trivial, since *S*<sub>1,*N*</sub>(*f*) is permutation invariant and hence already lies in *L*<sup>2</sup>(*X<sup>N</sup>*, 𝒮<sub>*N*</sub>, ν<sub>*N*</sub>), so that the equality just expresses the projection property *E*<sup>2</sup><sub>(𝒮<sub>*N*</sub>,ν<sub>*N*</sub>)</sub> = *E*<sub>(𝒮<sub>*N*</sub>,ν<sub>*N*</sub>)</sub>. Eq. (8.89) follows from the ergodic theorem, applied to the probability space (*X*<sup>∞</sup>, ℬ<sub>∞</sub>, ν<sub>∞</sub>), the *unilateral shift*

$$T: (x\_1, x\_2, \ldots) \mapsto (x\_2, x\_3, \ldots),$$

and the random variable *f*<sub>1</sub> defined by *f* ∈ *C*(*X*) via (8.87). Since ν<sub>∞</sub> is permutation invariant, it is also *T*-invariant (in the sense that ν<sub>∞</sub>(*T*<sup>−1</sup>(*A*)) = ν<sub>∞</sub>(*A*) for any *A* ∈ ℬ<sub>∞</sub>). This follows either directly, where one has to realize firstly that

$$T^{-1}(A\_1 \times A\_2 \times \cdots \times A\_n \times \cdots) = X \times A\_1 \times A\_2 \times \cdots \times A\_n \times \cdots,$$

and secondly that ℬ<sub>∞</sub> is generated by products ∏<sub>*i*</sub> *A<sub>i</sub>* with finitely many *A<sub>i</sub>* different from *X*, or, more easily, from Corollary 8.11. The (pointwise) ergodic theorem gives

$$\lim\_{N \to \infty} S\_{1,N}(f) = E\_{(\mathcal{B}\_{T}, \nu\_{\infty})}(f\_{1}), \quad \nu\_{\infty}\text{-almost surely } (f \in C(X)),\tag{8.92}$$

where ℬ<sub>*T*</sub> is the σ-algebra within ℬ<sub>∞</sub> formed by the *T*-invariant sets, and *f*<sub>1</sub> ∈ *C*(*X*<sup>∞</sup>) is still defined by (8.87). Since 𝒮<sub>∞</sub> ⊂ ℬ<sub>*S*</sub> and the left-hand side of (8.89) is 𝒮<sub>∞</sub>-measurable (provided it exists, as we have just shown), eq. (8.89) follows from (8.92).

If ν<sub>∞</sub> = μ<sup>∞</sup>, then the unilateral shift on *X*<sup>∞</sup> is ergodic by Kolmogorov's 0–1 law, and hence the ergodic theorem gives (8.90). Alternatively, if ν<sub>∞</sub> = μ<sup>∞</sup>, then the random variables (*f<sub>N</sub>*), defined by (8.87) with *N* = ∞, are i.i.d. (i.e., independent and identically distributed) and (8.90) follows from the strong law of large numbers (which, coherently, in turn may be derived from the ergodic theorem!). □

Note that (8.92) has been proved for *f* ∈ *C*(*X*), but it holds for many other functions, including *f* = 1<sub>*A*</sub>, where *A* ∈ ℬ. This gives *Borel's law of large numbers*

$$\lim\_{N \to \infty} S\_{1, N}(1\_A) = \mu(A), \quad \mu^{\infty}\text{-almost surely.} \tag{8.93}$$

For example, take *X* = {0,1} (e.g., a coin toss with outcomes 1 = heads and 0 = tails). With *f*(*x*) = *x* in (8.90) or *A* = {1} in (8.93), writing *p* = μ({1}), we obtain

$$\lim\_{N \to \infty} \frac{1}{N} \sum\_{i=1}^{N} x\_i = p, \quad \mu^{\infty}\text{-almost surely on }\underline{2}^{\mathbb{N}}.\tag{8.94}$$

Equivalently, if *L<sub>p</sub>* ⊂ <u>2</u><sup>ℕ</sup> is the set of infinite binary sequences *x*<sub>1</sub>*x*<sub>2</sub> ··· for which the limit in (8.94) exists and equals *p*, then μ<sup>∞</sup>(*L<sub>p</sub>*) = 1, cf. (8.91).
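A finite simulation can of course only illustrate (8.94), not verify it, since the statement concerns the limit *N* → ∞. With that caveat, the following Python sketch (the function name, bias, sample size, and seed are arbitrary choices of ours) computes the relative frequency of heads in one pseudo-random run of coin tosses.

```python
import random

def relative_frequency(p, n, seed=0):
    """Relative frequency (1/N) sum_i x_i of heads in n tosses of a coin with bias p."""
    rng = random.Random(seed)  # fixed seed, so the run is reproducible
    heads = sum(rng.random() < p for _ in range(n))
    return heads / n

# One run with p = 0.3: the frequency should lie close to p for large n.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(0.3, n))
```

The standard deviation of the frequency is (*p*(1 − *p*)/*N*)<sup>1/2</sup>, about 0.0014 for the largest run above, so deviations from *p* of that order are expected at any finite *N*.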

#### 8.4 Frequency interpretation of probability and Born rule

Results like (8.90), (8.93), and (8.94) give a relationship between the single-case probabilities μ(*A*) or *p* and the limits of long series of trials on samples drawn according to μ or *p*. Despite the seemingly comforting appearance of *N* < ∞ on the left-hand side, this relationship depends in an essential way on the infinite idealization *X*<sup>∞</sup>, which is strictly necessary in order to be able to say that the limit (8.94) holds almost surely relative to the measure μ<sup>∞</sup>. This violates Earman's Principle (cf. the Introduction), which is the reason why we prefer the limit (8.49) over (8.93).

Although these results are mathematically equivalent, both formalizing the idea that if (*x*<sub>1</sub>, ..., *x<sub>N</sub>*) are sampled from *X* according to some probability measure μ, then (1/*N*)∑<sub>*i*=1</sub><sup>*N*</sup> *f*(*x<sub>i</sub>*) converges to ∫<sub>*X*</sub> *d*μ *f* as *N* → ∞, in (8.49) we never need to work with the "actual infinity" *N* = ∞ and (8.49) holds everywhere on Pr(*X*) rather than almost everywhere on *X*<sup>∞</sup>. One reason for this is that in (8.93) etc. the choice of the sampling measure μ has to be made at the beginning, whereas in (8.49) it only comes in at the very end. But it has to be made either way, and similarly for any other serious effort to relate probability to frequencies in long runs of measurements.

The extreme delicacy of such efforts is clear from the fact that limiting results like (8.90), (8.93), and (8.94) are insensitive to any finite part of the sum, whereas any practical use of probability only involves finite trials. As Lord Keynes once said:

'In the long run we are all dead.'

The founder of the mathematical theory of probability expressed himself likewise:

'The frequency concept based on the notion of limiting frequency as the number of trials increased to infinity, does not contribute anything to substantiate the applicability of the results of probability theory to real practical problems where we have always to deal with a finite number of trials.' (Kolmogorov).

Moreover, a *definition* of probability based on e.g. (8.93) is well known to be circular: although superficially the "almost sure" terminology in the statement of the result might instill confidence in the reader, in fact it is an exceptionally strong constraint on the sequences (*x<sub>n</sub>*) ∈ *X*<sup>∞</sup> in question that the limit should exist *and* have the right value μ(*A*), i.e., that (*x<sub>n</sub>*) ∈ *L*<sub>μ</sub>, cf. (8.91), and we see that this constraint can only be formulated if the single-case probability μ was already defined in the first place. This shows that the link between probability and frequencies of outcomes of long runs of trials only exists and makes sense if single-case probabilities are prior.

On the other hand, if single-case probabilities are "objective", as those provided by the Born measure in quantum mechanics ought to be at least in remotely realistic interpretations of the theory (as opposed to "personal" or "subjective" probabilities construed as "degrees of belief" or "rationality constraints" or whatever other decision-theoretic concept in human psychology), then it is hard to say what they really mean, since it is precisely about single cases that they do not seem to say anything. This brings us to what we propose to call the *Paradox of Probability*:

*Although single-case probabilities must be logically prior to probabilities construed as frequencies, the numerical values of the former have no bearing on single trials and can only be validated through their predictions about (finite) frequencies.*

This paradox imposes the following consistency requirement (which philosophers may want to compare with Lewis's "Principal Principle" that regulates credences):

*The assumption that a single-case probability measure be* μ *must imply that the probabilities for the various outcomes of long runs of repetitions of identical experiments (provided these are possible) are distributed according to* μ*.*

This describes the relationship between theoretical and experimental physics quite well, but still leaves us in the dark as to the meaning of single-case probabilities!

We are now ready to revisit the Born rule, which we already discussed from a purely mathematical point of view in §§2.1, 2.5, and 4.1. To repeat the main point, if *a* = *a*<sup>∗</sup> ∈ *B*(*H*) is a bounded self-adjoint operator on a Hilbert space, with spectrum σ(*a*), then any state ω on *B*(*H*) defines a unique probability measure μ<sub>ω</sub> on σ(*a*) ⊂ ℝ, called the *Born measure*, such that

$$\omega(f(a)) = \int\_{\sigma(a)} d\mu\_{\omega}\, f, \quad f \in C(\sigma(a)), \tag{8.95}$$

where *f*(*a*) ∈ *C*<sup>∗</sup>(*a*) ⊂ *B*(*H*) is defined through the continuous functional calculus (Theorem 4.3). For example, for *f* = id<sub>σ(*a*)</sub>, i.e., the function *x* ↦ *x*, eq. (8.95) yields

$$
\omega(a) = \int\_{\sigma(a)} d\mu\_{\omega}(\lambda) \,\lambda \,. \tag{8.96}
$$
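In finite dimension the Born measure is a finite sum of point masses at the eigenvalues, weighted by the squared overlaps of the state vector with the eigenvectors. The following Python sketch works this out for a deliberately simple case, assumed purely for illustration: *H* = ℂ<sup>2</sup>, *a* the Pauli matrix with rows (0,1) and (1,0), and the real unit vector (cos θ, sin θ). It then checks (8.95) for *f*(*x*) = *x* and *f*(*x*) = *x*<sup>2</sup> against a direct matrix computation.

```python
import math

A = [[0.0, 1.0], [1.0, 0.0]]  # assumed observable a (Pauli-X), spectrum {+1, -1}
S = 1 / math.sqrt(2)
EIGENPAIRS = [(1.0, (S, S)), (-1.0, (S, -S))]  # (eigenvalue, normalized eigenvector)

def born_measure(theta):
    """Born measure of A in the vector state (cos theta, sin theta): eigenvalue -> |<u, v>|^2."""
    v = (math.cos(theta), math.sin(theta))
    return {lam: (u[0] * v[0] + u[1] * v[1]) ** 2 for lam, u in EIGENPAIRS}

def expectation(matrix, theta):
    """The state omega(b) = <v, b v> for a real 2x2 matrix b."""
    v = (math.cos(theta), math.sin(theta))
    return sum(v[i] * matrix[i][j] * v[j] for i in range(2) for j in range(2))

def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

theta = 0.3
mu = born_measure(theta)
lhs_f1, rhs_f1 = expectation(A, theta), sum(lam * m for lam, m in mu.items())
lhs_f2, rhs_f2 = expectation(matmul(A, A), theta), sum(lam**2 * m for lam, m in mu.items())
print(mu, lhs_f1, rhs_f1, lhs_f2, rhs_f2)
```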

The point of this construction of the Born measure is that it is obtained by simply restricting the state ω, initially defined on *B*(*H*), to its commutative C\*-subalgebra *C*<sup>∗</sup>(*a*). If, in the spirit of (exact) Bohrification, such commutative algebras are identified with corners of classical physics within quantum theory, one may argue that Heisenberg gave the right picture of the origin of probability in quantum mechanics:

'One may call these uncertainties objective, in that they are simply a consequence of the fact that we describe the experiment in terms of classical physics; they do not depend in detail on the observer. One may call them subjective, in that they reflect our incomplete knowledge of the world.' (Heisenberg, 1958, pp. 53–54)

See, however, §11.1. In any case, there are extensions of this construction to unbounded self-adjoint operators as well as to families of commuting self-adjoint operators, to which the following discussion applies, too, *mutatis mutandis*.

The *Born rule* relates the Born measure for *a* to measurements of *a* and as such is responsible for most predictions of quantum physics, especially in quantum field theory, where the connection between theory and experiment mainly involves the measurement of cross-sections computed from the Born measure via Feynman rules. The Born rule and the Heisenberg uncertainty relations are often seen as a turning point where indeterminism entered fundamental physics. Nonetheless, it is hard to say what this Born rule actually states! We made a first attempt in §4.1:

*If an observable a is measured in a state* ω*, then the probability P*<sub>ω</sub>(*a* ∈ *A*) *that the outcome lies in some measurable subset A* ⊆ σ(*a*) ⊂ ℝ *is given by*

$$P\_{\omega}(a \in A) = \mu\_{\omega}(A). \tag{8.97}$$

Two questions immediately arise:


Perhaps these are even the main questions in the foundations of quantum mechanics. The first will be taken up in Chapter 11; for now, we simply assume that measurements of quantum-mechanical observables *a* are defined and have outcomes in σ(*a*). The second has just been answered (or some might say evaded): through the Born measure, the formalism of quantum mechanics provides numerical values of μ<sub>ω</sub>(*A*), whose *mathematical* meaning seems unquestionable, and whose *operational* meaning is given by the predictions they give for outcomes of long runs of repetitions of identical experiments. Therefore, all that remains to be done is to derive these predictions by analogy with the results in §8.3 for the commutative C\*-algebra *C*(*X*).

One such attempt is (in its strengths and its weaknesses) quite analogous to Borel's law of large numbers (8.93). Although we will soon move to *B* = *B*(*H*), the following result is valid for any unital C\*-algebra *B*, with infinite tensor product *B*<sup>∞</sup> as defined in §C.14 and recalled at the end of §8.2, including the map ϕ<sub>*M*</sub> : *B<sup>M</sup>* → *B*<sup>∞</sup>.

Proposition 8.14. *If* ω ∈ *S*(*B*)*, there is a unique state* ω<sup>∞</sup> *on B*<sup>∞</sup> *such that*

$$\omega^{\infty}(\varphi\_{M}(b\_1 \otimes \cdots \otimes b\_M)) = \prod\_{n=1}^{M} \omega(b\_n), \quad M \in \mathbb{N},\ b\_1, \dots, b\_M \in B. \tag{8.98}$$

*Moreover,* ω<sup>∞</sup> *is pure iff* ω *is pure.*

This is a special case of Proposition C.105, with *C<sub>i</sub>* = *B* and ω<sub>*i*</sub> = ω for all *i* ∈ ℕ.

We now take *B* = *B*(*H*) for some separable Hilbert space *H*, some observable *a* = *a*<sup>∗</sup> ∈ *B*(*H*) with spectrum σ(*a*) ⊂ ℝ, and some unit vector υ ∈ *H*, with associated (normal) pure state ω<sub>υ</sub> on *B*(*H*) defined by ω<sub>υ</sub>(*b*) = ⟨υ, *b*υ⟩, and Born measure μ<sub>ω<sub>υ</sub></sub> ≡ μ<sub>υ</sub> on σ(*a*). Now take the corresponding pure state ω<sub>υ</sub><sup>∞</sup> on *B*(*H*)<sup>∞</sup> and construct the associated GNS-representation π<sub>ω<sub>υ</sub><sup>∞</sup></sub>(*B*(*H*)<sup>∞</sup>). The Hilbert space *H*<sub>ω<sub>υ</sub><sup>∞</sup></sub> carrying this representation is an example of an *infinite tensor product of Hilbert spaces* in the sense of von Neumann, which may also be defined directly, as follows.

Take sequences (ψ<sub>*n*</sub>) ≡ (ψ<sub>1</sub>, ψ<sub>2</sub>, ...) with ψ<sub>*n*</sub> ∈ *H* satisfying the condition

$$\sum\_{n} \big|\, \|\psi\_{n}\| - 1 \,\big| < \infty;\tag{8.99}$$

the rationale behind this condition is that for any sequence (*z<sub>n</sub>*) of complex numbers, the product ∏<sub>*n*</sub> *z<sub>n</sub>* converges *and* has a nonzero limit iff ∑<sub>*n*</sub> |*z<sub>n</sub>* − 1| < ∞, so (8.99) is equivalent to the requirement that ∏<sub>*n*</sub> ‖ψ<sub>*n*</sub>‖ converges to some nonzero value. Following von Neumann, we now introduce the convention that if, for some sequence (*z<sub>n</sub>*) of complex numbers, ∏<sub>*n*</sub> |*z<sub>n</sub>*| converges but ∏<sub>*n*</sub> *z<sub>n</sub>* does not, we define the latter to be zero. On this convention, linear and continuous extension of the expression

$$
\langle (\psi\_n), (\psi\_n') \rangle = \prod\_n \langle \psi\_n, \psi\_n' \rangle\_H,\tag{8.100}
$$

defines an inner product on the finite linear span *H*<sub>0</sub><sup>∞</sup> of all sequences (ψ<sub>*n*</sub>) satisfying (8.99); the *complete tensor product H*<sup>∞</sup> is defined as the closure of *H*<sub>0</sub><sup>∞</sup> in the ensuing norm. However, this is not the Hilbert space of interest, since it is far too large (e.g., it is not separable even if *H* is). To define interesting separable subspaces of *H*<sup>∞</sup>, we call sequences (ψ<sub>*n*</sub>) and (ψ′<sub>*n*</sub>) that both satisfy (8.99) *equivalent* if

$$\sum\_{n} |\langle \psi\_{n}, \psi\_{n}' \rangle - 1| < \infty;\tag{8.101}$$

this turns out to be a *bona fide* equivalence relation. In particular, if (ψ<sub>*n*</sub>) and (ψ′<sub>*n*</sub>) are *in*equivalent, then ⟨(ψ<sub>*n*</sub>), (ψ′<sub>*n*</sub>)⟩ = 0. For any unit vector υ ∈ *H*, we now define the *incomplete tensor product H*<sub>υ</sub><sup>∞</sup> as the closure of the linear span of all sequences (ψ<sub>*n*</sub>) that satisfy (8.99) and are equivalent to υ<sup>∞</sup> (i.e., the sequence (ψ′<sub>*n*</sub>) with ψ′<sub>*n*</sub> = υ for each *n*), with inner product borrowed from *H*<sup>∞</sup> (note that von Neumann's terminology "incomplete" is somewhat confusing, since *H*<sub>υ</sub><sup>∞</sup> is complete as a normed vector space and in particular it is a Hilbert space). By construction, υ<sup>∞</sup> ∈ *H*<sub>υ</sub><sup>∞</sup>, and it is easy to show that *H*<sub>υ</sub><sup>∞</sup> is the closed linear span of all sequences (ψ<sub>*n*</sub>) that differ from υ ∈ *H* in at most finitely many places. We often write ⊗<sub>*n*</sub>ψ<sub>*n*</sub> or ψ<sub>1</sub> ⊗ ψ<sub>2</sub> ⊗ ··· for (ψ<sub>*n*</sub>). Furthermore, for any *M* ∈ ℕ, any *b* ∈ *B*(*H*) defines a bounded operator *b*<sub>υ</sub><sup>(*M*)</sup> on *H*<sub>υ</sub><sup>∞</sup> by continuous linear extension of

$$b\_{\upsilon}^{(M)}(\psi\_1 \otimes \psi\_2 \otimes \cdots \otimes \psi\_M \otimes \cdots) = \psi\_1 \otimes \psi\_2 \otimes \cdots \otimes b\psi\_M \otimes \cdots \,. \tag{8.102}$$

This extends to a representation π<sub>υ</sub><sup>∞</sup> of *B*<sup>∞</sup> on *H*<sub>υ</sub><sup>∞</sup>, as follows. Define *b*<sup>(*M*)</sup> ∈ *B*<sup>∞</sup> by

$$b^{(M)} = \varphi\_{M}(1\_H \otimes \cdots \otimes 1\_H \otimes b),\tag{8.103}$$

in which 1<sub>*H*</sub> ⊗ ··· ⊗ 1<sub>*H*</sub> ⊗ *b* ∈ *B<sup>M</sup>*, and ϕ<sub>*M*</sub> : *B<sup>M</sup>* → *B*<sup>∞</sup> was defined after (8.58). In other words, for *b* ∈ *B*(*H*), the operator *b*<sup>(*M*)</sup> is the element of *B*<sup>∞</sup> given by the equivalence class [*a*<sub>1/*N*</sub>]<sub>*N*</sub> of the sequence (*a*<sub>1/*N*</sub>)<sub>*N*</sub> with 1<sub>*B*</sub> in every place except *a*<sub>1/*M*</sub> = *b*. We then define π<sub>υ</sub><sup>∞</sup>(*B*<sup>∞</sup>) by linear and continuous extension of

$$
\pi\_{\upsilon}^{\infty}(b\_1^{(M\_1)} \cdots b\_N^{(M\_N)}) = b\_{1\upsilon}^{(M\_1)} \cdots b\_{N\upsilon}^{(M\_N)}.\tag{8.104}
$$
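Before continuing, the convergence criterion behind (8.99) and the equivalence relation (8.101) may be illustrated numerically. The Python sketch below uses examples of our own choosing: partial products of three real sequences, and unit vectors ψ′<sub>*n*</sub> = (cos α<sub>*n*</sub>, sin α<sub>*n*</sub>) in ℝ<sup>2</sup> compared with υ = (1, 0), for which ⟨υ, ψ′<sub>*n*</sub>⟩ = cos α<sub>*n*</sub>. Finite partial sums and products can only suggest, not establish, the limiting behaviour.

```python
import math

def partial_product(z, n_max, n_min=1):
    """Partial product of z(n) for n = n_min, ..., n_max."""
    result = 1.0
    for n in range(n_min, n_max + 1):
        result *= z(n)
    return result

def overlap_product(angles):
    """Partial product of <v, psi'_n> = cos(a_n), i.e. (8.100) truncated to finitely many factors."""
    return partial_product(lambda n: math.cos(angles(n)), 10_000)

def equivalence_sum(angles):
    """Partial sum of |<v, psi'_n> - 1|, the series in (8.101)."""
    return sum(abs(math.cos(angles(n)) - 1) for n in range(1, 10_001))

# sum |z_n - 1| finite: the product settles at a nonzero value (here sinh(pi)/pi).
print(partial_product(lambda n: 1 + 1 / n**2, 200_000), math.sinh(math.pi) / math.pi)
# sum |z_n - 1| infinite: the partial products tend to 0 or grow without bound.
print(partial_product(lambda n: 1 - 1 / n, 10_000, n_min=2),
      partial_product(lambda n: 1 + 1 / n, 10_000))
# Angles 1/n give a summable series (equivalent to v^infinity); a fixed angle does not.
print(equivalence_sum(lambda n: 1 / n), overlap_product(lambda n: 1 / n))
print(equivalence_sum(lambda n: 0.1), overlap_product(lambda n: 0.1))
```

In the last case the overlap product decays geometrically, in line with the statement that inequivalent sequences are orthogonal.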

Proposition 8.15. *For any unit vector* υ ∈ *H, the* GNS*-representation* π<sub>ω<sub>υ</sub><sup>∞</sup></sub>(*B*<sup>∞</sup>) *on H*<sub>ω<sub>υ</sub><sup>∞</sup></sub> *is unitarily equivalent with* π<sub>υ</sub><sup>∞</sup>(*B*<sup>∞</sup>) *on H*<sub>υ</sub><sup>∞</sup>*, under which equivalence the cyclic vector* Ω<sub>ω<sub>υ</sub><sup>∞</sup></sub> ∈ *H*<sub>ω<sub>υ</sub><sup>∞</sup></sub> *corresponds with* υ<sup>∞</sup> ∈ *H*<sub>υ</sub><sup>∞</sup>*.*

*Proof.* This is a simple consequence of Proposition C.91 and the equality

$$
\omega\_{\upsilon}^{\infty}(a) = \langle \upsilon^{\infty}, \pi\_{\upsilon}^{\infty}(a) \upsilon^{\infty} \rangle\_{H\_{\upsilon}^{\infty}},\tag{8.105}
$$

initially for *a* = *b*<sup>(*M*)</sup>, subsequently for *a* = *b*<sub>1</sub><sup>(*M*<sub>1</sub>)</sup> ··· *b<sub>N</sub>*<sup>(*M<sub>N</sub>*)</sup>, and finally, by linearity and continuity, for any *a* ∈ *B*<sup>∞</sup>. □

In view of this, we will henceforth identify the two Hilbert spaces etc., so that:

$$H\_{\omega\_{\upsilon}^{\infty}} = H\_{\upsilon}^{\infty};\tag{8.106}$$

$$
\pi\_{\omega\_{\upsilon}^{\infty}}(b^{(M)}) = b\_{\upsilon}^{(M)}; \tag{8.107}
$$

$$
\Omega\_{\omega\_{\upsilon}^{\infty}} = \upsilon^{\infty}.\tag{8.108}
$$

Recall that 𝒫(*H*) is the set of all projections on *H*, seen as a lattice ordered by *e* ≤ *f* iff *ef* = *e*, which is equivalent to *eH* ⊆ *fH*, and coincides with the order in *B*(*H*)<sub>sa</sub>, cf. Proposition C.170. Also, ℬ is the Boolean lattice of Borel subsets of σ(*a*), ordered by inclusion. For each Borel set *A* ⊂ σ(*a*) we have an associated spectral projection *e<sub>A</sub>* ∈ 𝒫(*H*), and the map *A* ↦ *e<sub>A</sub>* defined by the Borel functional calculus, i.e., Theorem B.102, is a lattice homomorphism from ℬ to 𝒫(*H*). This follows because from the perspective of the Borel functional calculus the map *A* ↦ *e<sub>A</sub>* is really the map 1<sub>*A*</sub> ↦ *e<sub>A</sub>*, which is the restriction of a homomorphism between C\*-algebras and hence preserves positivity. Let ℬ<sub>∞</sub> be the Boolean lattice of Borel sets in σ(*a*)<sup>∞</sup>. As above, take some unit vector υ ∈ *H*, with corresponding vector state ω<sub>υ</sub> on *B*(*H*) and associated state ω<sub>υ</sub><sup>∞</sup> on *B*(*H*)<sup>∞</sup> as defined in Proposition 8.14, which in turn defines the GNS-representation π<sub>ω<sub>υ</sub><sup>∞</sup></sub> of *B*(*H*)<sup>∞</sup> on the Hilbert space *H*<sub>ω<sub>υ</sub><sup>∞</sup></sub>. The lattice homomorphism *A* ↦ *e<sub>A</sub>* then extends to a homomorphism

$$e^{\infty} : \mathcal{B}\_{\infty} \to \mathcal{P}(H\_{\omega\_{\upsilon}^{\infty}}); \tag{8.109}$$

$$A\_1 \times \cdots \times A\_M \times \prod\_{M+1}^{\infty} \sigma(a) \mapsto \pi\_{\omega\_{\upsilon}^{\infty}}(e\_{A\_1}^{(1)} \cdots e\_{A\_M}^{(M)});\tag{8.110}$$

this defines *e*<sup>∞</sup> on the basic Borel sets in σ(*a*)<sup>∞</sup> and extends to all of ℬ<sub>∞</sub>. Realizing *H*<sub>ω<sub>υ</sub><sup>∞</sup></sub> as the infinite tensor product *H*<sub>υ</sub><sup>∞</sup>, cf. (8.106)–(8.108), we rewrite this as

$$e^{\infty}\left(A\_1 \times \cdots \times A\_M \times \prod\_{M+1}^{\infty} \sigma(a)\right) = e^{(1)}\_{A\_1 \upsilon} \cdots e^{(M)}\_{A\_M \upsilon}.\tag{8.111}$$

Theorem 8.16. *Let a* = *a*<sup>∗</sup> ∈ *B*(*H*)*, let* μ<sub>υ</sub> *be the Born measure on* σ(*a*) *defined by some unit vector* υ ∈ *H, and define e*<sup>∞</sup> *by* (8.111)*. Let* σ(*a*)<sub>υ</sub><sup>∞</sup> *be the set of all points in* σ(*a*)<sup>∞</sup> *for which* (8.92)*, or, equivalently,* (8.93) *holds (with* μ *replaced by* μ<sub>υ</sub>*). Then*

$$e^{\infty}(\sigma(a)^{\infty}\_{\upsilon}) = 1\_{H\_{\omega\_{\upsilon}^{\infty}}}.\tag{8.112}$$

*Furthermore, if A* ⊆ σ(*a*) *is Borel measurable, then, using the notation* (8.39)*,*

$$\lim\_{N \to \infty} S\_{1,N}(e\_A) = \mu\_\nu(A) \cdot 1\_{H\_{\text{ap}^\omega\_\nu}},\tag{8.113}$$

*in the strong operator topology (i.e., applied to each fixed vector in H*ω<sup>∞</sup> υ *).*

This is the *quantum-mechanical strong law of large numbers*, plus its Borel version. In comparison, the strong law of large numbers or Borel's law of large numbers gives

$$
\mu\_\upsilon^{\infty} (\sigma(a)\_\upsilon^{\infty}) = 1. \tag{8.114}
$$

*Proof.* For any probability measure μ on any σ-finite compact space *X*, the corresponding probability measure μ<sup>∞</sup> on *X*<sup>∞</sup> is characterized by the property

$$\mu^{\infty}\left(A\_1 \times \dots \times A\_M \times A \times \prod\_{M+2}^{\infty} \sigma(a)\right) = \mu(A)\mu^{\infty}\left(A\_1 \times \dots \times A\_M \times \prod\_{M+1}^{\infty} \sigma(a)\right),$$

for any *M* ∈ N and Borel sets *A<sub>i</sub>* ⊆ *X*. The measure ν on σ(*a*)<sup>∞</sup> defined by

$$\nu\left(A\_1 \times \cdots \times A\_M \times \prod\_{M+1}^{\infty} \sigma(a)\right) = \omega\_\upsilon^\infty \left(e\_{A\_1}^{(1)} \cdots e\_{A\_M}^{(M)}\right) \tag{8.115}$$

satisfies the above property for μ = μ<sub>υ</sub> and hence coincides with μ<sub>υ</sub><sup>∞</sup>. In view of this, eqs. (C.196) and (8.114) give

$$
\langle \Omega\_{\omega\_\upsilon^\infty}, e^{\infty} (\sigma(a)\_{\upsilon}^{\infty}) \Omega\_{\omega\_\upsilon^\infty} \rangle = 1. \tag{8.116}
$$

For any projection *e* and any unit vector ψ ∈ *H* in any Hilbert space *H*, the properties ⟨ψ, *e*ψ⟩ = 1, ‖*e*ψ‖ = 1, and *e*ψ = ψ are equivalent. Therefore,

$$e^{\infty}(\sigma(a)^{\infty}\_{\upsilon})\Omega\_{\omega\_\upsilon^\infty}=\Omega\_{\omega\_\upsilon^\infty}.\tag{8.117}$$

Consider a vector ⊗<sub>*n*</sub>ψ<sub>*n*</sub> ∈ *H*<sup>∞</sup><sub>υ</sub>, where only ψ<sub>1</sub>,...,ψ<sub>*K*</sub> possibly differ from υ (*K* < ∞). Noting that by (8.106) - (8.107) the right-hand side of (8.115) may be written as

$$\begin{split} \omega\_\upsilon^\infty \left( e\_{A\_{1}}^{(1)} \cdots e\_{A\_{M}}^{(M)} \right) &= \left\langle \Omega\_{\omega\_\upsilon^\infty}, \pi\_{\omega\_\upsilon^\infty} \left( e\_{A\_{1}}^{(1)} \cdots e\_{A\_{M}}^{(M)} \right) \Omega\_{\omega\_\upsilon^\infty} \right\rangle \\ &= \left\langle \upsilon^{\infty}, (e\_{A\_{1}\upsilon}^{(1)} \otimes \cdots \otimes e\_{A\_{M}\upsilon}^{(M)}) \upsilon^{\infty} \right\rangle, \end{split} \tag{8.118}$$

we modify (8.115) so as to define a new measure ν′ on σ(*a*)<sup>∞</sup> by

$$\nu'\left(A\_1 \times \cdots \times A\_M \times \prod\_{M+1}^{\infty} \sigma(a)\right) = \langle \otimes\_n \psi\_n, (e^{(1)}\_{A\_1 \upsilon} \otimes \cdots \otimes e^{(M)}\_{A\_M \upsilon}) \otimes\_n \psi\_n \rangle.$$

Generalizing the above case of μ<sup>∞</sup>, the measure ν″ = μ<sub>ψ<sub>1</sub></sub> × ··· × μ<sub>ψ<sub>*K*</sub></sub> × ∏<sub>*K*+1</sub><sup>∞</sup> μ<sub>υ</sub> on σ(*a*)<sup>∞</sup> is characterized by the following two properties:

$$\nu''\left(A\_1 \times \cdots \times A\_K \times \prod\_{K+1}^{\infty} \sigma(a)\right) = \mu\_{\psi\_1}(A\_1) \cdots \mu\_{\psi\_K}(A\_K);\tag{8.119}$$

$$\nu''\left(A\_1 \times \dots \times A\_M \times A \times \prod\_{M+2}^{\infty} \sigma(a)\right) = \mu\_\upsilon(A)\,\nu''\left(A\_1 \times \dots \times A\_M \times \prod\_{M+1}^{\infty} \sigma(a)\right) \quad (M > K),\tag{8.120}$$

and hence ν′ = ν″. Therefore, even though ν′ ≠ μ<sub>υ</sub><sup>∞</sup>, we have ν′(σ(*a*)<sup>∞</sup><sub>υ</sub>) = 1, since membership of σ(*a*)<sup>∞</sup><sub>υ</sub> is entirely defined by the tail of the event. Hence we obtain

$$e^{\infty}(\sigma(a)\_{\upsilon}^{\infty}) \otimes\_{n} \psi\_{n} = \otimes\_{n} \psi\_{n},\tag{8.121}$$

by the same reasoning as for υ<sup>∞</sup> ≡ Ω<sub>ω<sub>υ</sub><sup>∞</sup></sub>. Since the linear span of such vectors is dense in *H*<sup>∞</sup><sub>υ</sub> ≡ *H*<sub>ω<sub>υ</sub><sup>∞</sup></sub> and the projection *e*<sup>∞</sup>(σ(*a*)<sup>∞</sup><sub>υ</sub>) is bounded, we obtain (8.112).
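The role of the tail in this argument can be illustrated numerically for binary outcomes. The following Python sketch is only an illustration under stated assumptions: the helper names `running_frequency` and `modify_prefix`, the periodic sample string, and the choice *K* = 5 are all hypothetical and not taken from the text. It checks that altering the first *K* entries of a string changes the relative frequency after *N* outcomes by at most *K*/*N*, so that a limiting frequency, when it exists, does not depend on any finite prefix.

```python
from fractions import Fraction

def running_frequency(bits, n):
    """Relative frequency of the outcome 1 among the first n entries of bits."""
    return Fraction(sum(bits[:n]), n)

def modify_prefix(bits, prefix):
    """Return a copy of bits whose first len(prefix) entries are replaced by prefix."""
    return list(prefix) + list(bits[len(prefix):])

# A periodic string whose frequency of ones tends to 1/3 (illustrative choice).
original = [1, 0, 0] * 2000
K = 5
changed = modify_prefix(original, [1] * K)

# The two running frequencies differ by at most K/N, which tends to 0 as N grows.
gaps = {n: abs(running_frequency(changed, n) - running_frequency(original, n))
        for n in (10, 100, 1000, 6000)}
```

Only a finite string can be handled this way, so the sketch shows the *K*/*N* bound rather than the limit itself.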

To derive (8.113), we use the definition of the Born measure μ<sub>υ</sub> to find

$$\left\| \left( S\_{1,N}(e\_A) - \mu\_\upsilon(A) \right) \upsilon^\infty \right\|^2 = \frac{1}{N} (\mu\_\upsilon(A) - \mu\_\upsilon(A)^2),\tag{8.122}$$

which vanishes as *N* → ∞, so that (8.113) holds on υ<sup>∞</sup>. A similar computation proves (8.113) on vectors ⊗<sub>*n*</sub>ψ<sub>*n*</sub> as above, since the initial *K* terms where possibly ψ<sub>*n*</sub> ≠ υ drop out in the limit *N* → ∞. Thus we have (8.113) on a dense subspace of *H*<sub>ω<sub>υ</sub><sup>∞</sup></sub>. Since the operators *S*<sub>1,*N*</sub>(*e<sub>A</sub>*) are uniformly bounded (each has norm at most 1) and the strong limit operator μ<sub>υ</sub>(*A*) · 1<sub>*H*<sub>ω<sub>υ</sub><sup>∞</sup></sub></sub> is bounded, this proves (8.113). □

An alternative argument shows the mere existence of the limit on the left-hand side of (8.113) on the same dense set, upon which the limit operator is seen to commute with all local and hence (by norm-continuity) with all quasi-local operators. Since ω<sub>υ</sub> is pure, so is ω<sub>υ</sub><sup>∞</sup>, and hence π<sub>ω<sub>υ</sub><sup>∞</sup></sub> is irreducible. Thus the limit is a multiple of the unit, and the coefficient μ<sub>υ</sub>(*A*) then follows from the computation

$$\lim\_{N \to \infty} \langle \upsilon^{\infty}, S\_{1,N}(e\_A)\upsilon^{\infty} \rangle = \mu\_{\upsilon}(A). \tag{8.123}$$

To reduce the level of abstraction and since it is an important case, we now specialize Theorem 8.16 to a two-level system, i.e., *B* = *M*<sub>2</sub>(C). In other words, we take *H* = C<sup>2</sup>, and pick a simple observable *a* = diag(0,1) with non-degenerate spectrum σ(*a*) = <u>2</u> = {0,1}, so that measurement outcomes are just strings of zeros and ones. Furthermore, we take a unit vector υ = *c*<sub>0</sub>|0⟩ + *c*<sub>1</sub>|1⟩, where |0⟩ = (1,0) and |1⟩ = (0,1) form the standard basis of C<sup>2</sup>, and |*c*<sub>0</sub>|<sup>2</sup> + |*c*<sub>1</sub>|<sup>2</sup> = 1. We write *p* = |*c*<sub>1</sub>|<sup>2</sup>. The Born measure μ<sub>υ</sub> on σ(*a*) = {0,1} is then given by μ<sub>υ</sub>({1}) = *p* and μ<sub>υ</sub>({0}) = 1 − *p*; cf. (2.10) - (2.11). Taking *A* = {1}, we have *e<sub>A</sub>* = |1⟩⟨1|. The Hilbert space (C<sup>2</sup>)<sup>∞</sup><sub>υ</sub> is the closure of the finite linear span of vectors of the kind ψ<sub>1</sub> ⊗ ψ<sub>2</sub> ⊗ ··· with ψ<sub>*n*</sub> ∈ C<sup>2</sup> and only finitely many ψ<sub>*n*</sub> possibly different from υ. For *M* ∈ N, the operator |1⟩⟨1|<sup>(*M*)</sup> sends such a vector to ψ<sub>1</sub> ⊗ ψ<sub>2</sub> ⊗ ··· ⊗ (|1⟩⟨1|ψ<sub>*M*</sub>) ⊗ ···, with all ψ<sub>*n*</sub> unaffected except for *n* = *M*. Eqs. (8.112) - (8.113) then simply read

$$e^{\infty}(\underline{2}^{\infty}\_{p}) = 1\_{(\mathbb{C}^{2})^{\infty}\_{\upsilon}}; \tag{8.124}$$

$$\lim\_{N \to \infty} \frac{1}{N} \sum\_{M=1}^{N} |1\rangle\langle 1|^{(M)} = p \cdot 1\_{(\mathbb{C}^2)^{\infty}\_{\upsilon}},\tag{8.125}$$

where <u>2</u><sup>∞</sup><sub>*p*</sub> denotes the set of all infinite binary strings *x*<sub>1</sub>*x*<sub>2</sub> ··· for which *x<sub>i</sub>* ∈ <u>2</u> and

$$\lim\_{N \to \infty} \frac{1}{N} \sum\_{i=1}^{N} x\_i = p,\tag{8.126}$$

and once again the limit in (8.125) is meant strongly, i.e., the expression on the left-hand side must be applied to a fixed vector in (C<sup>2</sup>)<sup>∞</sup><sub>υ</sub>.
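For finite *N* the deviation of the frequency operator from *p* can be computed exactly in this two-level example. The sketch below is a minimal illustration, assuming a rational value of *p* so that exact arithmetic with `fractions.Fraction` applies; the function name `deviation_norm_sq` is hypothetical. It uses two facts: each computational basis vector |*x*<sub>1</sub> ··· *x<sub>N</sub>*⟩ is an eigenvector of *N*<sup>−1</sup> ∑<sub>*M*</sub> |1⟩⟨1|<sup>(*M*)</sup> with eigenvalue *k*/*N*, where *k* is the number of ones, and the squared modulus of its coefficient in υ ⊗ ··· ⊗ υ is *p*<sup>*k*</sup>(1 − *p*)<sup>*N*−*k*</sup>.

```python
from fractions import Fraction
from math import comb

def deviation_norm_sq(N, p):
    """Squared norm of (frequency operator - p) applied to the N-fold product of the unit vector.

    Sums (k/N - p)^2 over all basis strings with k ones, weighted by their
    Born probabilities p^k (1 - p)^(N - k); comb(N, k) counts such strings.
    """
    total = Fraction(0)
    for k in range(N + 1):
        weight = comb(N, k) * p**k * (1 - p) ** (N - k)
        total += weight * (Fraction(k, N) - p) ** 2
    return total

p = Fraction(1, 3)
values = {N: deviation_norm_sq(N, p) for N in (1, 10, 100)}
```

Under these assumptions the result equals *p*(1 − *p*)/*N*, the variance of a binomial frequency. It is nonzero for every finite *N* when 0 < *p* < 1, and it tends to zero as *N* → ∞.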

Theorem 8.16 forms the (mathematical) culmination of attempts that started in the 1960s to derive the Born rule from other postulates of quantum mechanics, notably the so-called *eigenvalue-eigenvector link*, according to which a quantum-mechanical observable has a definite value if and only if the current quantum state is an eigenvector of the associated operator. This link is applied to the state υ<sup>∞</sup> (or to any other state with approximately the same tail) and the operators *e*<sup>∞</sup>(σ(*a*)<sup>∞</sup><sub>υ</sub>) and lim<sub>*N*→∞</sub> *S*<sub>1,*N*</sub>(*e<sub>A</sub>*). The idea, then, is that according to (8.112), the property expressed by the projection *e*<sup>∞</sup>(σ(*a*)<sup>∞</sup><sub>υ</sub>) is certain in the state υ<sup>∞</sup> (for qubits this means that any possible infinite string of binary measurement outcomes has average value *p*). This is reinforced by (8.113), which states that the frequency operator for the outcome *A* has a sharp limit equal to μ<sub>υ</sub>(*A*) (for qubits, with *A* = {1}, this limit is *p*).

However, although the mathematics is suggestive, apart from the fact that the eigenvalue-eigenvector link itself falls prey to Earman's Principle (in that sharp eigenvalues and eigenvectors are an idealization in a world full of continuous spectra), this particular application of the link makes sense only *at N* = ∞. In this respect, eq. (8.124) has the same drawback as the strong law of large numbers (on which its derivation indeed relies), including the fact that attempts to define probabilities through (8.113) or its special case (8.125) are inherently circular. Moreover, υ<sup>∞</sup> fails to be an eigenvector of any finite-*N* approximant to (8.125), and by the same token, the limit operator defined by (8.125) can only be measured via its individual contributions |1⟩⟨1|<sup>(*M*)</sup>, none of which has υ<sup>∞</sup> as an eigenvector; in fact, it can be shown that any joint eigenvector of all projections |1⟩⟨1|<sup>(*M*)</sup> is orthogonal to the entire space (C<sup>2</sup>)<sup>∞</sup><sub>υ</sub> within the complete infinite tensor product (C<sup>2</sup>)<sup>∞</sup>.

Problems with Earman's Principle are avoided if we use Theorem 8.4 (applied to *B* = *B*(*H*)) rather than Theorem 8.16: the sequence of operators *S*1,*N*(*eA*) forms a continuous section of the continuous bundle of C\*-algebras with fibers (8.50) - (8.51), whose limit at *N* = ∞, in the sense of (8.46) or (C.560), is given by

$$S\_{1,\infty}(e\_A) : \omega \mapsto \omega(e\_A);\tag{8.127}$$

recall that *S*1,∞(*eA*) ∈ *C*(*S*(*B*(*H*))). In particular, for pure states ω = ωυ we obtain the Born probability μυ (*A*). As we have also seen in the commutative case, this limit avoids infinite idealizations and other problems with the law of large numbers.

From the point of view of (asymptotic) Bohrification, *C*(*S*(*B*(*H*))) provides a classical description of a long run of identical experiments, which becomes increasingly accurate as *N* → ∞; this is the whole point of the limits (8.46) and (C.560). In particular, the unsound eigenvalue-eigenvector link has been replaced by the role of points ω ∈ *S*(*B*(*H*)) as truthmakers, which is uncontroversial in classical physics. If the quantum state in each identical experiment on the given (single) system is ω, then the above derivation shows that in the limit *N* → ∞, this state acquires a classical meaning (which according to Bohr would even be the *only* meaning it has), namely as the point in the "classical phase space" *S*(*B*(*H*)) that gives the relative frequencies of outcomes of the given long runs of identical experiments. Short of deriving the Born rule, this at least provides the reasoning that links the Born measure (which is canonically given by the theory) to experiment.

#### 8.5 Quantum spin systems: Quasi-local C\*-algebras

Besides the Born rule, our second application of the previous formalism is to *quantum spin systems*, especially to spontaneous symmetry breaking (SSB), see Chapter 10. Postponing a conceptual discussion of infinite systems in their role of idealizations of finite systems to the preamble of that chapter, for the moment we just describe infinite quantum spin systems mathematically. As in §C.14, we take a Hilbert space *H*, here assumed *finite-dimensional*, i.e., *H* ≅ C<sup>*n*</sup>, and use the standard lattice Z<sup>*d*</sup> ⊂ R<sup>*d*</sup> in dimension *d*. For any *finite* subset Λ ⊂ Z<sup>*d*</sup>, i.e., Λ ∈ P<sub>*f*</sub>(Z<sup>*d*</sup>), we put

$$H\_{\Lambda} = \otimes\_{x \in \Lambda} H\_{x};\tag{8.128}$$

$$A\_{\Lambda} = B(H\_{\Lambda}) \cong \otimes\_{x \in \Lambda} B(H\_{x}),\tag{8.129}$$

where *H<sub>x</sub>* = *H* for each *x* ∈ Λ, cf. (C.297) and (C.303). The symbolic notations

$$A = \otimes\_{x \in \mathbb{Z}^d} B(H) = \varinjlim\_{\Lambda \in \mathcal{P}\_f(\mathbb{Z}^d)} A\_{\Lambda} = \overline{\bigcup\_{\Lambda \in \mathcal{P}\_f(\mathbb{Z}^d)} A\_{\Lambda}}^{\|\cdot\|}, \tag{8.130}$$

all come down to the same thing—see §C.14, notably (C.323) and (C.317)—and define a *quasi-local C\*-algebra*. Elements of each *A*<sup>Λ</sup> ⊂ *A* are called *local observables*, those in the closure of their union are referred to as *quasi-local observables*.

Eq. (8.129) defines a map Λ ↦ *A*<sub>Λ</sub>, which has three important properties:

$$A\_{\Lambda^{(1)}} \subseteq A\_{\Lambda^{(2)}} \quad \text{if} \quad \Lambda^{(1)} \subseteq \Lambda^{(2)} \text{ (Isotony)}; \tag{8.131}$$

$$[A\_{\Lambda^{(1)}}, A\_{\Lambda^{(2)}}] = 0 \quad \text{if} \quad \Lambda^{(1)} \cap \Lambda^{(2)} = \emptyset \text{ (Einstein locality)};\tag{8.132}$$

$$A\_{\Lambda}^{\prime} = A\_{\Lambda^{\prime}} \text{ (Haag duality)},\tag{8.133}$$

where *A*′<sub>Λ</sub> in (8.133) is the commutant of *A*<sub>Λ</sub> within *A*, and, in cute notation, we put Λ′ = Z<sup>*d*</sup>\Λ (which is infinite), so that the right-hand side of (8.133) denotes

$$A\_{\Lambda'} = \otimes\_{x \in \Lambda'} B(H) = \overline{\bigcup\_{\Lambda^{(1)} \in \mathcal{P}\_f(\mathbb{Z}^d \backslash \Lambda)} A\_{\Lambda^{(1)}}}^{\|\cdot\|},\tag{8.134}$$

which is a C\*-subalgebra of *A*. Since Λ<sup>(2)</sup> ⊂ Z<sup>*d*</sup>\Λ<sup>(1)</sup> whenever Λ<sup>(1)</sup> ∩ Λ<sup>(2)</sup> = ∅, Haag duality implies Einstein locality (and sharpens it), but it is still worth mentioning these properties separately: although in quantum spin systems (8.133), and hence (8.132), holds, Einstein locality is a more fundamental property (e.g. it is also valid in algebraic quantum field theory, where Haag duality may well fail).
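Einstein locality can be checked by hand in the smallest case, a lattice region consisting of two sites with *H* = C<sup>2</sup> at each site. The following Python sketch is an illustration only: the helpers `kron`, `matmul` and `commutator` and the choice of Pauli matrices as sample observables are assumptions made for this example and do not come from the text. It embeds one-site operators into *B*(*H* ⊗ *H*) by tensoring with the unit matrix and compares the commutators for disjoint and for coinciding sites.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def matmul(a, b):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def commutator(a, b):
    """Commutator [a, b] = ab - ba of two square matrices of equal size."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(ab))] for i in range(len(ab))]

UNIT = [[1, 0], [0, 1]]
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Z = [[1, 0], [0, -1]]

x_at_site_0 = kron(SIGMA_X, UNIT)   # acts on the first tensor factor only
z_at_site_1 = kron(UNIT, SIGMA_Z)   # acts on the second tensor factor only
z_at_site_0 = kron(SIGMA_Z, UNIT)   # acts on the first tensor factor only

disjoint_sites = commutator(x_at_site_0, z_at_site_1)
same_site = commutator(x_at_site_0, z_at_site_0)
```

Here `disjoint_sites` is the zero matrix, as (8.132) requires, whereas `same_site` is not, since the Pauli matrices at a single site do not commute.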

We now discuss some C\*-algebraic concepts that will be needed for the analysis of SSB. Through the associated GNS-representation πω : *A* → *B*(*H*ω), any state ω on *A* defines two interesting subalgebras of *B*(*H*ω), which *a priori* may be different:


Recall that the center of a von Neumann algebra *M* ⊂ *B*(*H*) is *M* ∩ *M*′, and that *M* is called a factor if *M* ∩ *M*′ = C · 1 (cf. §C.21), so *A*<sup>*c*</sup><sub>ω</sub> is the center of the von Neumann algebra π<sub>ω</sub>(*A*)″. It is easy to show from Einstein locality that *A*<sup>∞</sup><sub>ω</sub> ⊆ *A*<sup>*c*</sup><sub>ω</sub>. If each local algebra *A*<sub>Λ</sub> is simple, Haag duality yields the opposite inclusion, so in that case,

$$A\_{\omega}^{\infty} = A\_{\omega}^{c}.\tag{8.135}$$

Given (8.129), this applies as long as dim(*H*) < ∞, in which case also *A* is simple.

The algebra at infinity provides a new perspective on the macroscopic observables in §8.2. Averages like |Λ|<sup>−1</sup> ∑<sub>*x*∈Λ</sub> *b*(*x*), where *b* ∈ *B*(*H*), do not have a limit in *A* as Λ ↑ Z<sup>*d*</sup>, but (depending on ω) their representatives |Λ|<sup>−1</sup> ∑<sub>*x*∈Λ</sub> π<sub>ω</sub>(*b*(*x*)) may have a weak limit in *B*(*H*<sub>ω</sub>). If they do, Einstein locality implies that the limit operator lies in the algebra at infinity *A*<sup>∞</sup><sub>ω</sub> (and hence, assuming (8.135), in *A*<sup>*c*</sup><sub>ω</sub>). If the algebra at infinity is trivial (i.e. C · 1<sub>*H*<sub>ω</sub></sub>), macroscopic observables are therefore "*c*-numbers", i.e., multiples of the unit operator. In particular, they do not fluctuate, which is among the defining properties of *pure* thermodynamic phases. Formally, this idea is captured by the following generalization of the notion of a pure state:

Definition 8.17. *A representation* π(*A*) *is* primary *if* π(*A*)″ ∩ π(*A*)′ *is trivial. A state* ω ∈ *S*(*A*) *is* primary *if the* GNS*-representation* π<sub>ω</sub> *is primary.*

For compact groups *G* (or rather their group C\*-algebras *C*<sup>∗</sup>(*G*)), all representations are completely reducible, and a representation is primary iff it is a (possibly infinite) multiple of some irreducible representation. However, this is not the right picture for general groups or C\*-algebras, which requires some discussion. In preparation, we call some representation π′(*A*) on a Hilbert space *H*′ ⊂ *H* a *subrepresentation* of a representation π(*A*) on *H*, written π′ ⊂ π, if π′ = π|<sub>*H*′</sub>. Subrepresentations π′ of π correspond to projections *e* ∈ π(*A*)′, such that π′(*a*) = *e*π(*a*). It follows that π<sub>1</sub>(*A*) and π<sub>2</sub>(*A*) have equivalent subrepresentations iff there exists a nonzero partial isometry *w* : *H*<sub>1</sub> → *H*<sub>2</sub> such that *w*π<sub>1</sub>(*a*) = π<sub>2</sub>(*a*)*w* for all *a* ∈ *A*.

Definition 8.18. *Two representations* π<sup>1</sup> *and* π<sup>2</sup> *of a C\*-algebra A are called:*


*We say that two states* ω<sub>1</sub> *and* ω<sub>2</sub> *on A are equivalent, disjoint, or quasi-equivalent if the corresponding* GNS*-representations* π<sub>ω<sub>1</sub></sub> *and* π<sub>ω<sub>2</sub></sub> *have the said property.*

In other words, π<sub>1</sub> and π<sub>2</sub> are quasi-equivalent iff π<sub>1</sub> has no subrepresentations disjoint from π<sub>2</sub>, and *vice versa*. This, in turn, is equivalent to the property that the set of π<sub>*i*</sub>-*normal states* on *A*, i.e. states of the form *a* ↦ Tr(ρπ<sub>*i*</sub>(*a*)) with ρ ∈ D(*H<sub>i</sub>*), is the same for *i* = 1 as it is for *i* = 2. At the opposite extreme, π<sub>1</sub> and π<sub>2</sub> are disjoint iff no state exists that is both π<sub>1</sub>-normal and π<sub>2</sub>-normal. For example, taking *A* = *C*(*X*), in which case states are probability measures μ on *X*, equivalence and disjointness of states recovers the usual notions of equivalence and disjointness of measures, respectively (i.e., having the same null sets and having disjoint supports).
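The commutative example can be made concrete on a finite set *X*, where a probability measure is just a table of weights. The Python sketch below is a toy illustration under that assumption; the function names `support`, `equivalent` and `disjoint` and the four sample measures are hypothetical. On a finite set, having the same null sets amounts to having equal supports.

```python
from fractions import Fraction

def support(mu):
    """Points of X carrying strictly positive weight."""
    return {x for x, weight in mu.items() if weight > 0}

def equivalent(mu, nu):
    """Same null sets; on a finite set this means equal supports."""
    return support(mu) == support(nu)

def disjoint(mu, nu):
    """Disjoint supports, i.e. the measures live on separate parts of X."""
    return support(mu).isdisjoint(support(nu))

half, quarter = Fraction(1, 2), Fraction(1, 4)
mu = {"a": half, "b": half, "c": 0}
nu = {"a": quarter, "b": 3 * quarter, "c": 0}
rho = {"a": 0, "b": 0, "c": 1}
mixed = {"a": half, "b": 0, "c": half}

relations = {
    "mu~nu": equivalent(mu, nu),
    "mu_disjoint_rho": disjoint(mu, rho),
    "mu~mixed": equivalent(mu, mixed),
    "mu_disjoint_mixed": disjoint(mu, mixed),
}
```

The pair `mu`, `mixed` is neither equivalent nor disjoint, which shows that the two notions are not complementary.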

Proposition 8.19. *For any state* ω*, if* ω = *t*ω<sub>1</sub> + (1−*t*)ω<sub>2</sub> *for some t* ∈ (0,1)*, then* ω<sub>1</sub> *and* ω<sub>2</sub> *are disjoint iff there is a projection e* ∈ *A*<sup>*c*</sup><sub>ω</sub> = π<sub>ω</sub>(*A*)″ ∩ π<sub>ω</sub>(*A*)′ *such that*

$$
\pi\_{\omega}(A)\_{|eH\_{\omega}} \cong \pi\_{\omega\_1}(A);\tag{8.136}
$$

$$
\pi\_{\omega}(A)\_{|e^\perp H\_{\omega}} \cong \pi\_{\omega\_2}(A). \tag{8.137}
$$

Subrepresentations of π<sub>ω</sub>(*A*) always correspond to projections *e* ∈ π<sub>ω</sub>(*A*)′; the key assumption being made here is that *e* also lies in the weak closure π<sub>ω</sub>(*A*)″.

*Proof.* One direction is easy: if (8.136) - (8.137) hold, then (arguing by contradiction) equivalent subrepresentations π<sub>1</sub>(*A*) of π<sub>ω<sub>1</sub></sub>(*A*) and π<sub>2</sub>(*A*) of π<sub>ω<sub>2</sub></sub>(*A*) are given by projections *e*<sub>1</sub> ≤ *e* and *e*<sub>2</sub> ≤ *e*<sup>⊥</sup> = 1<sub>*H*<sub>ω</sub></sub> − *e*, respectively, through

$$
\pi\_i(a) = \pi\_{\omega}(a)\_{|e\_i H\_{\omega}}, \ (i = 1, 2, a \in A), \tag{8.138}
$$

and the partial isometry *w* on *H*<sub>ω</sub> whose restriction to *e*<sub>1</sub>*H*<sub>ω</sub> implements a (unitary) equivalence between π<sub>1</sub>(*A*) and π<sub>2</sub>(*A*) by definition satisfies *w*<sup>∗</sup>*w* = *e*<sub>1</sub>, *ww*<sup>∗</sup> = *e*<sub>2</sub>. Moreover, *e*<sub>1</sub> ≤ *e* implies *we* = *w* and *e*<sub>2</sub> ≤ *e*<sup>⊥</sup> implies *e*<sup>⊥</sup>*w* = *w*, which together give *e*<sup>⊥</sup>*we* = *w*. Furthermore, again by definition, *w* ∈ π<sub>ω</sub>(*A*)′. If now *e* ∈ π<sub>ω</sub>(*A*)″, then *we* = *ew*. Combining these equalities gives *w* = *e*<sup>⊥</sup>*ew* = 0, which is the desired contradiction.

Lemma 8.20. *For any functional* ω′ ∈ *A*<sup>∗</sup> *such that* 0 ≤ ω′ ≤ ω*, where* ω ∈ *S*(*A*)*, there is an operator c* ∈ π<sub>ω</sub>(*A*)′ *on H*<sub>ω</sub> *such that* 0 ≤ *c* ≤ 1<sub>*H*</sub> *and*

$$
\omega'(a) = \langle \Omega\_{\omega}, c\pi\_{\omega}(a)\Omega\_{\omega} \rangle \ (a \in A). \tag{8.139}
$$

*In particular, there is a vector* ξ ∈ *H*<sub>ω</sub> *such that*

$$
\omega'(a) = \langle \xi, \pi\_{\omega}(a)\xi \rangle\_{H\_{\omega}}.\tag{8.140}
$$

*Proof.* Cauchy–Schwarz for the positive semidefinite form ⟨*a*,*b*⟩ = ω′(*a*<sup>∗</sup>*b*) gives

$$|\omega'(a^\*b)|^2 \le \omega'(a^\*a)\omega'(b^\*b) \le \omega(a^\*a)\omega(b^\*b) = \|\pi\_{\omega}(a)\Omega\_{\omega}\|^2\|\pi\_{\omega}(b)\Omega\_{\omega}\|^2.$$

Hence we obtain a well-defined positive quadratic form *B* on *H*<sub>ω</sub>, initially defined on the dense domain π<sub>ω</sub>(*A*)Ω<sub>ω</sub> × π<sub>ω</sub>(*A*)Ω<sub>ω</sub> by the formula

$$B(\pi\_{\omega}(a)\Omega\_{\omega}, \pi\_{\omega}(b)\Omega\_{\omega}) = \omega'(a^\*b),\tag{8.141}$$

and extended to *H*<sub>ω</sub> × *H*<sub>ω</sub> by continuity; the above inequality immediately gives |*B*(ϕ,ψ)| ≤ ‖ϕ‖‖ψ‖, and hence Proposition B.79 yields an operator 0 ≤ *c* ≤ 1<sub>*H*</sub> such that *B*(ϕ,ψ) = ⟨ϕ, *c*ψ⟩. With (8.141), this gives (8.139). We now compute

$$\begin{split} \omega'(a^\*b^\*d) &= B(\pi\_{\omega}(ba)\Omega\_{\omega}, \pi\_{\omega}(d)\Omega\_{\omega}) = \langle \pi\_{\omega}(a)\Omega\_{\omega}, \pi\_{\omega}(b^\*)c\pi\_{\omega}(d)\Omega\_{\omega} \rangle \\ &= B(\pi\_{\omega}(a)\Omega\_{\omega}, \pi\_{\omega}(b^\*d)\Omega\_{\omega}) = \langle \pi\_{\omega}(a)\Omega\_{\omega}, c\pi\_{\omega}(b^\*)\pi\_{\omega}(d)\Omega\_{\omega} \rangle, \end{split}$$

so that [*c*,π<sub>ω</sub>(*b*<sup>∗</sup>)] = 0 for each *b* ∈ *A*, i.e., *c* ∈ π<sub>ω</sub>(*A*)′. Writing *c* = *c*<sub>1</sub><sup>2</sup> with *c*<sub>1</sub><sup>∗</sup> = *c*<sub>1</sub>, and then ξ = *c*<sub>1</sub>Ω<sub>ω</sub>, completes the proof. □

We continue the proof of Proposition 8.19 in the converse direction. Assume

$$
\omega = t \omega\_1 + (1 - t) \omega\_2 = \omega\_1' + \omega\_2',\tag{8.142}
$$

with ω′<sub>1</sub> = *t*ω<sub>1</sub> and ω′<sub>2</sub> = (1−*t*)ω<sub>2</sub>, so that 0 ≤ ω′<sub>1</sub> ≤ ω and 0 ≤ ω′<sub>2</sub> ≤ ω. It follows from the first claim in Lemma 8.20 that there is *c* ∈ *B*(*H*<sub>ω</sub>) as stated such that

$$
\omega\_1'(a) = \langle \Omega\_{\omega}, c\pi\_{\omega}(a)\Omega\_{\omega}\rangle;\tag{8.143}
$$

$$
\omega\_2'(a) = \langle \Omega\_{\omega}, (1\_{H\_{\omega}} - c)\pi\_{\omega}(a)\Omega\_{\omega}\rangle,\tag{8.144}
$$

where (8.144) follows from (8.143), (C.196), and ω = ω′<sub>1</sub> + ω′<sub>2</sub>. Define ω′ ∈ *A*<sup>∗</sup> by

$$
\omega'(a) = \langle \Omega\_{\omega}, c(1\_{H\_{\omega}} - c)\pi\_{\omega}(a)\Omega\_{\omega}\rangle. \tag{8.145}
$$

We have 0 ≤ ω′ ≤ ω′<sub>1</sub> (since *c*(1<sub>*H*<sub>ω</sub></sub> − *c*) ≤ *c*) as well as 0 ≤ ω′ ≤ ω′<sub>2</sub> (since also *c*(1<sub>*H*<sub>ω</sub></sub> − *c*) ≤ 1<sub>*H*<sub>ω</sub></sub> − *c*). Now assume that ω<sub>1</sub> and ω<sub>2</sub> are disjoint. Applying (8.140) with ω replaced by ω<sub>*i*</sub> shows that ω′ is π<sub>ω<sub>1</sub></sub>-normal as well as π<sub>ω<sub>2</sub></sub>-normal, so that it follows from the remarks following Definition 8.18 that ω′ = 0. Since Ω<sub>ω</sub> is cyclic for π<sub>ω</sub>(*A*) by the GNS-construction, this implies *c*(1<sub>*H*<sub>ω</sub></sub> − *c*) = 0, and hence *c*<sup>2</sup> = *c*. Since *c* ≥ 0, which implies *c*<sup>∗</sup> = *c*, it follows that *c* is a projection, henceforth called *e*. Therefore,

$$\omega\_1(a) = \langle \Omega\_{\omega}, e\pi\_{\omega}(a)\Omega\_{\omega}\rangle / \|e\Omega\_{\omega}\|^2;\tag{8.146}$$

$$\omega\_2(a) = \langle \Omega\_{\omega}, e^\perp \pi\_{\omega}(a) \Omega\_{\omega} \rangle / \|e^\perp \Omega\_{\omega}\|^2,\tag{8.147}$$

where *t* = ‖*e*Ω<sub>ω</sub>‖<sup>2</sup>. We see from these formulae and Proposition C.91 that π<sub>ω<sub>1</sub></sub> and π<sub>ω<sub>2</sub></sub> are equivalent to the restrictions of π<sub>ω</sub> to *eH*<sub>ω</sub> and *e*<sup>⊥</sup>*H*<sub>ω</sub>, respectively; under this equivalence, the cyclic vectors Ω<sub>ω<sub>1</sub></sub> and Ω<sub>ω<sub>2</sub></sub> correspond with *e*Ω<sub>ω</sub>/‖*e*Ω<sub>ω</sub>‖ and *e*<sup>⊥</sup>Ω<sub>ω</sub>/‖*e*<sup>⊥</sup>Ω<sub>ω</sub>‖, respectively. Since *e* ∈ π<sub>ω</sub>(*A*)′ by Lemma 8.20, it only remains to be shown that *e* ∈ π<sub>ω</sub>(*A*)″. To this effect, for any *b* ∈ π<sub>ω</sub>(*A*)′ and ψ ∈ *H*<sub>ω</sub>, define ω″ ∈ *A*<sup>∗</sup> by

$$\omega''(a) = \langle e^{\perp}be\psi, \pi\_{\omega}(a)e^{\perp}be\psi\rangle. \tag{8.148}$$

Then ω″ is positive, as well as π<sub>ω<sub>2</sub></sub>-normal, the latter because of the presence of the projection *e*<sup>⊥</sup> and (8.147). But for *a* ∈ *A*<sup>+</sup> we have the inequalities

$$0 \le \omega''(a) \le \|e^\perp b\|^2 \langle e\psi, \pi\_{\omega}(a)e\psi \rangle,\tag{8.149}$$

so that, up to the constant ‖*e*<sup>⊥</sup>*b*‖<sup>2</sup>, 0 ≤ ω″ ≤ ω″<sub>1</sub> for the state (assuming *e*ψ is a unit vector)

$$\omega\_1''(a) = \langle \psi, e\pi\_{\omega}(a)e\psi \rangle. \tag{8.150}$$

Since *e*ψ ∈ *eH*<sub>ω</sub>, the latter state is π<sub>ω<sub>1</sub></sub>-normal, so that ω″ is itself π<sub>ω<sub>1</sub></sub>-normal by Lemma 8.20 (which argument by now should sound familiar). Again invoking disjointness of ω<sub>1</sub> and ω<sub>2</sub>, it follows that ω″ = 0, which, since ψ was arbitrary, in turn yields *e*<sup>⊥</sup>*be* = 0 for any *b* ∈ π<sub>ω</sub>(*A*)′. This forces *e* ∈ π<sub>ω</sub>(*A*)″. □

The first of the following corollaries to Proposition 8.19 is *Hepp's Lemma:*

Lemma 8.21. *Let* π : *A* → *B*(*H*) *be a representation of A, and let* ψ<sub>1</sub>,ψ<sub>2</sub> *be unit vectors in H. Then the vector states* ω<sub>*i*</sub>(*a*) = ⟨ψ<sub>*i*</sub>,π(*a*)ψ<sub>*i*</sub>⟩ *(i* = 1,2*) are disjoint iff*

$$
\langle \psi\_1, \pi(a)\psi\_2 \rangle = 0 \ (a \in A). \tag{8.151}
$$

*Proof.* Take, for example, ω = ½(ω<sub>1</sub> + ω<sub>2</sub>) in Proposition 8.19. □

Corollary 8.22. *1. Two primary states are either disjoint or quasi-equivalent. 2. A state is primary iff it has no convex decomposition into* disjoint *states.*

Recall that a state is pure if it has no nontrivial convex decomposition *whatsoever*. The analogy between pure states and primary states may be completed as follows:


A physical property of primary states is that the corresponding correlation functions have a clustering property of a kind that may even be experimentally accessible:

Theorem 8.23. *A state* ω *on a quasi-local C\*-algebra A* (8.130) *has trivial algebra at infinity, i.e., A*<sup>∞</sup><sub>ω</sub> = C · 1*, iff it is* clustering*, in the following sense: for each a* ∈ *A and* ε > 0 *there is a finite* Λ ⊂ Z<sup>*d*</sup> *such that for all b* ∈ *A*<sub>Λ′</sub> *with* ‖*b*‖ = 1 *one has*

$$|\omega(ab) - \omega(a)\omega(b)| \le \varepsilon. \tag{8.152}$$

*In particular, if* ω *is primary, then it is clustering and hence* (8.152) *holds.*

*Proof.* The complete proof is quite technical, but the main idea is as follows. Choose finite regions Λ<sub>*n*</sub> moving to infinity (i.e., eventually avoiding any given Λ), and pick elements *c<sub>n</sub>* ∈ *A*<sub>Λ<sub>*n*</sub></sub> with ‖*c<sub>n</sub>*‖ = 1. The sequence (π<sub>ω</sub>(*c<sub>n</sub>*)) in *B*(*H*<sub>ω</sub>) has a weakly convergent subsequence with limit *c* ∈ *B*(*H*<sub>ω</sub>). This follows from the Banach–Alaoglu Theorem B.48, applied to *B*(*H*<sub>ω</sub>) seen as the dual space of *B*<sub>1</sub>(*H*<sub>ω</sub>): on the unit ball, the corresponding weak<sup>∗</sup>-topology on *B*(*H*<sub>ω</sub>) coincides with the weak operator topology, so that the unit ball in *B*(*H*<sub>ω</sub>) is weakly compact and the theorem applies.


Hence *c* ∈ *A*<sup>*c*</sup><sub>ω</sub>, and by a more refined argument (which is unnecessary if *A*<sup>∞</sup><sub>ω</sub> = *A*<sup>*c*</sup><sub>ω</sub>), even *c* ∈ *A*<sup>∞</sup><sub>ω</sub>. So if *A*<sup>∞</sup><sub>ω</sub> = C · 1 we have *c* = ⟨Ω<sub>ω</sub>, *c*Ω<sub>ω</sub>⟩ · 1. On the other hand,

$$
\langle \Omega\_{\omega}, c\Omega\_{\omega} \rangle = \lim\_{n} \langle \Omega\_{\omega}, \pi\_{\omega}(c\_n)\Omega\_{\omega} \rangle = \lim\_{n} \omega(c\_n),
$$

so that we may compute:

$$\lim\_{n} \omega(ac\_n) = \lim\_{n} \langle \Omega\_{\omega}, \pi\_{\omega}(a)\pi\_{\omega}(c\_n)\Omega\_{\omega}\rangle = \langle \Omega\_{\omega}, \pi\_{\omega}(a)c\Omega\_{\omega}\rangle = \omega(a)\lim\_{n} \omega(c\_n).$$

Thus for any ε > 0 there is an *N* such that |ω(*ac<sub>n</sub>*) − ω(*a*)ω(*c<sub>n</sub>*)| ≤ ε for all *n* > *N*. To derive (8.152) from this, an easy *reductio ad absurdum* argument suffices.

The converse direction follows from Kaplansky's Density Theorem C.131. □
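A simple way to see what clustering excludes is to compare a product state with a mixture of two different product states on a chain of two-level systems. The Python sketch below is an illustration only: all names (`expect`, `product_state`, `mixture`, `truncated_correlation`) are hypothetical, a state is modelled merely by its values on products of one-site observables at distinct sites, and the observable is the Pauli matrix σ<sub>z</sub>. Under these assumptions it evaluates ω(*ab*) − ω(*a*)ω(*b*) for *a* and *b* localized at two far-apart sites.

```python
from fractions import Fraction

def expect(vec, op):
    """<vec, op vec> for a real 2x2 matrix op and a real unit vector vec."""
    return sum(vec[i] * op[i][j] * vec[j] for i in range(2) for j in range(2))

SIGMA_Z = [[1, 0], [0, -1]]
UP, DOWN = [1, 0], [0, 1]

def product_state(vec):
    """Product state: a product of one-site observables gets the product of one-site expectations."""
    def omega(ops_by_site):
        value = 1
        for op in ops_by_site.values():
            value *= expect(vec, op)
        return value
    return omega

def mixture(t, omega1, omega2):
    """Convex combination t*omega1 + (1 - t)*omega2 of two states."""
    return lambda ops_by_site: t * omega1(ops_by_site) + (1 - t) * omega2(ops_by_site)

def truncated_correlation(omega, x, y, a, b):
    """omega(ab) - omega(a)omega(b) for a at site x and b at a different site y."""
    return omega({x: a, y: b}) - omega({x: a}) * omega({y: b})

all_up = product_state(UP)
all_down = product_state(DOWN)
mixed = mixture(Fraction(1, 2), all_up, all_down)

corr_product = truncated_correlation(all_up, 0, 1000, SIGMA_Z, SIGMA_Z)
corr_mixed = truncated_correlation(mixed, 0, 1000, SIGMA_Z, SIGMA_Z)
```

The product state gives correlation 0 at any separation, in line with (8.152), whereas the mixture gives correlation 1 however far apart the two sites are, so it cannot be clustering.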

#### 8.6 Quantum spin systems: Bundles of C\*-algebras

In this section we reformulate the theory of quantum spin systems in the continuous C\*-bundle language of §8.2. First, for each *N* ∈ N we define Λ<sub>*N*</sub> ∈ P<sub>*f*</sub>(Z<sup>*d*</sup>) by

$$\Lambda\_N = \{ x \in \mathbb{Z}^d \mid \|x\| \le N \}. \tag{8.153}$$

We then have the following analogue of the continuous bundle of C\*-algebras *A*<sup>(*q*)</sup> of Theorem 8.8. The base space remains *I* = 1/Ṅ ⊂ [0,1], where Ṅ = {1,2,...,∞} (seen as possible values of 1/ħ), and the fibers are given by

$$A\_0 = A = \varinjlim\_{N} A\_{\Lambda\_N} = \overline{\bigcup\_{N \in \mathbb{N}} A\_{\Lambda\_N}}^{\|\cdot\|};\tag{8.154}$$

$$A\_{1/N} = A\_{\Lambda\_N} = B(H\_{\Lambda\_N}) \ (N \in \mathbb{N}),\tag{8.155}$$

cf. (8.128) - (8.130), still assuming dim(*H*) < ∞. As before, the topology of this bundle is defined through its continuous cross-sections (*a*<sub>1/*N*</sub>)<sub>*N*∈Ṅ</sub>, which are the analogues of the quasi-local sequences of Definition 8.7. Given (8.154) - (8.155), each fiber algebra *A*<sub>1/*N*</sub> is a subalgebra of *A*<sub>0</sub>, and some sequence (*a*<sub>1/*N*</sub>)<sub>*N*∈Ṅ</sub> simply defines a continuous cross-section of the bundle iff within *A* (i.e. in norm) we have

$$\lim\_{N \to \infty} a\_{1/N} = a\_0. \tag{8.156}$$

In other words, a sequence (*a*<sub>1/*N*</sub>)<sub>*N*∈N</sub> with *a*<sub>1/*N*</sub> ∈ *A*<sub>1/*N*</sub> ⊂ *A* is quasi-local in the sense of Definition 8.7 iff it converges in *A* (i.e., iff it is Cauchy in the norm of *A*).

The continuous bundle of Theorem 8.4 makes equally good sense for quantum spin systems. First, with *B* = *B*(*H*) ≅ *M<sub>n</sub>*(C), the fibers are obviously given by

$$A\_0^{(c)} = C(S(B(H)));\tag{8.157}$$

$$A\_{1/N}^{(c)} = \mathcal{B}(H\_{\Lambda\_N}).\tag{8.158}$$

Second, the continuous sections are once again specified via symmetrization maps

$$S\_{M,N}: B(H\_{\Lambda\_M}) \to B(H\_{\Lambda\_N}),\tag{8.159}$$

defined similarly to (8.39), namely via canonical symmetrizers

$$S\_N: B(H\_{\Lambda\_N}) \to B(H\_{\Lambda\_N}) \tag{8.160}$$

that are defined à la (8.35) - (8.36), where this time the tensor product and ensuing permutation in (8.35) are over all sites *x* ∈ Λ<sub>*N*</sub>. Regarding *a*<sub>1/*M*</sub> ∈ *B*(*H*<sub>Λ<sub>*M*</sub></sub>) as an element *a*′<sub>1/*M*</sub> of *B*(*H*<sub>Λ<sub>*N*</sub></sub>) via the embedding *A*<sub>Λ<sub>*M*</sub></sub> ↪ *A*<sub>Λ<sub>*N*</sub></sub>, we finally define *S*<sub>*M*,*N*</sub> by

$$\mathcal{S}\_{M,N}(a\_{1/M}) = \mathcal{S}\_N(a'\_{1/M}).\tag{8.161}$$

Symmetric and quasi-symmetric sequences may then be defined exactly as in Definitions 8.2 and 8.3; each quasi-symmetric sequence (*a*1/*N*)*N*∈<sup>N</sup> duly has a limit *<sup>a</sup>*<sup>0</sup> <sup>∈</sup> *<sup>A</sup>*(*c*) <sup>0</sup> given by (8.46), where ω*<sup>N</sup>* is defined as in (8.47), once again with a tensor product over all sites *x* ∈ Λ*N*. By definition, the continuous sections of the bundle (8.157) - (8.158) are then given by the quasi-symmetric sequences.

Although the fibers $A$ in (8.154) and $C(S(B(H)))$ in (8.157) are as wide apart as they could possibly be, they stunningly arise as limit algebras at $\hbar = 0$ (i.e., $N = \infty$ or $\Lambda = \mathbb{Z}^d$) for the same fiber algebras (8.155) and (8.158) at $\hbar > 0$ (i.e., $N < \infty$ or $\Lambda \in \mathcal{P}\_f(\mathbb{Z}^d)$). As in §8.2, the difference lies in the choice of the topology on the bundle, defined via the continuous sections, which in the first case are the quasi-local sequences, and in the second are the quasi-symmetric (i.e., macroscopic) ones.

An interesting connection between these bundles can be obtained via the following concept, which in a way justifies the introduction of the bundles themselves.

Definition 8.24. *A* continuous field of states *on a continuous bundle of C\*-algebras with fibers* $(A\_{1/N})\_{N \in \dot{\mathbb{N}}}$ *is a family* $(\omega\_{1/N})\_{N \in \dot{\mathbb{N}}}$ *where*

$$\omega\_{1/N} \in S(A\_{1/N});\tag{8.162}$$

$$\lim\_{N \to \infty} \omega\_{1/N}(a\_{1/N}) = \omega\_0(a\_0),\tag{8.163}$$

*for each continuous cross-section* $(a\_{1/N})$*. In that case, we write*

$$\omega\_0 = \lim\_{N \to \infty} \omega\_{1/N},\tag{8.164}$$

*despite the fact that all states in question may be defined on different C\*-algebras.*

For example, any state $\omega$ on $A\_0 = A$ as in (8.154) defines a continuous field:

Proposition 8.25. *For any state* $\omega \in S(A)$*, the set* $(\omega\_{1/N})\_{N \in \dot{\mathbb{N}}}$ *of states defined by*

$$\omega\_0 = \omega;\tag{8.165}$$

$$\omega\_{1/N} = \omega\_{|A\_{1/N}},\tag{8.166}$$

*is a continuous field of states on the bundle with fibers* (8.154) *-* (8.155)*.*

*Proof.* We use the notation of Definition 8.7. For local sequences (8.57) we have

$$\omega\_{1/N}(a\_{1/N}) = \omega(a\_{1/N}) = \omega(a\_{1/M}),$$

for all $N \geq M$. Since $a\_0 = a\_{1/M}$, this equals $\omega\_0(a\_0)$. For quasi-local sequences, $a\_0$ is the limit of the sequence $(a\_{1/N})$ in the norm of $A$, so that $\omega(a\_{1/N}) \to \omega(a\_0)$. □

Definition 8.26. *A state* $\omega \in S(A)$ *is* macroscopic *if* $\lim\_{N \to \infty} \omega(a\_{1/N})$ *exists for any (quasi-) symmetric sequence* $(a\_{1/N})$*.*

It does not matter whether we put "symmetric" or "quasi-symmetric" here, since existence of the limit for symmetric sequences implies its existence for quasi-symmetric sequences. Indeed, using the fact that $\|\omega\| = 1$, we may estimate

$$\begin{split} |\omega(a\_{1/N}) - \omega(a\_{1/M})| &\leq |\omega(\tilde{a}\_{1/N}) - \omega(\tilde{a}\_{1/M})| \\ &\quad + \|a\_{1/N} - \tilde{a}\_{1/N}\| + \|a\_{1/M} - \tilde{a}\_{1/M}\|, \end{split} \tag{8.167}$$

for any sequence $(\tilde{a}\_{1/M})$. Using Definition 8.3, and hence taking $(\tilde{a}\_{1/M})$ symmetric, we see that if $(\omega(\tilde{a}\_{1/N}))$ is a Cauchy sequence, then so is $(\omega(a\_{1/N}))$.

Proposition 8.27. *A macroscopic state* $\omega$ *determines a state* $\omega\_0^{(c)}$ *on* $C(S(B))$ *by*

$$\omega\_0^{(c)}(a\_0) = \lim\_{N \to \infty} \omega(a\_{1/N}),\tag{8.168}$$

*where* $(a\_{1/N})$ *is any quasi-symmetric sequence with limit* $a\_0 \in C(S(B))$*, cf.* (8.46)*.*

*Proof.* First, note that $\omega\_0^{(c)}$ is independent of the choice of the approximating sequence $(a\_{1/N})$, since by the same argument as in the proof of Proposition C.126, if $a\_{1/N} \to a\_0$ as well as $a'\_{1/N} \to a\_0$, we have

$$\lim\_{N \to \infty} ||a\_{1/N} - a\_{1/N}'|| = ||a\_0 - a\_0|| = 0,\tag{8.169}$$

and because $\|\omega\| = 1$ for any state $\omega$, we also have

$$|\omega(a\_{1/N} - a'\_{1/N})| \le \|a\_{1/N} - a'\_{1/N}\|.\tag{8.170}$$

Eqs. (8.169) - (8.170) obviously imply

$$\lim\_{N \to \infty} \omega(a\_{1/N}) = \lim\_{N \to \infty} \omega(a'\_{1/N}).\tag{8.171}$$

We next show that if $a\_{1/N} \to a\_0$ and $b\_{1/N} \to b\_0$ in the sense of (C.560), then

$$a\_{1/N}b\_{1/N} \to a\_0b\_0.$$

If $(a\_{1/N})$ is a symmetric sequence à la (8.43), and likewise $(b\_{1/N})$, where we may assume without loss of generality that $M$ is the same for both, then

$$a\_0(\rho) = \rho^M(a\_{1/M}),\tag{8.172}$$

where $\rho \in S(B)$, and likewise for $b\_0$. Using (8.38), we obtain

$$\lim\_{N \to \infty} \rho^N(a\_{1/N}b\_{1/N}) = \rho^M(a\_{1/M})\rho^M(b\_{1/M}) = a\_0(\rho)b\_0(\rho) = (a\_0b\_0)(\rho).\tag{8.173}$$

In particular, if $a\_{1/N} \to a\_0$, then $a\_{1/N}^\*a\_{1/N} \to a\_0^\*a\_0$. Since $\omega$ is a state, it follows that $\omega\_0^{(c)}(a\_0^\*a\_0) \geq 0$, and since also $\omega\_0^{(c)}(1\_{S(B)}) = 1$ (because the sequence with $a\_{1/N} = 1\_{H\_{\Lambda\_N}}$ converges to $1\_{S(B(H))}$), the claim follows for symmetric sequences. For quasi-symmetric sequences $(a\_{1/N})$ the result follows by approximating $(a\_{1/N})$ with symmetric sequences (cf. Definition 8.3). □
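As a numerical sanity check of (8.173) in the simplest case $M = 1$, the following Python sketch builds mean-field averages on $N$ copies of $H = \mathbb{C}^2$ and compares the product-state expectation with the elementary finite-$N$ identity $\rho^N(a\_{1/N}b\_{1/N}) = (1 - 1/N)\rho(a)\rho(b) + \rho(ab)/N$, whose limit is $\rho(a)\rho(b)$. It assumes that the symmetrization map takes the standard form $S\_{1,N}(a) = N^{-1}\sum\_i a\_i$; the function names and the particular state are illustrative choices, not taken from the text.

```python
from functools import reduce

import numpy as np


def site_operator(a, i, n):
    """Embed the single-site operator `a` at site `i` of an n-fold tensor product."""
    d = a.shape[0]
    factors = [np.eye(d, dtype=complex) for _ in range(n)]
    factors[i] = a
    return reduce(np.kron, factors)


def mean_field_average(a, n):
    """Assumed form of S_{1,N}(a): the average (1/n) * sum_i a_i over all sites."""
    return sum(site_operator(a, i, n) for i in range(n)) / n


def product_state_expectation(rho, op, n):
    """Evaluate rho^n(op) = Tr(rho^{(x) n} op) for a density matrix rho."""
    rho_n = reduce(np.kron, [rho] * n)
    return np.trace(rho_n @ op)


def finite_n_formula(rho, a, b, n):
    """(1 - 1/n) rho(a) rho(b) + rho(ab)/n, obtained by counting pairs of sites."""
    ra, rb, rab = (np.trace(rho @ m) for m in (a, b, a @ b))
    return (1 - 1 / n) * ra * rb + rab / n


# Illustrative data: a mixed qubit state and two non-commuting observables.
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
rho = 0.5 * (np.eye(2) + 0.3 * sigma_x + 0.4 * sigma_z)

for n in range(1, 7):
    lhs = product_state_expectation(
        rho, mean_field_average(sigma_x, n) @ mean_field_average(sigma_z, n), n
    )
    # Both columns should agree and approach rho(sigma_x) * rho(sigma_z).
    print(n, lhs.real, finite_n_formula(rho, sigma_x, sigma_z, n).real)
```

The $1/N$ correction comes from the $N$ coincident-site terms among the $N^2$ products $a\_i b\_j$, which is why non-commutativity of $a$ and $b$ disappears in the limit.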

Each state $\omega\_0^{(c)} \in S(A\_0^{(c)})$ is represented by a probability measure $\mu$ on the state space $S(B(H))$ of $B(H)$. We compute this measure if $\omega \in S(A)$ is *permutation-invariant* in that each restriction $\omega\_{1/N} = \omega\_{|B(H\_{\Lambda\_N})}$ is invariant under the natural action of the permutation group $\mathfrak{S}\_{|\Lambda\_N|}$ on $B(H\_{\Lambda\_N}) \cong \otimes\_{x \in \Lambda\_N} B(H)$, where $N \in \mathbb{N}$ and $|\Lambda\_N|$ is the number of points in $\Lambda\_N$ (as in the case of $B^\infty$ in §8.2). It follows from the Quantum De Finetti Theorem 8.9 (and the fact that the set $S^{\mathfrak{S}\_\infty}(A)$ of permutation-invariant states on $A$ is a so-called *Bauer simplex*) that each permutation-invariant state $\omega \in S^{\mathfrak{S}\_\infty}(A)$ takes the form

$$\omega = \int\_{S(B(H))} d\mu(\rho) \, \rho^{\infty},\tag{8.174}$$

where $\mu$ is some probability measure on $S(B(H))$, and $\rho \in S(B(H))$; the associated state $\rho^\infty$ on $A$ is defined by its values on each $A\_{\Lambda\_N} \subset A$ via the isomorphism

$$A\_{\Lambda\_N} \cong \otimes\_{x \in \Lambda\_N} B(H). \tag{8.175}$$

Furthermore, the integral in (8.174) is defined weakly, i.e., for any $a \in A$ the number $\omega(a)$ is obtained by integrating the function $\rho \mapsto \rho^\infty(a)$ on $S(B(H))$ with respect to $\mu$. In particular, $\omega \in \partial\_e S^{\mathfrak{S}\_\infty}(A)$ iff $\mu$ is a Dirac measure on $S(B(H))$.

Proposition 8.28. *Each permutation-invariant state* $\omega \in S^{\mathfrak{S}\_\infty}(A)$ *is macroscopic (cf. Definition 8.26), and the probability measure* $\mu$ *on* $S(B(H))$ *defined by* $\omega\_0^{(c)}$ *via* (8.168) *coincides with the one appearing in* (8.174)*.*

*Proof.* Let $(a\_{1/N})$ be a symmetric sequence (the quasi-symmetric case follows from this), so that $a\_{1/N} = S\_{M,N}(a\_{1/M})$ for some $M$ whenever $N > M$, cf. (8.43). The limit $a\_0 \in C(S(B(H)))$ is given by (8.172), so that the state $\omega\_0^{(c)}$ on $C(S(B(H)))$ defined by

$$\omega\_0^{(c)}(f) = \int\_{S(B(H))} d\mu(\rho) \, f(\rho) \tag{8.176}$$

satisfies the required condition

$$\lim\_{N \to \infty} \omega\_{1/N}(a\_{1/N}) = \omega\_{1/M}(a\_{1/M}) = \int\_{S(B(H))} d\mu(\rho) \, \rho^M(a\_{1/M}) = \omega\_0^{(c)}(a\_0). \qquad \square$$

To proceed we make the following technical assumption on $\omega \in S(A)$ (which is satisfied in typical physical models): if $\pi\_\omega(a\_{1/N}) \to 0$ weakly in $B(H\_\omega)$, for some sequence $(a\_{1/N})$ where $a\_{1/N} \in A\_{1/N}$, then $\pi\_\omega(a\_{1/N})\Omega\_\omega \to 0$ in $H\_\omega$ (in norm).

Theorem 8.29. *Assume that the state* $\omega$ *in part 1 below (and likewise the states* $\omega\_1$ *and* $\omega\_2$ *in part 2) satisfies the above technical condition. Then:*


The techniques in the proof below can be used to show that our additional assumption is equivalent to: if (8.178) below holds weakly in $B(H\_\omega)$, then it also holds strongly. Thus we could have redefined a macroscopic state $\omega$ as one for which the strong limit $\lim\_{N \to \infty} \pi\_\omega(a\_{1/N})$ exists in $B(H\_\omega)$ (and some authors indeed do so).

*Proof.* We first show that if $\omega$ is a primary macroscopic state on $A$, and $(a\_{1/N})$ is symmetric (from which the quasi-symmetric case duly follows) such that

$$\lim\_{N \to \infty} \omega(a\_{1/N}) = \alpha,\tag{8.177}$$

then, in the weak operator topology on $B(H\_\omega)$ (with $H\_\omega$ the GNS-representation space),

$$\lim\_{N \to \infty} \pi\_{\omega}(a\_{1/N}) = \alpha \cdot 1\_{H\_{\omega}}.\tag{8.178}$$

To this end, we first note that $\|a\_{1/N}\|$ is uniformly bounded in $N$: if $(a\_{1/N})$ is symmetric, as in (8.43), then obviously $\|a\_{1/N}\| = \|a\_{1/M}\|$ for all $N > M$, so that if $(a\_{1/N})$ is merely quasi-symmetric we have $\|a\_{1/N}\| \leq \|a\_{1/M}\| + \varepsilon$ for all $N > M$, where $\varepsilon$ and $M$ are the quantities appearing in Definition 8.3. Hence it is enough to establish the weak limit (8.178) between states in a dense set, viz. $\pi\_\omega(b)\Omega\_\omega$, where $b \in A$, or even $b \in \cup\_N A\_{1/N}$. Furthermore, using the polarization identity (A.5) and (C.8) - (C.9), it is enough to prove that for each $K \in \mathbb{N}$ and $b \in A\_{1/K}$, we have

$$\lim\_{N \to \infty} \omega(b^\* a\_{1/N} b) = \alpha\, \omega(b^\* b), \tag{8.179}$$

since by the GNS-construction we obviously have

$$\langle \pi\_{\omega}(b)\Omega\_{\omega}, \pi\_{\omega}(a\_{1/N})\pi\_{\omega}(b)\Omega\_{\omega}\rangle = \omega(b^\*a\_{1/N}b).\tag{8.180}$$

Theorem 8.23 implies (or even states) that if $\omega$ is primary, for each $b \in A$ and $\varepsilon > 0$ there is $M \in \mathbb{N}$ such that for all $a \in A'\_{\Lambda\_M}$ with $\|a\| = 1$, we have

$$|\omega(b^\*ba) - \omega(b^\*b)\omega(a)| \le \varepsilon.\tag{8.181}$$

Assuming $b \in A\_{1/K}$, we first note that $\lim\_{N \to \infty}[a\_{1/N},b] = 0$ in norm (even though $\lim\_{N \to \infty} a\_{1/N}$ does not exist in norm), and secondly that, for any given $M \in \mathbb{N}$, if $\tilde{a}\_{1/N}$ is the same as $a\_{1/N}$ except that in any term $b\_1 \otimes \cdots \otimes b\_{|\Lambda\_N|}$ that contributes to $a\_{1/N}$ we replace $b\_i$ by $1\_H$ whenever $b\_i \in A\_{1/M}$, then

$$\lim\_{N \to \infty} \left\| \tilde{a}\_{1/N} - a\_{1/N} \right\| = 0. \tag{8.182}$$

Given (8.177), these facts with (8.181) immediately give (8.179) and hence (8.178).

According to (8.177) and (8.178), the state $\omega\_0^{(c)} \in S(C(S(B(H))))$ is given by

$$\omega\_0^{(c)}(a\_0) = \lim\_{N \to \infty} \langle \Omega\_{\omega}, \pi\_{\omega}(a\_{1/N}) \Omega\_{\omega} \rangle,\tag{8.183}$$

where $(a\_{1/N})$ is some symmetric sequence converging to $a\_0$ in the sense of (C.560); as in the proof of Proposition 8.27, the left-hand side is independent of the particular choice of this sequence. The proof of Proposition 8.27 also showed that if $a\_{1/N} \to a\_0$ and $b\_{1/N} \to b\_0$, then $a\_{1/N}b\_{1/N} \to a\_0b\_0$, so that

$$\begin{aligned} \omega\_0^{(c)}(a\_0 b\_0) &= \lim\_{N \to \infty} \langle \Omega\_{\omega}, \pi\_{\omega}(a\_{1/N} b\_{1/N}) \Omega\_{\omega} \rangle \\ &= \lim\_{N \to \infty} \langle \Omega\_{\omega}, (\pi\_{\omega}(a\_{1/N}) - \alpha \cdot 1\_{H\_{\omega}}) \pi\_{\omega}(b\_{1/N}) \Omega\_{\omega} \rangle + \alpha\beta, \end{aligned}$$

where $\alpha$ is defined by (8.177), and likewise $\beta$. It is at this point that we need our additional assumption, which, together with uniform boundedness of $\pi\_\omega(a\_{1/N})$ and hence of $\pi\_\omega(a\_{1/N})\Omega\_\omega$ in $N$, yields that the first term in the second line is zero. Therefore, $\omega\_0^{(c)}$ is multiplicative and hence pure (cf. Proposition C.14).

To prove the second claim, first suppose $\omega\_1$ and $\omega\_2$ are quasi-equivalent. In that case, up to unitary equivalence, either $\pi\_{\omega\_1}$ is a subrepresentation of $\pi\_{\omega\_2}$, or *vice versa*; assume the former. We then have a projection $e \in \pi\_{\omega\_2}(A)'$ such that

$$\pi\_{\omega\_1}(a) = e \pi\_{\omega\_2}(a), \tag{8.184}$$

for each $a \in A$, and since $e = 1\_{H\_{\omega\_1}}$ by construction, eq. (8.178) gives

$$\lim\_{N \to \infty} \pi\_{\omega\_1}(a\_{1/N}) = \alpha\_1 \cdot e;\tag{8.185}$$

$$\lim\_{N \to \infty} \pi\_{\omega\_2}(a\_{1/N}) = \alpha\_2 \cdot 1\_{H\_{\omega\_2}}.\tag{8.186}$$

Multiplying both sides of (8.186) with $e$ gives $\alpha\_1 = \alpha\_2$. □

Corollary 8.30. *A permutation-invariant state* $\omega \in S^{\mathfrak{S}\_\infty}(A)$ *is primary iff the corresponding measure* $\mu$ *in* (8.174) *is a Dirac measure, and it is pure iff the latter is supported by a pure state on* $B(H)$*.*

*Proof.* In the first claim, the inference from "primary" to "Dirac" obviously follows from Theorem 8.29. The converse direction is a consequence of the commutation theorem (C.329) for von Neumann algebras, combined with the fact that each representation of $B(H)$ for finite-dimensional $H$ is primary (which in turn follows from the fact, not proved in this book, that $B(H)$ has just one irreducible representation, up to equivalence). The second claim follows from Proposition C.105. □
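The dichotomy in Corollary 8.30 can be illustrated numerically. For $\omega = \sum\_k w\_k \rho\_k^\infty$ (a finitely supported $\mu$ in (8.174)) and a single-site observable $a$ placed at two distinct sites $x \neq y$, one has $\omega(a\_x a\_y) - \omega(a\_x)\omega(a\_y) = \sum\_k w\_k \rho\_k(a)^2 - (\sum\_k w\_k \rho\_k(a))^2$, independently of the distance between $x$ and $y$. The Python sketch below (helper names and example states are hypothetical) evaluates this quantity: it vanishes for a Dirac measure, but not for a genuine mixture, whose correlations therefore do not decay.

```python
import numpy as np


def expectation(rho, a):
    """rho(a) = Tr(rho a) for a density matrix rho and a Hermitian observable a."""
    return np.trace(rho @ a).real


def two_site_correlation(weights, states, a):
    """omega(a_x a_y) - omega(a_x) omega(a_y) for omega = sum_k w_k rho_k^infinity, x != y."""
    w = np.asarray(weights, dtype=float)
    means = np.array([expectation(rho, a) for rho in states])
    two_point = np.sum(w * means**2)  # rho^infinity(a_x a_y) = rho(a)^2 when x != y
    one_point = np.sum(w * means)     # omega(a_x) = sum_k w_k rho_k(a)
    return two_point - one_point**2


# Illustrative data: spin up / spin down along z, observed with sigma_z.
up = np.array([[1, 0], [0, 0]], dtype=complex)
down = np.array([[0, 0], [0, 1]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

dirac = two_site_correlation([1.0], [up], sigma_z)
mixture = two_site_correlation([0.5, 0.5], [up, down], sigma_z)
print("Dirac measure:", dirac)
print("equal mixture:", mixture)
```

The mixture's correlation is the variance of $\rho \mapsto \rho(a)$ under $\mu$, which is how the measure in (8.174) shows up in the failure of clustering.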

Finally, *one* macroscopic state generates many others. A *folium* in the state space $S(A)$ of a C\*-algebra $A$ is a convex, norm-closed subspace $\mathfrak{F}$ of $S(A)$ with the property that if $\omega \in \mathfrak{F}$ and $b \in A$ such that $\omega(b^\*b) > 0$, then the "reduced" state $\omega\_b : a \mapsto \omega(b^\*ab)/\omega(b^\*b)$ must be in $\mathfrak{F}$. For example, if $\pi$ is a representation of $A$ on a Hilbert space $H$, then the set of all density matrices on $H$ (i.e. the $\pi$-normal states on $A$) comprises a folium $\mathfrak{F}\_\pi$. In particular, each state $\omega$ on $A$ defines a folium $\mathfrak{F}\_\omega \equiv \mathfrak{F}\_{\pi\_\omega}$ through its GNS-representation $\pi\_\omega$. It then follows from cyclicity of the GNS-representation that each state in the folium $\mathfrak{F}\_\omega$ of a macroscopic state $\omega \in S(A)$ is automatically macroscopic and even has the same limit state $\omega\_0^{(c)}$ as $\omega$.

#### Notes

#### §8.1. Large quantum numbers

Theorem 8.1 has been adapted from Landsman (1998b); the proof relies on Simon (1980), who, generalizing the case of *SU*(2) treated by Lieb (1973), in turn uses the coherent states for Lie groups introduced by Perelomov (1972, 1986). Duffield (1999) gives the details of the method of steepest descent used in proving (8.30). Although this material was inspired by Bohr's Correspondence Principle, at the end of the day the relationship may seem remote.

#### §8.2. Large systems

The theory in this section, which elaborates on Landsman (2007), is a reformulation in terms of continuous bundles of C\*-algebras of the formal parts of a series of papers on quantum mean-field systems by Raggio & Werner (1989, 1991), Duffield & Werner (1992a,b,c), and Duffield, Roos, & Werner (1992). These models have their origin in the treatment of the BCS theory of superconductivity due to Bogoliubov (1958) and Haag (1962); for further references see the notes to §10.8.

#### §8.3. Quantum de Finetti Theorem

Theorem 8.9 is due to Størmer (1969), whose proof was based on the fact that the $\mathfrak{S}\_\infty$-action on $B^\infty$ is *asymptotically abelian*, in that for any $a, a' \in B^\infty$ one has

$$\inf \{ \| [\alpha\_p(a), a'] \|, p \in \mathfrak{S}\_\infty \} = 0.$$

This implies that $S^{\mathfrak{S}\_\infty}(B^\infty)$ is a Choquet simplex, which quickly leads to (8.66). Our proof is taken from Hudson & Moody (1975). See also Caves, Fuchs, & Schack (2002a). Finite-size corrections to Theorem 8.9 are studied e.g. in König & Mitchison (2009). Corollary 8.11 is due to Hewitt & Savage (1955), who credit Jules Haag (rather than De Finetti) for the binary case (i.e., $X = \{0,1\}$). See Kallenberg (2005) for an exhaustive account of such results (in classical probability theory).

Proposition 8.12 is taken from Diaconis & Freedman (1980), who also give finite-size corrections to Corollary 8.11, as follows. Let a permutation-invariant probability measure $\nu\_N$ on $X^N$ be $K$-exchangeable, so that there is a permutation-invariant probability measure $\nu\_{N+K}$ on $X^{N+K}$ whose restriction to $X^N$ is $\nu\_N$. Let $P\_{N+K}$ be the probability measure on $\mathrm{Pr}(X)$ defined by $\nu\_{N+K}$ as in (8.85), i.e., $P\_{N+K}(A) = \nu\_{N+K}(E\_{N+K}^{-1}(A))$, and finally define

$$\nu'\_{N+K} = \int\_{\mathrm{Pr}(X)} dP\_{N+K}(\mu)\,\mu^{N+K},$$

as in (8.79). Then, in terms of the usual norm on the Banach dual $C(X^N)^\*$,

$$\|\nu\_N - \nu'\_N\| \le \frac{K(K-1)}{N}.$$

Proposition 8.13 is stated without proof in Kingman (1978). See Mackey (1974) or Gray (2009) for ergodic theory in connection with probability theory.

Of course, there are numerous results in probability theory that do not share the problems of the law of large numbers. For example, in the situation (8.94), for any ε > 0 one has the *Chernoff–Hoeffding bound*

$$\mu^N \left( \left| \frac{1}{N} \sum\_{i=1}^N x\_i - p \right| \ge \varepsilon \right) \le e^{-2N\varepsilon^2},$$

which is superior to the weak law of large numbers, i.e., for every $\varepsilon > 0$,

$$\lim\_{N \to \infty} \mu^N \left( \left| \frac{1}{N} \sum\_{i=1}^N x\_i - p \right| \ge \varepsilon \right) = 0,$$

which from the point of view of Earman's Principle is already a marked conceptual improvement over the strong law (but which is mathematically weaker).

#### §8.4. Frequency interpretation of probability and Born rule

The Kolmogorov quote is from Fine (1973, p. 94), which even 40 years later is still to be recommended as one of the best (technical) books on the foundations of probability theory. See also Hájek & Hitchcock (2016) for a comprehensive recent survey of the philosophy of probability. The Keynes quote is from Hacking (2001, p. 149), which is a very elementary introduction to the foundations of probability. At a more advanced level see also Gillies (2000), whilst Howson (1995) is a useful brief survey.

The original version of the *Principal Principle* (Lewis, 1980) equated probability (or chance) as subjective degree of belief (i.e. credence) with objective chance (though in the single case as opposed to relative frequency). Our own version in the main text is meant to clarify the relationship between single-case probabilities and long-run frequencies, both seen as objective.

Attempts to derive the Born rule started with Finkelstein (1965) and were continued e.g. by Hartle (1968), Farhi, Goldstone, & Gutmann (1989), Van Wesep (2006), Aguirre & Tegmark (2011), Moulay (2014), and others, partly based on indubitable mathematical arguments in the spirit of the strong law of large numbers supplied by e.g. Ochs (1977, 1980), Bugajski & Motyka (1981), Pulmannová & Stehlíková (1986). Such attempts (typically presented as claims) provoked valid critiques of the kind mentioned in the main text from e.g. Cassinelli & Sánchez-Gómez (1996) and Caves & Schack (2005). For a balanced account see also Cassinelli & Lahti (1989). Infinite tensor products of Hilbert spaces were introduced by von Neumann (1938).

Our approach, which is sympathetic to both sides of the dispute, is a vast expansion of Landsman (2008). The existence of *e*∞ as in (8.109) - (8.110) is based on the same extension argument that proves the Kolmogorov existence theorem for infinite product probabilities, see e.g. Dudley (1989), proof of Theorem 8.2.2, and Van Wesep (2006), who carries out the proof for *X* = {0,1}.

There is also a large (and inconclusive) literature on alleged derivations of the Born rule in the context of the Many-Worlds (i.e. Everettian) Interpretation of quantum mechanics, which may be traced back from Wallace (2012), who supports such derivations, and Dawid & Thébault (2015), who criticize them.

#### §8.5. Quantum spin systems: Quasi-local C\*-algebras

Basic references are Ruelle (1969), Israel (1979), Bratteli & Robinson (1987, 1997), and Simon (1993); for macroscopic states see Hepp (1972) and Sewell (2002). Naaijkens (2013) is a useful brief introduction to quantum spin systems.

The proof that Haag duality holds for quantum spin systems is far from trivial: see Simon (1993), Prop. IV.1.6. In the proof of (8.135), simplicity of $A$ given simplicity of each $A\_\Lambda$ is easily inferred from the fact that if $I \subset A$ is an ideal, then $I\_\Lambda = I \cap A\_\Lambda$ is an ideal in $A\_\Lambda = B(H\_\Lambda)$, which must be either zero or $A\_\Lambda$, both of which contradict non-triviality of $I$. Theorem 8.23 is a famous result due to Lanford & Ruelle (1969), partly anticipated by Powers (1967). For a complete proof see also Simon (1993), Theorem IV.1.4.

#### §8.6. Quantum spin systems: Bundles of C\*-algebras

This section was inspired by Landsman (2007), §6, and Gerisch (1993).

Folia of states (in the sense meant here) were introduced by Haag, Kadison, & Kastler (1970), but note that the name "folium" is poorly chosen, since *S*(*A*) is by no means foliated by its folia (for example, a folium may contain subfolia).

## Chapter 9 Symmetry in algebraic quantum theory

In §3.9 we defined symmetries of classical physics as symmetries of either Poisson manifolds or Poisson algebras; these notions are equivalent. At the bare level of the underlying phase space *X*, merely seen as a locally compact space (rather than a Poisson manifold), the key result establishing this equivalence is this:

Theorem 9.1. *Let X and Y be locally compact Hausdorff spaces. Each isomorphism* α :*C*0(*Y*) →*C*0(*X*) *is induced by a homeomorphism* ϕ : *X* →*Y via* α = ϕ<sup>∗</sup> *(and so each automorphism of C*0(*X*) *is induced by a homeomorphism of X ).*

*More generally, if A and B are commutative C\*-algebras, then each isomorphism* α : *A* → *B is induced by a homeomorphism* ϕ : Σ(*B*) → Σ(*A*) *of the corresponding Gelfand spectra via* α = *G*−<sup>1</sup> *<sup>B</sup>* ◦ ϕ<sup>∗</sup> ◦ *GA, where GA* : *A* → *C*0(Σ(*A*)) *is the Gelfand ismomorphism, cf.* (C.79)*, and similarly for B (and so each automorphism of A is induced by a homeomorphism of its Gelfand spectrum* Σ(*A*)*).*

This immediately follows from Theorems C.8 and C.45, and Corollary C.48.

In Chapter 5 we saw that even in elementary quantum mechanics, where $A = B(H)$ for some Hilbert space $H$, the concept of a symmetry is more diverse, at least apparently, since a non-commutative C\*-algebra like $B(H)$ gives rise to numerous "quantum structures". The ones we looked at were listed after Proposition 5.3, viz.


Each structure comes with its own notion of a symmetry, see Definition 5.1. This raises two questions, which for *B*(*H*) were completely answered in Chapter 5:


Indeed, it was found that if $\dim(H) > 2$, then all these notions of symmetry are equivalent, as well as unitarily implementable à la Wigner; see Theorem 5.4.

#### 9.1 Symmetries of C\*-algebras and Hamhalter's Theorem

In this chapter we generalize this analysis from *A* = *B*(*H*) to arbitrary C\*-algebras *A*, which for simplicity we assume to have a unit 1*A*. See §C.25 for terminology.

Definition 9.2. *Let A be a unital C\*-algebra.*

*1. The* pure state space $P(A) = \partial\_e S(A)$ *is the extreme boundary of the state space* $S(A)$*, seen as a uniform space equipped with a transition probability*

$$\pi(\omega, \omega') = \inf\{\omega(a) \mid a \in A, 0 \le a \le 1\_A, \omega'(a) = 1\}.\tag{9.1}$$

*A* Wigner symmetry *of A is a uniformly continuous bijection* $\mathsf{W} : P(A) \to P(A)$ *with uniformly continuous inverse that preserves transition probabilities, i.e.,*

$$\pi(\mathsf{W}(\omega), \mathsf{W}(\omega')) = \pi(\omega,\omega'),\quad \omega,\omega' \in P(A).\tag{9.2}$$

If *A* = *B*(*H*), Proposition C.177 guarantees that the above expression reproduces the standard quantum-mechanical transition probabilities (2.44), but compared to this special case, one novel aspect of *P*(*A*) is that all pure states are now taken into account (as opposed to merely the normal ones, which notion is undefined for general C\*-algebras anyway). Another is that in order to obtain the desired equivalence with other structures, the set *P*(*A*) should carry a uniform structure, namely the *w*∗-uniformity inherited from *A*∗.


The structures 1, 2, 3 (with Jordan symmetries), and 4 are equivalent; see Theorem C.179 for 1 ↔ 2 and Theorem C.172 for 2 ↔ 3; the equivalence 3 ↔ 4 is proved in exactly the same way as in Proposition 5.21, with Lemma 5.20 for the special case *A* = *B*(*H*) replaced by Lemma C.173 (which has the same proof). From 1–4 we pick the Jordan algebra structure of *A*, since it gives the most straightforward results.

Henceforth, $A$ and $B$ are unital C\*-algebras, and we define a *weak Jordan isomorphism* of $A$ and $B$ as an invertible map $\mathsf{J} : A\_{\mathrm{sa}} \to B\_{\mathrm{sa}}$ whose restriction to each subspace $C\_{\mathrm{sa}}$ of $A\_{\mathrm{sa}}$, where $C \in \mathcal{C}(A)$, is linear and preserves the Jordan product $\circ$ (so that a Jordan symmetry of $A$ alone is a weak Jordan automorphism of $A$). Such a map complexifies to a map $\mathsf{J}\_{\mathbb{C}} : A \to B$ in the usual way, i.e., writing $a \in A$ as $a = b + ic$, with $b^\* = b$ and $c^\* = c$, cf. (C.9), we put $\mathsf{J}\_{\mathbb{C}}(a) = \mathsf{J}(b) + i\mathsf{J}(c)$. If no confusion arises, we just write $\mathsf{J}$ for $\mathsf{J}\_{\mathbb{C}}$. We first turn to Bohr symmetries.

Proposition 9.3. *Given a weak Jordan isomorphism* $\mathsf{J} : A\_{\mathrm{sa}} \to B\_{\mathrm{sa}}$*, the ensuing map* $\mathsf{B} : \mathcal{C}(A) \to \mathcal{C}(B)$ *defined by* $\mathsf{B}(C) = \mathsf{J}\_{\mathbb{C}}(C) \equiv \mathsf{J}(C)$ *is an order isomorphism.*

Note that as an argument of $\mathsf{B}$ the symbol $C$ is a point in the poset $\mathcal{C}(A)$, whereas as an argument of $\mathsf{J}\_{\mathbb{C}}$ it is a subset of $A$, so that $\mathsf{J}\_{\mathbb{C}}(C)$ stands for $\{\mathsf{J}\_{\mathbb{C}}(c) \mid c \in C\}$.

*Proof.* The restriction $\mathsf{J}\_{|C} : C \to B$ is a homomorphism of C\*-algebras on each *commutative* C\*-algebra $C \subset A$ (although $\mathsf{J} : A \to B$ may not be). Since $\mathsf{J}\_{|C}$ is injective on $C\_{\mathrm{sa}}$ (where it coincides with $\mathsf{J}$), it is also injective on $C$. Hence $\mathsf{J}\_{|C}$ is isometric by Theorem C.62.3, so that its range is closed and therefore $\mathsf{J}(C)$ is a commutative C\*-algebra in $B$, which is unital if $C$ is. Trivially, if $C \subseteq D$ in $A$ (so that $C \leq D$ in $\mathcal{C}(A)$), then $\mathsf{J}(C) \subseteq \mathsf{J}(D)$ in $B$ (so that $\mathsf{J}(C) \leq \mathsf{J}(D)$ in $\mathcal{C}(B)$). □

The converse, however, is a deep result, which we call *Hamhalter's Theorem*:

Theorem 9.4. *Let A and B be unital C\*-algebras and let* B : C (*A*) → C (*B*) *be an order isomorphism. Then there is a weak Jordan isomorphism* J : *A*sa → *B*sa *such that* B = JC*. Moreover, if A is isomorphic to neither* C<sup>2</sup> *nor M*2(C)*, then* J *is uniquely determined by* B*, so in that case there is a bijective correspondence* J ↔ B *between weak Jordan symmetries* J *of A and Bohr symmetries* B *of A.*

Before proving this, let us explain why $\mathbb{C}^2$ and $M\_2(\mathbb{C})$ are exceptional. In the first case, $\mathcal{C}(\mathbb{C}^2) \cong \{0,1\}$ (with $0 \equiv \mathbb{C} \cdot 1\_2$ and $1 \equiv \mathbb{C}^2$), which admits just one order isomorphism (viz. the identity map), which is induced by both the map $(a,b) \mapsto (b,a)$ and the identity map on $\mathbb{C}^2$ (each of which is a weak Jordan automorphism).

In the second case, the poset $\mathcal{C}(M\_2(\mathbb{C}))$ has a bottom element $0 \equiv \mathbb{C} \cdot 1\_2$, as before, but no top element; each element $C \neq \mathbb{C} \cdot 1\_2$ of $\mathcal{C}(M\_2(\mathbb{C}))$ is a unitary conjugate of the diagonal subalgebra $D\_2(\mathbb{C})$, with $0 \leq C$ but no other orderings. Furthermore, $C \cap C' = \mathbb{C} \cdot 1\_2$ whenever $C \neq C'$. Hence any order isomorphism of $\mathcal{C}(M\_2(\mathbb{C}))$ maps $\mathbb{C} \cdot 1\_2$ to itself and permutes the $C$'s. Thus each map $\mathsf{J} : M\_2(\mathbb{C})\_{\mathrm{sa}} \to M\_2(\mathbb{C})\_{\mathrm{sa}}$ whose complexification $\mathsf{J}\_{\mathbb{C}} : M\_2(\mathbb{C}) \to M\_2(\mathbb{C})$ shuffles the $C$'s isomorphically (as C\*-algebras) gives a weak Jordan automorphism. For example, take $(a,b) \mapsto (b,a)$ on $D\_2(\mathbb{C})$ and the identity on each $C \neq D\_2(\mathbb{C})$; this induces the identity map on $\mathcal{C}(M\_2(\mathbb{C}))$. It follows that there are vastly more weak Jordan automorphisms of $M\_2(\mathbb{C})$ than there are order isomorphisms of $\mathcal{C}(M\_2(\mathbb{C}))$.

*Proof.* The key to the proof lies in the commutative case, which can be reduced to topology. If $A = C(X)$, any $C \in \mathcal{C}(A)$ induces an equivalence relation $\sim\_C$ on $X$ by

$$x \sim\_{C} y \text{ iff } f(x) = f(y) \ \forall f \in C. \tag{9.3}$$

This, in turn, defines a partition $X = \bigsqcup\_\lambda K\_\lambda$ of $X$ (henceforth called $\pi$), whose blocks $K\_\lambda \subset X$ are the equivalence classes of $\sim\_C$. To study a possible inverse of this procedure, for any closed subset $K \subset X$ we define the ideal

$$I\_K = C(X; K) = \{ f \in C(X) \mid f(x) = 0 \ \forall x \in K \},\tag{9.4}$$

in $C(X)$, and its unitization $\dot{I}\_K = I\_K \oplus \mathbb{C} \cdot 1\_X$, which evidently consists of all continuous functions on $X$ that are constant on $K$. If $X$ is finite (and discrete), each partition $\pi$ of $X$ defines some unital C\*-algebra $C \subseteq C(X)$ through

$$C = \bigcap\_{K\_{\lambda} \in \pi} \dot{I}\_{K\_{\lambda}},\tag{9.5}$$

which consists of all $f \in C(X)$ that are constant on each block $K\_\lambda$ of the given partition $\pi$. In that case, the correspondence $C \leftrightarrow \pi$, where $\pi$ is defined by the equivalence relation $\sim\_C$ in (9.3), gives a bijection between $\mathcal{C}(C(X))$ and the set $\mathcal{P}(X)$ of all partitions of $X$. For example, the subalgebra $C = \dot{I}\_K$ corresponds to the partition consisting of $K$ and all singletons not lying in $K$. Given the already defined partial order on $\mathcal{C}(C(X))$ (i.e., $C \leq D$ iff $C \subseteq D$), we may promote this bijection to an order isomorphism of posets if we define the partial order on $\mathcal{P}(X)$ to be the *opposite* of the natural one, in which $\pi \leq \pi'$ (where $\pi$ and $\pi'$ consist of blocks $\{K\_\lambda\}$ and $\{K'\_{\lambda'}\}$, respectively) iff each $K\_\lambda$ is contained in some $K'\_{\lambda'}$ (i.e., $\pi$ is finer than $\pi'$). This opposite ordering makes $\mathcal{P}(X)$ a complete lattice, whose top element consists of all singletons on $X$ and whose bottom element just consists of $X$ itself: the former corresponds to $C(X)$, which is the top element of $\mathcal{C}(C(X))$, whilst the latter corresponds to $\mathbb{C} \cdot 1\_X$, which is the bottom element of $\mathcal{C}(C(X))$.
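For a three-point space the correspondence between partitions and unital subalgebras can be checked by brute force. The Python sketch below is only an illustration under simplifying assumptions: functions take values in the finite set $\{0,1,2\}$ rather than in $\mathbb{C}$ (enough to separate the blocks of any partition of three points), and all names are hypothetical. It enumerates the partitions of $X$, forms for each the set of functions that are constant on its blocks, as in (9.5), and confirms that inclusion of these sets reverses refinement of partitions.

```python
from itertools import product


def set_partitions(points):
    """Yield every partition of the list `points` as a list of blocks."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for partition in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or let it form a new block of its own.
        yield [[first]] + partition


def is_finer(p, q):
    """True iff each block of p is contained in some block of q."""
    return all(any(set(b) <= set(c) for c in q) for b in p)


def functions_constant_on_blocks(partition, points, values=(0, 1, 2)):
    """Finite stand-in for the subalgebra (9.5): value tuples constant on each block."""
    result = set()
    for vals in product(values, repeat=len(points)):
        f = dict(zip(points, vals))
        if all(len({f[x] for x in block}) == 1 for block in partition):
            result.add(vals)
    return result


X = ["x", "y", "z"]
partitions = list(set_partitions(X))
print("number of partitions:", len(partitions))

for p in partitions:
    for q in partitions:
        smaller_algebra = functions_constant_on_blocks(p, X) <= functions_constant_on_blocks(q, X)
        # C_p is contained in C_q exactly when q is finer than p.
        assert smaller_algebra == is_finer(q, p)
```

The partition into singletons yields all functions on $X$, and the one-block partition yields only the constants, matching the top and bottom elements described above.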

For general compact Hausdorff spaces $X$, since $C(X)$ is sensitive to the topology of $X$, the equivalence relation (9.3) does not induce arbitrary partitions of $X$. It turns out that each $C \in \mathcal{C}(C(X))$ induces an *upper semicontinuous partition* (abbreviated by *u.s.c. decomposition*) of $X$, i.e.,


This can be seen as follows. Firstly, if we equip $\pi$ with the quotient topology with respect to the natural map $q : X \to \pi$, $x \mapsto K\_\lambda$ if $x \in K\_\lambda$, then $\pi$ is compact, for $X$ is compact. Moreover, $\pi$ is Hausdorff. To see this, let $K\_\lambda$ and $K\_\mu$ be two distinct *points* in $\pi$. Recall that $x, y \in K\_\lambda$ if and only if $f(x) = f(y)$ for each $f \in C$. Since $K\_\lambda \neq K\_\mu$, there is some $x \in K\_\lambda$, some $y \in K\_\mu$ and some $f \in C$ such that $f(x) \neq f(y)$, whence there are open disjoint $U, V \subseteq \mathbb{C}$ such that $f(x) \in U$ and $f(y) \in V$.

Define $\hat{f} : \pi \to \mathbb{C}$ by $\hat{f}(K\_\lambda) = f(x)$ for some $x \in K\_\lambda$. By definition of $K\_\lambda$, this is independent of the choice of $x \in K\_\lambda$, hence $\hat{f}$ is well defined. Again by definition, we have $f = \hat{f} \circ q$, hence $q^{-1}[\hat{f}^{-1}[U]] = f^{-1}[U]$, which is open in $X$ since $f$ is continuous. Since $\pi$ is equipped with the quotient topology, it follows that $\hat{f}^{-1}[U]$ is open in $\pi$, and similarly $\hat{f}^{-1}[V]$ is open. Moreover, we have $\hat{f}(K\_\lambda) = f(x)$ and $f(x) \in U$, hence $K\_\lambda \in \hat{f}^{-1}[U]$, and similarly, $K\_\mu \in \hat{f}^{-1}[V]$. We conclude that $\pi$ is also Hausdorff. Since $q$ is a continuous map between compact Hausdorff spaces, it follows that $q$ is closed. It is a standard result in topology that $q$ is closed iff $\pi$ is a u.s.c. decomposition, so we have now proved the latter.

Consequently, by the same maps (9.3) and (9.5), the poset C (*C*(*X*)) is anti-isomorphic to the poset F(*X*) of all u.s.c. decompositions of *X* in the natural ordering ≤ (which proves that F(*X*) is a complete lattice, since C (*C*(*X*)) is). This is still a complicated poset; assuming *X* to be larger than a singleton, the next step is to identify the simpler poset F<sub>2</sub>(*X*) of all closed subsets of *X* containing at least two elements within F(*X*), where (as above) we identify a closed *K* ⊆ *X* with the (u.s.c.) partition π<sub>*K*</sub> of *X* whose blocks are *K* and all singletons not lying in *K* (note that the poset of *all* closed subsets of *X* is less useful, since any singleton in it gives rise to the bottom element of F(*X*)). To do so, we first recall that β is said to *cover* α in some poset if α < β, and α ≤ γ < β implies α = γ. If the poset has a bottom element, then the covers of that element are precisely the *atoms*. Furthermore, note that since the bottom element 0 of F(*X*) consists of singletons, the atoms in F(*X*) are the partitions of the form π<sub>{*x*<sub>1</sub>,*x*<sub>2</sub>}</sub> (where *x*<sub>1</sub> ≠ *x*<sub>2</sub>). It follows that some partition π ∈ F(*X*) lies in F<sub>2</sub>(*X*) ⊂ F(*X*) iff exactly one of the following conditions holds:


In order to see that π satisfying the third condition must be of the form π<sub>*K*</sub>, assume the contrary. So π contains two blocks *K*<sub>λ</sub> and *K*<sub>μ</sub> consisting of two or more elements. Say {*x*<sub>1</sub>, *x*<sub>2</sub>} ⊆ *K*<sub>λ</sub> and {*x*<sub>3</sub>, *x*<sub>4</sub>} ⊆ *K*<sub>μ</sub>. Then α = π<sub>{*x*<sub>1</sub>,*x*<sub>2</sub>}</sub> and β = π<sub>{*x*<sub>3</sub>,*x*<sub>4</sub>}</sub> are atoms such that α, β < π, and there is an atom γ = π<sub>{*x*<sub>5</sub>,*x*<sub>6</sub>}</sub> ≤ π such that there are three atoms covered by α ∨ γ, and there are three atoms covered by β ∨ γ. It follows from the second condition that α ∨ γ = π<sub>*L*</sub> with *L* a three-point set. This implies that {*x*<sub>1</sub>, *x*<sub>2</sub>} ∩ {*x*<sub>5</sub>, *x*<sub>6</sub>} is not empty, from which it follows that α ∨ γ = π<sub>{*x*<sub>1</sub>,*x*<sub>2</sub>,*x*<sub>5</sub>,*x*<sub>6</sub>}</sub>. Similarly, we find β ∨ γ = π<sub>{*x*<sub>3</sub>,*x*<sub>4</sub>,*x*<sub>5</sub>,*x*<sub>6</sub>}</sub>. Since {*x*<sub>1</sub>, *x*<sub>2</sub>, *x*<sub>5</sub>, *x*<sub>6</sub>} and {*x*<sub>3</sub>, *x*<sub>4</sub>, *x*<sub>5</sub>, *x*<sub>6</sub>} overlap, we obtain α ∨ β ∨ γ = π<sub>{*x*<sub>1</sub>,*x*<sub>2</sub>,*x*<sub>3</sub>,*x*<sub>4</sub>,*x*<sub>5</sub>,*x*<sub>6</sub>}</sub>. Moreover, α, β, γ ≤ π, so α ∨ β ∨ γ ≤ π. However, since *x*<sub>1</sub>, *x*<sub>2</sub> ∈ *K*<sub>λ</sub>, we must have {*x*<sub>1</sub>, *x*<sub>2</sub>, *x*<sub>3</sub>, *x*<sub>4</sub>, *x*<sub>5</sub>, *x*<sub>6</sub>} ⊆ *K*<sub>λ</sub> by definition of the order on F(*X*). But since *x*<sub>3</sub>, *x*<sub>4</sub> ∈ *K*<sub>μ</sub>, we must also have {*x*<sub>1</sub>, *x*<sub>2</sub>, *x*<sub>3</sub>, *x*<sub>4</sub>, *x*<sub>5</sub>, *x*<sub>6</sub>} ⊆ *K*<sub>μ</sub>, which is not possible, since *K*<sub>λ</sub> and *K*<sub>μ</sub> are distinct blocks, hence disjoint. We conclude that π can have only one block *K* of two or more elements, hence π = π<sub>*K*</sub>.

Thus F<sub>2</sub>(*X*) ⊂ F(*X*) has been characterized order-theoretically. Moreover,

$$
\pi = \bigvee_{x \in X} \pi_{K(x)},\tag{9.6}
$$

where *K*(*x*) is the unique block of π that contains *x*. Hence F<sub>2</sub>(*X*) determines F(*X*).

Let *X* and *Y* be compact Hausdorff spaces of cardinality at least two (so that the empty set and singletons are excluded). By the previous analysis, an order isomorphism B : C (*C*(*X*)) → C (*C*(*Y*)) is equivalent to an order isomorphism F(*X*) → F(*Y*), which in turn restricts to an order isomorphism F<sub>2</sub>(*X*) → F<sub>2</sub>(*Y*).

Lemma 9.5. *If X and Y are compact Hausdorff spaces of cardinality at least two, then any order isomorphism* F : F<sub>2</sub>(*X*) → F<sub>2</sub>(*Y*) *is induced by a homeomorphism* ϕ : *X* → *Y via* F(*F*) = ϕ(*F*)*, i.e.,* F(*F*) = ∪<sub>*x*∈*F*</sub>{ϕ(*x*)}*. Moreover, if X and Y have cardinality at least three, then* ϕ *is uniquely determined by* F*.*

To see the idea, we first prove this for finite *X*, where F<sub>2</sub>(*X*) simply consists of all subsets of *X* having at least two elements, etc. It is easy to see that *X* and *Y* must have the same cardinality |*X*| = |*Y*| = *n*. If *n* = 2, then F<sub>2</sub>(*X*) = {*X*} etc., so there is only one map F, which is induced by each of the two possible bijections ϕ : *X* → *Y*, so that ϕ exists but fails to be unique. If *n* > 2, then F must map each subset of *X* with *n* − 1 elements to some subset of *Y* with *n* − 1 elements, so that taking complements we obtain a unique bijection ϕ : *X* → *Y*. To show that ϕ induces F, note that the meet ∧ in F<sub>2</sub>(*X*) is simply intersection ∩, and also that for any *F* ∈ F<sub>2</sub>(*X*),

$$F = \bigcup_{x \in F} \{x\} = \bigcap_{x \notin F} \{x\}^c = \Big(\bigcup_{x \notin F} \{x\}\Big)^c,\tag{9.7}$$

where *A*<sup>*c*</sup> = *X* \ *A*. Since F is an order isomorphism, it preserves ∧ = ∩, so that

$$\mathsf{F}(F) = \bigcap_{x \notin F} \mathsf{F}(\{x\}^c) = \bigcap_{x \notin F} Y \setminus \{\varphi(x)\} = \Big(\bigcup_{x \notin F} \{\varphi(x)\}\Big)^c = \bigcup_{x \in F} \{\varphi(x)\}.\tag{9.8}$$
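The finite case just treated can be replayed mechanically. The sketch below is only an illustration under assumptions that are not in the text: *X* and *Y* are four-point sets, the order isomorphism F is manufactured from a hypothetical bijection `hidden`, and the helper names `closed_sets_2` and `recover_phi` are invented here. It recovers ϕ from the co-atoms by taking complements and then checks that ϕ induces F, as in (9.8).

```python
from itertools import combinations

def closed_sets_2(points):
    """All subsets with at least two elements, i.e. F_2 of a finite discrete space."""
    pts = sorted(points)
    return [frozenset(c) for r in range(2, len(pts) + 1) for c in combinations(pts, r)]

def recover_phi(F, X, Y):
    """Recover the point map from an order isomorphism F, given as a dict on F_2(X).
    Requires |X| = |Y| > 2, so that each co-atom (X minus one point) lies in F_2(X)."""
    X, Y = frozenset(X), frozenset(Y)
    phi = {}
    for x in X:
        # The co-atom X minus {x} is sent to a co-atom of Y,
        # whose complement consists of the single point phi(x).
        (y,) = Y - F[X - {x}]
        phi[x] = y
    return phi

X = {1, 2, 3, 4}
Y = {"a", "b", "c", "d"}
hidden = {1: "c", 2: "a", 3: "d", 4: "b"}  # hypothetical bijection used to build F
F = {S: frozenset(hidden[x] for x in S) for S in closed_sets_2(X)}

phi = recover_phi(F, X, Y)
induces_F = all(F[S] == frozenset(phi[x] for x in S) for S in F)
print(phi == hidden, induces_F)
```

Only the values of F on the co-atoms are used to reconstruct ϕ, which mirrors the uniqueness argument for *n* > 2.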

Now assume that *X* is infinite. Let *x* ∈ *X*. If *x* is not isolated, we define ϕ(*x*) as follows. Let O(*x*) denote the set of all open neighborhoods of *x*. Since *x* is not isolated, each *O* ∈ O(*x*) contains at least one other element, so *Ō* ∈ F<sub>2</sub>(*X*), where *Ō* denotes the closure of *O*. Moreover, finite intersections of elements of {*Ō* : *O* ∈ O(*x*)} are still in F<sub>2</sub>(*X*). Indeed, if *O*<sub>1</sub>, ..., *O*<sub>*n*</sub> ∈ O(*x*), then *O*<sub>1</sub> ∩ ... ∩ *O*<sub>*n*</sub> is an open set containing *x*, and since its closure is contained in *Ō*<sub>1</sub> ∩ ... ∩ *Ō*<sub>*n*</sub>, it follows that *Ō*<sub>1</sub> ∩ ... ∩ *Ō*<sub>*n*</sub> ∈ F<sub>2</sub>(*X*). Since F is an order isomorphism, we find that finite intersections of {F(*Ō*) : *O* ∈ O(*x*)} are contained in F<sub>2</sub>(*Y*). This implies that {F(*Ō*) : *O* ∈ O(*x*)} satisfies the finite intersection property. As *Y* is compact, it follows that *I*<sub>*x*</sub> = ⋂<sub>*O*∈O(*x*)</sub> F(*Ō*) is nonempty. We can say more: it turns out that *I*<sub>*x*</sub> contains exactly one element. Indeed, assume that there are two different points *y*<sub>1</sub>, *y*<sub>2</sub> ∈ *I*<sub>*x*</sub>. Then {*y*<sub>1</sub>, *y*<sub>2</sub>} ∈ F<sub>2</sub>(*Y*), so F<sup>−1</sup>({*y*<sub>1</sub>, *y*<sub>2</sub>}) ∈ F<sub>2</sub>(*X*). Since {*y*<sub>1</sub>, *y*<sub>2</sub>} ⊆ F(*Ō*) for each *O* ∈ O(*x*), we also find that F<sup>−1</sup>({*y*<sub>1</sub>, *y*<sub>2</sub>}) ⊆ *Ō* for each *O* ∈ O(*x*). This implies that

$$\mathsf{F}^{-1}(\{y_1, y_2\}) \subseteq \bigcap_{O \in \mathcal{O}(x)} \overline{O} = \{x\},\tag{9.9}$$

where the last equality holds by normality of *X*. But this is a contradiction, since F<sup>−1</sup>({*y*<sub>1</sub>, *y*<sub>2</sub>}) lies in F<sub>2</sub>(*X*) and hence contains at least two points. So *I*<sub>*x*</sub> contains exactly one point. We define ϕ(*x*) such that {ϕ(*x*)} = *I*<sub>*x*</sub>. Notice that ϕ(*x*) cannot be isolated in *Y*, since if we assume otherwise, then *Y* \ {ϕ(*x*)} must be a co-atom in F<sub>2</sub>(*Y*), whence F<sup>−1</sup>(*Y* \ {ϕ(*x*)}) is a co-atom in F<sub>2</sub>(*X*), which must be of the form *X* \ {*z*} for some isolated *z* ∈ *X*. Since *x* is not isolated, we cannot have *x* = *z*, so *X* \ {*z*} is an open neighborhood of *x*, which is even clopen since *z* is isolated. By definition of ϕ(*x*), we must have ϕ(*x*) ∈ F(*X* \ {*z*}), but F(*X* \ {*z*}) = *Y* \ {ϕ(*x*)}. We found a contradiction, hence ϕ(*x*) cannot be isolated. Now assume that *x* is an isolated point. Then *X* \ {*x*} is a co-atom in F<sub>2</sub>(*X*), so F(*X* \ {*x*}) is a co-atom in F<sub>2</sub>(*Y*), too. Clearly this implies that F(*X* \ {*x*}) = *Y* \ {*y*} for some unique *y* ∈ *Y*, which must be isolated, since *Y* \ {*y*} is closed. We define ϕ(*x*) = *y*.

In an analogous way, F<sup>−1</sup> induces a map ψ : *Y* → *X*. We shall show that ϕ and ψ are each other's inverses. Let *x* ∈ *X* be isolated. We have seen that ϕ(*x*) must be isolated as well, and that ϕ(*x*) is defined by the equation F(*X* \ {*x*}) = *Y* \ {ϕ(*x*)}. Since F is an order isomorphism, we have *X* \ {*x*} = F<sup>−1</sup>(*Y* \ {ϕ(*x*)}). Since ϕ(*x*) is isolated, we find by definition of ψ that ψ(ϕ(*x*)) = *x*. In a similar way we find that ϕ(ψ(*y*)) = *y* for each isolated *y* ∈ *Y*. Now assume that *x* is not isolated and let *F* ∈ F<sub>2</sub>(*X*) such that *x* ∈ *F*. Then

$$\begin{split} \{\varphi(x)\} = \bigcap_{O \in \mathcal{O}(x)} \mathsf{F}(\overline{O}) &\subseteq \bigcap \{ \mathsf{F}(\overline{O}) : O \text{ open}, F \subseteq O \} \\ &= \mathsf{F}\Big(\bigcap \{ \overline{O} : O \text{ open}, F \subseteq O \} \Big) = \mathsf{F}(F), \end{split} \tag{9.10}$$

where the last equality follows by complete regularity of *X*. The penultimate equality follows from the following facts. Firstly, the set ⋂{*Ō* : *O* open, *F* ⊆ *O*} is closed since it is the intersection of closed sets. Moreover, the intersection contains more than one point, since *F* contains two or more points and *F* ⊆ *Ō* for each *O*. Hence ⋂{*Ō* : *O* open, *F* ⊆ *O*} ∈ F<sub>2</sub>(*X*), and since F is an order isomorphism, it preserves infima, which justifies the penultimate equality. Hence ϕ(*x*) ∈ F(*F*) for each *F* ∈ F<sub>2</sub>(*X*) containing *x*. Since *x* is not isolated, ϕ(*x*) is not isolated either. Hence in a similar way, we find that ψ(ϕ(*x*)) ∈ F<sup>−1</sup>(*G*) for each *G* ∈ F<sub>2</sub>(*Y*) containing ϕ(*x*). Let *z* = ψ(ϕ(*x*)). Combining both statements, we find that *z* ∈ *F* for each *F* ∈ F<sub>2</sub>(*X*) such that *x* ∈ *F*. In other words, *z* ∈ ⋂{*F* ∈ F<sub>2</sub>(*X*) : *x* ∈ *F*}. Since *x* is not isolated, each *O* ∈ O(*x*) contains at least two points, so that *Ō* ∈ F<sub>2</sub>(*X*). Hence

$$\bigcap \{ F \in \mathcal{F}_2(X) : x \in F \} \subseteq \bigcap \{ \overline{O} : O \in \mathcal{O}(x) \} = \{ x \},\tag{9.11}$$

where we used complete regularity of *X* in the last equality. We conclude that *z* = *x*, so ψ(ϕ(*x*)) = *x*. In a similar way, we find that ϕ(ψ(*y*)) = *y* for each non-isolated *y* ∈ *Y*. We conclude that ϕ is a bijection with inverse ϕ<sup>−1</sup> = ψ.

Continuing the proof of Lemma 9.5, we have to show that if *F* ∈ F<sub>2</sub>(*X*), then ϕ[*F*] = F(*F*). Let *x* ∈ *F*. In the proof that ϕ is a bijection we already noticed that ϕ(*x*) ∈ F(*F*) if *x* is not isolated. If *x* is isolated in *X*, then we first assume that *F* has at least three points. Since {*x*} is open, *G* = *F* \ {*x*} is closed. Since *F* contains at least three points, *G* ∈ F<sub>2</sub>(*X*). So *G* is covered by *F* in F<sub>2</sub>(*X*), so F(*F*) covers F(*G*). It follows that there must be an element *y*<sub>*G*</sub> ∈ *Y* \ F(*G*) such that

$$\mathsf{F}(F) = \mathsf{F}(G \cup \{x\}) = \mathsf{F}(G) \cup \{y_G\}.\tag{9.12}$$

Both *G* ∪ {*x*} and *X* \ {*x*} are elements of F<sub>2</sub>(*X*), so

$$\begin{split} \mathsf{F}(G) &= \mathsf{F}((G \cup \{x\}) \cap (X \setminus \{x\})) = \mathsf{F}(G \cup \{x\}) \cap \mathsf{F}(X \setminus \{x\}) \\ &= (\mathsf{F}(G) \cup \{y_G\}) \cap (Y \setminus \{\varphi(x)\}), \end{split} \tag{9.13}$$

where F(*X* \ {*x*}) = *Y* \ {ϕ(*x*)} by definition of values of ϕ at isolated points. Since *x* ∉ *G* and F preserves inclusions, this latter equation also implies F(*G*) ⊆ *Y* \ {ϕ(*x*)}. Hence we find

$$\mathsf{F}(G) = (\mathsf{F}(G) \cup \{y_G\}) \cap (Y \setminus \{\varphi(x)\}) = \mathsf{F}(G) \cup (\{y_G\} \cap (Y \setminus \{\varphi(x)\})). \tag{9.14}$$

Thus we obtain {*y*<sub>*G*</sub>} ∩ (*Y* \ {ϕ(*x*)}) ⊆ F(*G*), but since *y*<sub>*G*</sub> ∉ F(*G*), we must have ϕ(*x*) = *y*<sub>*G*</sub>. As a consequence, we obtain F(*F*) = F(*G*) ∪ {ϕ(*x*)}, so ϕ(*x*) ∈ F(*F*).

Summarizing, if *F* has at least three points, then ϕ(*x*) ∈ F(*F*) for *x* ∈ *F*, regardless of whether *x* is isolated or not. So ϕ[*F*] ⊆ F(*F*) for each *F* ∈ F<sub>2</sub>(*X*) such that *F* has at least three points. Let *F* ∈ F<sub>2</sub>(*X*) have exactly two points. Then there are *F*<sub>1</sub>, *F*<sub>2</sub> ∈ F<sub>2</sub>(*X*) with exactly three points such that *F* = *F*<sub>1</sub> ∩ *F*<sub>2</sub>. Then since ϕ is a bijection and F as an order isomorphism both preserve intersections in F<sub>2</sub>(*X*), we find

$$\varphi[F] = \varphi[F_1 \cap F_2] = \varphi[F_1] \cap \varphi[F_2] \subseteq \mathsf{F}(F_1) \cap \mathsf{F}(F_2) = \mathsf{F}(F_1 \cap F_2) = \mathsf{F}(F). \tag{9.15}$$

So ϕ[*F*] ⊆ F(*F*) for each *F* ∈ F<sub>2</sub>(*X*). In a similar way, we find ϕ<sup>−1</sup>[*G*] ⊆ F<sup>−1</sup>(*G*) for each *G* ∈ F<sub>2</sub>(*Y*). So if we substitute *G* = F(*F*), we obtain ϕ<sup>−1</sup>[F(*F*)] ⊆ *F*. Since ϕ is a bijection, it follows that F(*F*) = ϕ[*F*] for each *F* ∈ F<sub>2</sub>(*X*). As a consequence, ϕ induces a one-one correspondence between closed subsets of *X* and closed subsets of *Y*. Hence ϕ is a homeomorphism. This proves Lemma 9.5. □

The special case of Theorem 9.4 where *A* and *B* are commutative now follows if we combine all steps so far:


Therefore, in the commutative case we apparently obtain rather more than a weak Jordan isomorphism J : *A*<sub>sa</sub> → *B*<sub>sa</sub>; we even found an isomorphism J : *A* → *B* of C\*-algebras. However, if *A* and *B* are commutative, the condition of linearity on each commutative C\*-subalgebra *C* of *A* includes *C* = *A*, so that (after complexification) weak Jordan isomorphisms are the same as isomorphisms of C\*-algebras.

We now turn to the general case, in which *A* and *B* are both noncommutative (the case where one, say *A*, is commutative but the other is not, cannot occur, since C (*A*) would be a complete lattice but C (*B*) would not). Let *D* and *E* be maximal abelian C\*-subalgebras of *A*, so that the corresponding elements of C (*A*) are maximal in the order-theoretic sense. Given an order isomorphism B : C (*A*) → C (*B*), we restrict the map B to the down-set ↓*D* = C (*D*) in C (*A*) so as to obtain an order homomorphism B|<sub>*D*</sub> : C (*D*) → C (*B*). The image of C (*D*) under B must have a maximal element (since B is an order isomorphism), and so there is a maximal commutative C\*-subalgebra *D̃* of *B* such that B|<sub>*D*</sub> : C (*D*) → C (*D̃*) is an order isomorphism. Applying the previous result, we obtain an isomorphism J<sub>*D*</sub> : *D* → *D̃* of commutative C\*-algebras that induces B|<sub>*D*</sub>. The same applies to *E*, so we also have an isomorphism J<sub>*E*</sub> : *E* → *Ẽ* of commutative C\*-algebras that induces B|<sub>*E*</sub>. Let *C* = *D* ∩ *E*, which lies in C (*A*). We now show that J<sub>*D*</sub> and J<sub>*E*</sub> coincide on *C*. There are three cases.


So assume dim(*C*) = 2. In that case, *C* = *C*<sup>∗</sup>(*e*) for some proper projection *e* ∈ P(*A*), which is equivalent to *C* being an atom in C (*A*). Recall that all our C\*-algebras are unital, and that by assumption C\*-subalgebras *C* share the unit of the ambient C\*-algebra *A*, hence *C*<sup>∗</sup>(*e*) contains the unit of *A*. Hence *C̃* ≡ B(*C*) = B|<sub>*D*</sub>(*C*) = B|<sub>*E*</sub>(*C*) is an atom in C (*B*), which implies that *C̃* = *C*<sup>∗</sup>(*ẽ*) for some projection *ẽ* ∈ P(*B*). If J<sub>*D*</sub>(*e*) = J<sub>*E*</sub>(*e*) we are done, so we must exclude the case J<sub>*D*</sub>(*e*) = *ẽ*, J<sub>*E*</sub>(*e*) = 1<sub>*B*</sub> − *ẽ*. This exclusion again requires a case distinction:

$$\dim(eAe) = \dim(e^\perp A e^\perp) = 1;\tag{9.16}$$

$$\dim(eAe) = 1, \dim(e^\perp Ae^\perp) > 1;\tag{9.17}$$

$$\dim(eAe) > 1, \dim(e^{\perp}Ae^{\perp}) > 1,\tag{9.18}$$

where *e*<sup>⊥</sup> = 1<sub>*A*</sub> − *e*. Each of these cases is nontrivial, and we need another lemma.

Lemma 9.6. *Let C* ∈ C (*A*) *be maximal (i.e., C* ⊂ *A is maximal abelian).*


*Proof.* For the first claim, dim(*eAe*) = 1 clearly implies dim(*eCe*) = 1. For the converse implication, assume *ad absurdum* that dim(*eAe*) > 1, so that there is an *a* ∈ *A* for which *eae* ≠ λ · *e* for any λ ∈ ℂ. If also dim(*eCe*) = 1, then any *c* ∈ *C* takes the form *c* = μ · *e* + *e*<sup>⊥</sup>*ce*<sup>⊥</sup> for some μ ∈ ℂ. Indeed, since *c*, *e*, *e*<sup>⊥</sup> commute within *C*,

$$c = ce + ce^{\perp} = ce^{2} + c(e^{\perp})^{2} = ece + e^{\perp}ce^{\perp} = \mu e + e^{\perp}ce^{\perp},\tag{9.19}$$

where the last equality follows since *ece* ∈ *eCe*, which is spanned by *e*. This implies that *eae* ∈ *C*′ (where *C*′ is the commutant of *C* within *A*), and since *C* is maximal abelian, we have *C*′ = *C*, whence *eae* ∈ *C*. Now *eae* = *e*(*eae*)*e*, hence *eae* ∈ *eCe*, whence *eae* = λ · *e* for some λ ∈ ℂ. Contradiction. According to Theorem C.169.1, the assumption dim(*C*) = 2 implies that *A* is finite-dimensional, upon which Theorem C.163 and (C.641) yield the second claim. □

Having proved Lemma 9.6, we move on to analyze the cases (9.16)–(9.18).


$$\dim((1\_B - \tilde{e})B(1\_B - \tilde{e})) > 1. \tag{9.20}$$

Applied to J<sub>*E*</sub> this gives J<sub>*E*</sub>(*e*) = *ẽ*, and hence J<sub>*D*</sub> and J<sub>*E*</sub> coincide on *C* = *C*<sup>∗</sup>(*e*).

• Eq. (9.18) implies that dim(*eDe*) > 1 as well as dim(*e*<sup>⊥</sup>*Ee*<sup>⊥</sup>) > 1 (apply Lemma 9.6.1 to *D* and *E*). Since dim(*eDe*) > 1, there is some *a* ∈ *D* such that *e* and *a*′ = *eae* ∈ *D* are linearly independent, and similarly there is some *b* ∈ *E* such that *b*′ = *e*<sup>⊥</sup>*be*<sup>⊥</sup> is linearly independent of *e*<sup>⊥</sup>. Then *a*′, *b*′, *e* commute (in fact, *a*′*b*′ = *b*′*a*′ = 0), so that we may form the abelian C\*-algebras *C*<sub>1</sub> = *C*<sup>∗</sup>(*e*, *a*′) ⊆ *D* and *C*<sub>2</sub> = *C*<sup>∗</sup>(*e*, *b*′) ⊆ *E*, which (also containing the unit 1<sub>*A*</sub>) both have dimension at least three. We also form *C*<sub>3</sub> = *C*<sup>∗</sup>(*e*, *a*′, *b*′), which contains *C*<sub>1</sub> and *C*<sub>2</sub> and hence is at least three-dimensional, too. Because *D* and *E* are maximal abelian, *C*<sub>3</sub> must lie in both *D* and *E*. Applying the abelian case of the theorem already proved to *D* and *E*, as before, but replacing *C* used so far by *C*<sub>3</sub>, we find that J<sub>*D*</sub> and J<sub>*E*</sub> coincide on *C*<sub>3</sub> (as its dimension is > 2). In particular, J<sub>*D*</sub>(*e*) = J<sub>*E*</sub>(*e*).

To finish the proof, we first note that Theorem 9.4 holds for *A* = *B* = ℂ by inspection, whereas the cases *A* ≅ *B* ≅ ℂ<sup>2</sup> or ≅ *M*<sub>2</sub>(ℂ) have already been discussed.

In all other cases we define J : *A*<sub>sa</sub> → *B*<sub>sa</sub> by putting J(*a*) = J<sub>*D*</sub>(*a*) for any maximal abelian unital C\*-subalgebra *D* containing *C* = *C*<sup>∗</sup>(*a*) and hence *a*; as we just saw, this is independent of the choice of *D*. Since each J<sub>*D*</sub> is an isomorphism of commutative C\*-algebras, J is a weak Jordan isomorphism. Finally, uniqueness of J (under the stated restriction on *A*) follows from Lemma 9.5. □

Theorem 9.4 raises the question whether we can strengthen weak Jordan isomorphisms to Jordan isomorphisms (i.e. invertible linear maps that preserve the Jordan product, cf. Appendix C.25). This hinges on the extendibility of weak Jordan isomorphisms to linear maps (which of course continue to preserve the Jordan product and hence are automatically Jordan isomorphisms). A general result in this direction is:

Theorem 9.7. *Let A and B be unital AW\*-algebras, where A contains no summand of type* I<sub>2</sub>*. Then there is a bijective correspondence between order isomorphisms* B : C (*A*) → C (*B*) *and Jordan isomorphisms* J : *A*<sub>sa</sub> → *B*<sub>sa</sub>*.*

This follows from Gleason's Theorem for AW\*-algebras, which we will neither state nor prove. If *A* = *B* = *B*(*H*), then the ordinary Gleason Theorem suffices to yield the crucial lemma for Wigner's Theorem for Bohr symmetries (i.e. Theorem 5.4.6):

Lemma 9.8. *Let H be a Hilbert space of dimension greater than two. Then any Bohr symmetry of* C (*B*(*H*)) *is induced by a Jordan symmetry of B*(*H*)<sub>sa</sub>*.*

*Proof.* This follows from Theorem 9.4 and Corollary 5.22, which for the case at hand turns weak Jordan isomorphisms into Jordan isomorphisms. □

We finally turn to symmetries of projection lattices. Theorem C.174 shows that for von Neumann algebras (and more generally for AW\*-algebras) *A* (without summand of type I<sub>2</sub>) and *B*, any isomorphism N : P(*A*) → P(*B*) of the corresponding orthocomplemented projection lattices (which automatically preserves arbitrary suprema) is the restriction of a unique Jordan isomorphism J : *A*<sub>sa</sub> → *B*<sub>sa</sub>.

This completes the argument to the effect that for many C\*-algebras of observables *A* (including *B*(*H*) for dim(*H*) > 1 as far as nos. 1–4 are concerned, and having dim(*H*) > 2 if we also include nos. 5–6) our six seemingly different notions of symmetry of a quantum system described by a C\*-algebra are equivalent. In particular, they are equivalent to Jordan isomorphisms, which are also the easiest ones to use, as they involve a readily identifiable part *A*<sub>sa</sub> of *A*, and (by complexification, as explained above) may even be defined on *A* itself (namely as those complex-linear isomorphisms that preserve the involution ∗ as well as the Jordan product ◦).

Putting *B* = *A* and assuming (without loss of generality) that *A* ⊆ *B*(*H*), Theorem C.175 then yields a separation of Jordan automorphisms into three disjoint classes:

Corollary 9.9. *If* J *is a Jordan symmetry of a unital C\*-algebra A* ⊆ *B*(*H*)*, then there are three mutually orthogonal projections* *e*<sub>1</sub>, *e*<sub>2</sub>, *e*<sub>3</sub> *in* *A*′ ∩ *A*″ *such that:*


*If in addition a* ↦ J(*a*)*e*<sub>1</sub> *is not an anti-homomorphism and a* ↦ J(*a*)*e*<sub>2</sub> *is not a homomorphism, then* *e*<sub>1</sub>, *e*<sub>2</sub>, *and* *e*<sub>3</sub> *are uniquely determined by these conditions.*

As we shall now see, if the symmetries form a (Lie) group, then this result often justifies restricting our attention simply to homomorphisms of C\*-algebras.

#### 9.2 Unitary implementability of symmetries

There are good reasons for the dichotomy (or even trichotomy) between homomorphisms and anti-homomorphisms of C\*-algebras left by Corollary 9.9, since in physics certain discrete symmetries of quantum theory indeed give rise to anti-homomorphisms: the best-known examples are time inversion *T* and charge conjugation *C* combined with space inversion (i.e. parity) *P*, giving *CP* (there are also other examples in condensed matter physics, like quantum spin flip). However, for the kind of problems mainly addressed in this book it is sufficient to restrict our attention to homomorphisms. One reason is that even if we use discrete symmetries (where the simplest non-trivial group ℤ<sub>2</sub> often suffices to make our point), the models we treat simply realize these symmetries as homomorphisms. Another reason is that if symmetries join to form a *connected* topological group *G* (typically a Lie group) and the maps *x* ↦ J<sub>*x*</sub> sending *x* ∈ *G* to some Jordan symmetry J<sub>*x*</sub> of the given C\*-algebra *A* of observables form a (strongly) continuous homomorphism (see below), then the identity *e* ∈ *G* must be mapped to the identity id<sub>*A*</sub>, which of course is a homomorphism of *A*. Continuity then implies that all J<sub>*x*</sub> must be homomorphisms.

In what follows we therefore assume that *G* is a (topological) group and that we are given a (continuous) homomorphism *x* ↦ α<sub>*x*</sub> from *G* into the group Aut(*A*) of all automorphisms of *A*; note that, given our restriction to homomorphisms, we switch notation from J to the customary symbol α. Continuity here always means *strong continuity*, in that for each *a* ∈ *A* the map *x* ↦ α<sub>*x*</sub>(*a*) from *G* to *A* is continuous (so that the map *G* × *A* → *A* given by (*x*, *a*) ↦ α<sub>*x*</sub>(*a*) is continuous, as usually required for group actions in a topological setting, cf. Proposition 5.35).

It follows from Theorem 5.4 (technically, from part 4 of that theorem, but "morally" from all of it, including the equivalences between all kinds of symmetries) that if *A* = *B*(*H*), then a homomorphism α : *G* → Aut(*B*(*H*)) is always implemented by a family *u*(*x*) of unitary operators on *H*, in that

$$
\alpha_x(a) = u(x) a u(x)^* \quad (x \in G). \tag{9.21}
$$

The group representation property α<sub>*x*</sub>α<sub>*y*</sub> = α<sub>*xy*</sub> does not enforce *u*(*x*)*u*(*y*) = *u*(*xy*): indeed, as we saw in detail in §5.10, one may have a projective unitary representation *x* ↦ *u*(*x*) of *G* on *H*. However, by Theorem 5.62 one may usually pass to a central extension *Ǧ* of *G* for which this problem does not arise (e.g., *Ǧ* = *SU*(2) for *G* = *SO*(3)). In Corollary 9.12 below (unbroken symmetry), even such a passage is not necessary.
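A small finite example may make the gap between α<sub>*x*</sub>α<sub>*y*</sub> = α<sub>*xy*</sub> and *u*(*x*)*u*(*y*) = *u*(*xy*) concrete. The sketch below is an illustration only, under assumptions not made in the text: *G* = ℤ<sub>2</sub> × ℤ<sub>2</sub> acts on 2 × 2 matrices, and the implementing unitaries are chosen (hypothetically) as the identity, two real Pauli matrices and their product. The automorphisms compose as a group action, but the unitaries multiply only up to a sign.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def negate(a):
    return [[-entry for entry in row] for row in a]

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SZ = [[1, 0], [0, -1]]

# Hypothetical choice of implementing unitaries u(x) for x in Z_2 x Z_2.
U = {(0, 0): I2, (1, 0): SX, (0, 1): SZ, (1, 1): matmul(SX, SZ)}

def add(x, y):
    """Group law of Z_2 x Z_2."""
    return ((x[0] + y[0]) % 2, (x[1] + y[1]) % 2)

def alpha(x, a):
    """The automorphism a -> u(x) a u(x)*, as in (9.21)."""
    return matmul(matmul(U[x], a), dagger(U[x]))

def sign(x, y):
    """The scalar c(x, y) in u(x)u(y) = c(x, y) u(xy); here it is +1 or -1."""
    prod = matmul(U[x], U[y])
    if prod == U[add(x, y)]:
        return 1
    if prod == negate(U[add(x, y)]):
        return -1
    raise ValueError("u(x)u(y) is not a multiple of u(xy)")

units = [[[1, 0], [0, 0]], [[0, 1], [0, 0]], [[0, 0], [1, 0]], [[0, 0], [0, 1]]]
G = list(U)

# The automorphisms compose exactly, checked on a basis of matrix units.
is_group_action = all(
    alpha(x, alpha(y, e)) == alpha(add(x, y), e) for x in G for y in G for e in units
)
print(is_group_action, sign((1, 0), (0, 1)), sign((0, 1), (1, 0)))
```

The sign that appears for some pairs is the kind of obstruction that disappears after passing to a suitable central extension of the group.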

For general C\*-algebras *A*, especially those modeling either classical systems (in which case *A* is commutative) or infinite quantum systems (where *A* is typically an infinite tensor product), one rarely has α(*a*) = *uau*<sup>∗</sup> for some *u* ∈ *A* even for single automorphisms α, let alone for a whole group of them. Instead, we settle for a weaker notion of unitary implementability, where the unitary *u* need not be in *A*.

Definition 9.10. *Let* π : *A* → *B*(*H*) *be a representation of A. An automorphism* α ∈ Aut(*A*) *is* implemented *in H if there exists a unitary operator u* : *H* → *H such that*

$$
\pi(\alpha(a)) = u \pi(a) u^* \quad (a \in A). \tag{9.22}
$$

The fundamental criterion for implementability uses the pullback α<sup>∗</sup> : *S*(*A*) → *S*(*A*) of α : *A* → *A* to the state space *S*(*A*), defined by α<sup>∗</sup>ω = ω ◦ α<sup>−1</sup>; cf. §C.25.

Theorem 9.11. *An automorphism* α : *A* → *A can be implemented in the* GNS*-representation* π<sub>ω</sub> *defined by a state* ω *on A iff* π<sub>α<sup>∗</sup>ω</sub> *and* π<sub>ω</sub> *are unitarily equivalent.*

*Proof.* Whether or not π<sub>α<sup>∗</sup>ω</sub> and π<sub>ω</sub> are unitarily equivalent, we may define

$$w: H_{\omega} \to H_{\alpha^* \omega};\tag{9.23}$$

$$
w \pi_{\omega}(a) \Omega_{\omega} = \pi_{\alpha^* \omega}(\alpha(a)) \Omega_{\alpha^* \omega}.\tag{9.24}
$$

This operator is well defined and unitary, and satisfies *w*Ω<sub>ω</sub> = Ω<sub>α<sup>∗</sup>ω</sub> as well as *w*π<sub>ω</sub>(*a*)*w*<sup>∗</sup> = π<sub>α<sup>∗</sup>ω</sub>(α(*a*)); these properties even characterize *w*. If π<sub>α<sup>∗</sup>ω</sub> ≅ π<sub>ω</sub>, there exists a unitary *v* : *H*<sub>ω</sub> → *H*<sub>α<sup>∗</sup>ω</sub> satisfying *v*π<sub>ω</sub>(*a*)*v*<sup>∗</sup> = π<sub>α<sup>∗</sup>ω</sub>(*a*), *a* ∈ *A*. Then *u* = *v*<sup>∗</sup>*w* satisfies (9.22) for π = π<sub>ω</sub>. The converse is similar. □

An important special case arises if ω is invariant under α.

Corollary 9.12. *If* α<sup>∗</sup>ω = ω *(that is,* ω(α(*a*)) = ω(*a*) *for all a* ∈ *A), then* α *is implemented by a unitary operator* *u*<sub>ω</sub> : *H*<sub>ω</sub> → *H*<sub>ω</sub> *satisfying* *u*<sub>ω</sub>Ω<sub>ω</sub> = Ω<sub>ω</sub>*. In particular, given a continuous homomorphism* α : *G* → Aut(*A*) *such that* α<sub>*x*</sub><sup>∗</sup>ω = ω *for each x* ∈ *G, one has a family of unitaries* *u*<sub>ω</sub>(*x*) : *H*<sub>ω</sub> → *H*<sub>ω</sub> *that for all x* ∈ *G satisfy*

$$
u_{\omega}(x)\Omega_{\omega} = \Omega_{\omega};\tag{9.25}
$$

$$
\pi_{\omega}(\alpha_x(a)) = u_{\omega}(x)\pi_{\omega}(a)u_{\omega}(x)^*,\tag{9.26}
$$

*and form a continuous unitary representation of G on* *H*<sub>ω</sub>*.*

*Proof.* One easily checks that the following operators do the job:

$$
u_{\omega}(x)\pi_{\omega}(a)\Omega_{\omega} = \pi_{\omega}(\alpha_x(a))\Omega_{\omega}.\qquad\square
$$
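The formula in this proof can be tested in the smallest noncommutative case. The following sketch rests on assumptions not taken from the text: *A* is the algebra of 2 × 2 matrices, ω is the normalized trace, and α is the inner automorphism given by a hypothetical unitary `V`. For this faithful state the GNS space may be identified with *A* itself, with inner product ω(*a*<sup>∗</sup>*b*), left multiplication as the representation, and the unit matrix as cyclic vector, so that the operator of the proof becomes *u*(*a*) = α(*a*).

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

I2 = [[1, 0], [0, 1]]
V = [[0, -1], [1, 0]]  # hypothetical unitary defining the automorphism

def alpha(a):
    """Inner automorphism a -> V a V*."""
    return matmul(matmul(V, a), dagger(V))

def omega(a):
    """Normalized trace, which is invariant under alpha."""
    return (a[0][0] + a[1][1]) / 2

def inner(a, b):
    """GNS inner product <pi(a)Omega, pi(b)Omega> = omega(a* b)."""
    return omega(matmul(dagger(a), b))

def pi(a, vector):
    """GNS representation: left multiplication on vectors (matrices here)."""
    return matmul(a, vector)

def u(vector):
    """u pi(a) Omega = pi(alpha(a)) Omega with Omega = I2, i.e. u(a) = alpha(a)."""
    return alpha(vector)

def u_adjoint(vector):
    """Inverse (and adjoint) of u: a -> V* a V."""
    return matmul(matmul(dagger(V), vector), V)

units = [[[1, 0], [0, 0]], [[0, 1], [0, 0]], [[0, 0], [1, 0]], [[0, 0], [0, 1]]]

state_invariant = all(omega(alpha(e)) == omega(e) for e in units)
fixes_cyclic_vector = u(I2) == I2                        # analogue of (9.25)
is_isometric = all(inner(u(a), u(b)) == inner(a, b) for a in units for b in units)
implements_alpha = all(                                  # analogue of (9.26)
    u(pi(a, u_adjoint(b))) == pi(alpha(a), b) for a in units for b in units
)
print(state_invariant, fixes_cyclic_vector, is_isometric, implements_alpha)
```

Checking the identities on matrix units suffices here because every expression involved is linear or sesquilinear in its arguments.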

Given some α ∈ Aut(*A*), a weak form of *spontaneous symmetry breaking* (SSB) is that some state ω (it is always a *state* that breaks a symmetry) satisfies α<sup>∗</sup>ω ≠ ω; a stronger one states that the two equivalent conditions in Theorem 9.11 are violated, i.e., that α cannot be implemented in the GNS-representation π<sub>ω</sub>(*A*) (cf. Definition 9.10). In order to be physically relevant, the weaker notion has to be supplemented with additional structure, which also guarantees that generically the weak form implies the strong one. Part of this structure involves the identification of suitable classes of states within which we define SSB; these classes are predicated on a time-evolution on *A*. We also need a symmetry *group* instead of a single automorphism α (which implicitly uses the group ℤ<sub>*p*</sub> = ℤ/*p* · ℤ, where *p* is the smallest positive integer such that α<sup>*p*</sup> = id<sub>*A*</sub>; if no such *p* exists the group is just ℤ). Thus we need:


$$
\alpha_t \gamma_g = \gamma_g \alpha_t \quad (t \in \mathbb{R}, g \in G). \tag{9.27}
$$

#### 9.3 Motion in space and in time

The C\*-algebras *A* we are going to use are the *quasi-local* ones introduced in §8.5 for quantum spin systems; especially recall (8.130). Also, the C\*-algebra *A* = *B*<sup>∞</sup> in §8.2 is a case in point, but this would require some changes in what follows. The last expression in (8.130) is convenient for introducing *spatial translation symmetry*

$$
\pi: \mathbb{Z}^d \to \text{Aut}(A) \tag{9.28}
$$

of <sup>Z</sup>*d*, as follows: for *<sup>x</sup>* <sup>∈</sup> <sup>Z</sup>*d*, define <sup>τ</sup>*<sup>x</sup>* : *<sup>A</sup>*<sup>Λ</sup> <sup>→</sup> *Ax*+<sup>Λ</sup> initially by

$$
\pi\_{\mathbf{x}}(b(\mathbf{y})) = b(\mathbf{x} + \mathbf{y}),
\tag{9.29}
$$

where, for given *b* ∈ *B*(*H*) and *y* ∈ Λ, the operator *b*(*y*) ∈ *A*<sup>Λ</sup> is the element ⊗*z*∈<sup>Λ</sup> *az* with *ay* = *b* and *az* = 1*<sup>H</sup>* whenever *z* = *y*. Since arbitrary elements of *A*<sup>Λ</sup> are (normlimits of) finite linear combinations of products of such operators *b*(*y*), the automorphic (and hence isometric) property of τ*<sup>x</sup>* defines its action on all of *A*<sup>Λ</sup> (if necessary by continuous extension). Note that for *a* ∈ *A*<sup>Λ</sup> the operator τ*x*(*a*) thus defined is independent of the (typically non-unique) realization of *a* in terms of the *b*(*y*), because τ*<sup>x</sup>* is an isometry. The group homomorphism property of the map (9.28) thus constructed is guaranteed by (9.29), whilst continuity is no issue since Z*<sup>d</sup>* is discrete.

Since *A*<sub>Λ</sub> = ⊗<sub>*y*∈Λ</sub> *A*<sub>*y*</sub> with *A*<sub>*y*</sub> = *B*(*H*), an equivalent way to define τ<sub>*x*</sub> is to use identifications id<sub>*yz*</sub> : *A*<sub>*y*</sub> → *A*<sub>*z*</sub> (since *A*<sub>*y*</sub> = *A*<sub>*z*</sub> = *B*(*H*)), which, taking tensor products, yield isomorphisms id<sub>Λ,Λ′</sub> : *A*<sub>Λ</sub> → *A*<sub>Λ′</sub> whenever some bijection Λ ≅ Λ′ is given. In terms of those, we simply have (τ<sub>*x*</sub>)|<sub>*A*<sub>Λ</sub></sub> = id<sub>Λ,*x*+Λ</sub>. Either way, the maps (τ<sub>*x*</sub>)|<sub>*A*<sub>Λ</sub></sub> extend to τ<sub>*x*</sub> : *A* → *A* by continuity. The following property then holds:

Proposition 9.13. *An automorphic action* τ *of* Z<sup>*d*</sup> *on a quasi-local C\*-algebra A is* asymptotically abelian *in the sense that* lim<sub>*x*→∞</sub> ‖[*a*, τ<sub>*x*</sub>(*b*)]‖ = 0 *for all a*, *b* ∈ *A.*

Here *x* → ∞ means that any sequence (*x*<sub>*n*</sub>) with |*x*<sub>*n*</sub>| → ∞ with respect to the Euclidean norm on Z<sup>*d*</sup> has a subsequence (*x*′<sub>*n*</sub>) for which the stated result holds.

*Proof.* For *a* and *b* local, i.e., *a* ∈ *A*<sub>Λ<sup>(1)</sup></sub> and *b* ∈ *A*<sub>Λ<sup>(2)</sup></sub>, this follows from Einstein locality, since τ<sub>*x*</sub>(*b*) ∈ *A*<sub>*x*+Λ<sup>(2)</sup></sub> and *x* + Λ<sup>(2)</sup> is disjoint from Λ<sup>(1)</sup> once |*x*| is large enough. The general case follows by approximating *a* and *b* by local elements. □

Thus quasi-local C\*-algebras *A* satisfy the assumptions in the following theorem, which will be important in linking the various notions of SSB discussed earlier.

Theorem 9.14. *Let A be a C\*-algebra equipped with an asymptotically abelian action* τ *of* Z<sup>*d*</sup>*, and let* ω *be a translation-invariant primary state on A (i.e.,* τ<sup>∗</sup><sub>*x*</sub>ω = ω *for all x* ∈ Z<sup>*d*</sup>*). Then* Ω<sub>ω</sub> *is the only translation-invariant vector in H*<sub>ω</sub>*. Moreover,*

$$\lim_{x \to \infty} \omega(a\tau_x(b)) = \omega(a)\omega(b) \quad (a, b \in A);\tag{9.30}$$

$$\lim_{x \to \infty} \pi_\omega(\tau_x(b)) = \omega(b) \cdot 1_{H_\omega} \quad (b \in A);\tag{9.31}$$

$$\lim_{\Lambda \uparrow \mathbb{Z}^d} |\Lambda|^{-1} \sum_{x \in \Lambda} \pi_\omega(\tau_x(b)) = \omega(b) \cdot 1_{H_\omega} \quad (b \in A). \tag{9.32}$$

Here (9.31) and (9.32) hold in the weak operator topology on *B*(*H*<sub>ω</sub>), and the limit Λ ↑ Z<sup>*d*</sup> in (9.32) is taken along the hypercubes Λ<sub>*N*</sub> in (8.153) as *N* → ∞.

*Proof.* If ω is primary, Theorem 8.23 (or its proof) yields

$$\lim_{x \to \infty} |\omega(a\tau_x(b)) - \omega(a)\omega(\tau_x(b))| = 0. \tag{9.33}$$

Translation-invariance of ω then yields (9.30), which also is a lemma for (9.31) - (9.32). Towards (9.31) we compute ω(*a*τ*x*(*b*)) in terms of the projection

$$e_0 = \lim_{\Lambda \uparrow \mathbb{Z}^d} |\Lambda|^{-1} \sum_{x \in \Lambda} u(x) \tag{9.34}$$

onto the translation-invariant subspace of *H*<sub>ω</sub>, where *u* is the unitary representation of Z<sup>*d*</sup> on *H*<sub>ω</sub> from Corollary 9.12 (with *G* = Z<sup>*d*</sup>), and the limit is taken in the strong operator topology. Eq. (9.34) is a special case of von Neumann's *L*<sup>2</sup> ergodic theorem (which generalizes the Peter–Weyl–Schur relation *e*<sub>0</sub> = ∫<sub>*G*</sub> *dx* *u*(*x*) for compact groups *G* to amenable groups like Z<sup>*d*</sup> or R<sup>*d*</sup>). Since *e*<sub>0</sub>Ω<sub>ω</sub> = Ω<sub>ω</sub>, we have

$$\omega(a\tau_x(b)) = \langle \Omega_\omega, \pi_\omega(a)\pi_\omega(\tau_x(b))\Omega_\omega\rangle \tag{9.35}$$

$$= \langle \Omega_\omega, \pi_\omega(a)([\pi_\omega(\tau_x(b)), e_0] + e_0\pi_\omega(b))\Omega_\omega\rangle. \tag{9.36}$$

We now let *x* → ∞. The commutator then vanishes, because the weak limit of π<sub>ω</sub>(τ<sub>*x*</sub>(*b*)) lies in the center of π<sub>ω</sub>(*A*)″, which is trivial since ω is primary. The remaining term matches with (9.30) iff *e*<sub>0</sub> is one-dimensional, so that Ω<sub>ω</sub> is the only translation-invariant vector in *H*<sub>ω</sub>, and *e*<sub>0</sub> = |Ω<sub>ω</sub>⟩⟨Ω<sub>ω</sub>|. A similar trick then yields

$$
\pi_\omega(\tau_x(b))\pi_\omega(a)\Omega_\omega = \bigl([\pi_\omega(\tau_x(b)), \pi_\omega(a)] + \pi_\omega(a)([\pi_\omega(\tau_x(b)), e_0] + \omega(b))\bigr)\Omega_\omega.
$$

Both commutators vanish (weakly) as *x* → ∞, proving (9.31). Similarly, write

$$
\pi_\omega(\tau_x(b))\pi_\omega(a)\Omega_\omega = \left( [\pi_\omega(\tau_x(b)), \pi_\omega(a)] + \pi_\omega(a)u(x)\pi_\omega(b) \right) \Omega_\omega,\tag{9.37}
$$

and use (9.34) and the previous formula for *e*<sub>0</sub> to prove (9.32). □
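The averaging formula (9.34) can be illustrated in a finite toy setting. The sketch below is an assumption-laden analogue, not the construction in the text: it replaces Z<sup>*d*</sup> by the finite cyclic group Z<sub>*N*</sub> acting on C<sup>*N*</sup> by cyclic shifts (the compact-group case of the Peter–Weyl–Schur relation), and checks that the group average is a rank-one projection fixed by every shift. The names `shift_matrix` and `group_average` are hypothetical helpers.

```python
from fractions import Fraction

def shift_matrix(n, x):
    """Matrix of the cyclic shift u(x) on C^n: (u(x)v)[i] = v[(i - x) % n]."""
    return [[Fraction(1) if (i - x) % n == j else Fraction(0) for j in range(n)]
            for i in range(n)]

def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def group_average(n):
    """e0 = (1/n) * (sum of u(x) over x in Z_n), the finite analogue of (9.34)."""
    total = [[Fraction(0)] * n for _ in range(n)]
    for x in range(n):
        u = shift_matrix(n, x)
        total = [[total[i][j] + u[i][j] for j in range(n)] for i in range(n)]
    return [[entry / n for entry in row] for row in total]

N = 5
e0 = group_average(N)

# e0 is idempotent, is fixed by every shift, and has trace 1 (so it has rank one)
is_projection = mat_mul(e0, e0) == e0
is_invariant = all(mat_mul(shift_matrix(N, x), e0) == e0 for x in range(N))
trace = sum(e0[i][i] for i in range(N))
```

In this toy model the invariant subspace consists of the constant vectors, which plays the role of the line spanned by Ω<sub>ω</sub> in Theorem 9.14.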

In the C\*-algebraic formalism, *dynamics* is described by a continuous homomorphism α : R → Aut(*A*), *t* ↦ α<sub>*t*</sub>. For *A* = *B*(*H*), where *H* is some Hilbert space (not to be confused with our earlier *H* in the quasi-local setting), Theorem 5.4 yields

$$\alpha_t(a) = u_t a u_t^* \tag{9.38}$$

for some family of unitaries *u*<sub>*t*</sub> ≡ *u*(*t*), *t* ∈ R. Eq. (5.268) and Proposition 5.53 then imply that the family *u*<sub>*t*</sub> may be redefined so as to make the map *t* ↦ *u*<sub>*t*</sub> a continuous unitary representation of R on *H*. Stone's Theorem 5.73 finally gives the familiar expression for time evolution in the so-called Heisenberg picture in terms of the *Hamiltonian h*, which is a (possibly unbounded) self-adjoint operator on *H*, i.e.,

$$\alpha_t(a) = e^{ith} a e^{-ith}. \tag{9.39}$$

For arbitrary (unital) C\*-algebras *A* one has no counterpart of Theorem 5.4, and one cannot rely on Theorem 9.11 either because there are no preferred states to begin with; such states typically require a time-evolution for their definition (see below). For quantum spin systems (still with *H* = C<sup>*n*</sup> and hence *B*(*H*) ≅ *M*<sub>*n*</sub>(C)), one tries to construct the map *t* ↦ α<sub>*t*</sub> from local approximations: with *A*<sub>Λ</sub> given by (8.129) with (8.128), we pick local Hamiltonians *h*<sub>Λ</sub> ∈ *B*(*H*<sub>Λ</sub>) and define maps α<sup>Λ</sup> : R → Aut(*A*<sub>Λ</sub>) by

$$\alpha_t^{\Lambda}(a) = e^{ith_{\Lambda}} a e^{-ith_{\Lambda}},\tag{9.40}$$

where *a* ∈ *A*<sub>Λ</sub>. Letting Λ ↑ Z<sup>*d*</sup>, we would then like to assemble the family α<sup>Λ</sup> into a single automorphism group α : R → Aut(*A*), which describes the dynamics of the corresponding infinite quantum system. Towards this aim, we start from a *potential* (also called an *interaction*) Φ(*X*) ∈ *B*(*H*<sub>*X*</sub>), which is defined for any finite sublattice *X* of Z<sup>*d*</sup>, in terms of which the local Hamiltonians *h*<sub>Λ</sub> take the form

$$h\_{\Lambda} = \sum\_{X \subseteq \Lambda} \Phi(X),\tag{9.41}$$

where the sum is over all sublattices *X* of Λ. For *nearest-neighbour interactions*, Φ(*X*) is nonzero iff *X* = {*x*, *y*} is a pair of neighbours, and in the presence of an external magnetic field one also has terms proportional to Φ({*x*}). For example, the *quantum Ising model* is defined by *H* = C<sup>2</sup> and Φ({*x*, *y*}) = −*J*σ<sub>3</sub>(*x*)σ<sub>3</sub>(*y*) for nearest neighbours and Φ({*x*}) = −*B*σ<sub>1</sub>(*x*) for all *x*, where *J* > 0 and *B* ∈ R. The local Hamiltonians are therefore given by

$$h\_{\Lambda} = -J \sum\_{\langle \mathbf{xy} \rangle \in \Lambda} \sigma\_3(\mathbf{x}) \sigma\_3(\mathbf{y}) - B \sum\_{\mathbf{x} \in \Lambda} \sigma\_1(\mathbf{x}),\tag{9.42}$$

where the sum over ⟨*xy*⟩ ∈ Λ denotes summing over nearest neighbours in Λ. The expression (9.42) implicitly has so-called *free boundary conditions*, in that only neighbours inside Λ take part in *h*<sub>Λ</sub>. Alternatively, one could use *periodic boundary conditions*, which in *d* = 1 define the *quantum Ising chain*

$$h_N = -J\left(\sum_{x=1}^{N-1} \sigma_3(x)\sigma_3(x+1) + \sigma_3(N)\sigma_3(1)\right) - B\sum_{x=1}^{N} \sigma_1(x).\tag{9.43}$$

In (9.42) - (9.43) the operators σ<sub>*i*</sub>(*x*) in *A*<sub>Λ</sub> are defined as explained after (9.29). We are going to study the quantum Ising chain in detail in connection with SSB; for the moment, we just mention another popular spin model, namely the *Heisenberg model* for magnetism. This also has *H* = C<sup>2</sup>, but the local Hamiltonians are

$$h_{\Lambda} = J \sum_{\langle xy \rangle \in \Lambda} \sum_{i=1}^{3} \sigma_i(x) \sigma_i(y),\tag{9.44}$$

with free boundary conditions, where *J* < 0 ( *J* > 0) yields (anti) ferromagnetism.
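To make the construction of local Hamiltonians from site operators *b*(*y*) concrete, here is a minimal sketch that builds the periodic quantum Ising chain (9.43) as an explicit matrix for a small chain. It uses plain nested lists rather than a linear-algebra library; the helper names (`kron`, `site_operator`, `ising_chain`) and the parameter values *N* = 3, *J* = 1, *B* = 2 are illustrative assumptions. The checks confirm that operators at different sites commute and that the global spin flip ⊗<sub>*x*</sub>σ<sub>1</sub>(*x*) commutes with the Hamiltonian.

```python
def kron(a, b):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_scale(c, a):
    return [[c * x for x in row] for row in a]

I2 = [[1, 0], [0, 1]]
SIGMA1 = [[0, 1], [1, 0]]
SIGMA3 = [[1, 0], [0, -1]]

def site_operator(b, y, n):
    """b(y): the operator b at site y (1-based) and the identity at all other sites."""
    result = [[1]]
    for z in range(1, n + 1):
        result = kron(result, b if z == y else I2)
    return result

def ising_chain(n, J, B):
    """h_N of (9.43) with periodic boundary conditions, as a 2^n x 2^n matrix."""
    dim = 2 ** n
    h = [[0] * dim for _ in range(dim)]
    for x in range(1, n + 1):
        nxt = x % n + 1  # site N couples back to site 1
        bond = mat_mul(site_operator(SIGMA3, x, n), site_operator(SIGMA3, nxt, n))
        h = mat_add(h, mat_scale(-J, bond))
        h = mat_add(h, mat_scale(-B, site_operator(SIGMA1, x, n)))
    return h

N = 3
h = ising_chain(N, J=1, B=2)

# Global spin flip: sigma_1 at every site
flip = [[1]]
for _ in range(N):
    flip = kron(flip, SIGMA1)

is_symmetric = all(h[i][j] == h[j][i] for i in range(2 ** N) for j in range(2 ** N))
commutes_with_flip = mat_mul(flip, h) == mat_mul(h, flip)
s3_at_1, s1_at_2 = site_operator(SIGMA3, 1, N), site_operator(SIGMA1, 2, N)
sites_commute = mat_mul(s3_at_1, s1_at_2) == mat_mul(s1_at_2, s3_at_1)
```

The matrix size grows as 2<sup>*N*</sup>, so this brute-force representation is only useful for very short chains.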

Although we do not have (9.38) for any *ut* ∈ *A*, we may construct α*<sup>t</sup>* as follows.

Theorem 9.15. *Let* Φ *be a* short-range potential *in that there is r* ∈ N *such that* Φ(*X*) ≠ 0 *only if* |*x* − *y*| ≤ *r for all x*, *y* ∈ *X, and define local Hamiltonians h*<sub>Λ</sub> *by* (9.41)*. For fixed finite* Λ ⊂ Z<sup>*d*</sup> *and a* ∈ *A*<sub>Λ</sub>*, the following (norm) limit exists and defines an automorphism* α<sub>*t*</sub> *of* ∪<sub>Λ⊂Z<sup>*d*</sup></sub> *A*<sub>Λ</sub> *and hence by continuity also of A:*

$$\alpha_t(a) = \lim_{N \to \infty} e^{ith_{\Lambda_N}} a e^{-ith_{\Lambda_N}}.\tag{9.45}$$

*Proof.* Note that for large enough *N*, the hypercube Λ<sub>*N*</sub> contains any Λ ∈ P<sub>*f*</sub>(Z<sup>*d*</sup>). Take *a* ∈ *A*<sub>Λ</sub>, take Λ<sub>*N*<sub>2</sub></sub> ⊃ Λ<sub>*N*<sub>1</sub></sub> ⊃ Λ, and use (9.40) and (9.41) to compute

$$\begin{aligned}
\left\| \alpha_t^{(\Lambda_{N_2})}(a) - \alpha_t^{(\Lambda_{N_1})}(a) \right\|
&= \left\| \int_0^t ds\, \frac{d}{ds}\left( \alpha_s^{(\Lambda_{N_2})} \circ \alpha_{t-s}^{(\Lambda_{N_1})}(a) \right) \right\| \\
&= \left\| \int_0^t ds\, \left( [h_{\Lambda_{N_2}}, \alpha_s^{(\Lambda_{N_2})} \circ \alpha_{t-s}^{(\Lambda_{N_1})}(a)] - \alpha_s^{(\Lambda_{N_2})}([h_{\Lambda_{N_1}}, \alpha_{t-s}^{(\Lambda_{N_1})}(a)]) \right) \right\| \\
&= \left\| \int_0^t ds\, \alpha_s^{(\Lambda_{N_2})}([h_{\Lambda_{N_2}} - h_{\Lambda_{N_1}}, \alpha_{t-s}^{(\Lambda_{N_1})}(a)]) \right\| \\
&\le \int_0^t ds\, \left\| \alpha_s^{(\Lambda_{N_2})}([h_{\Lambda_{N_2}} - h_{\Lambda_{N_1}}, \alpha_{t-s}^{(\Lambda_{N_1})}(a)]) \right\| \\
&\le \int_0^t ds\, \left\| [h_{\Lambda_{N_2}} - h_{\Lambda_{N_1}}, \alpha_{t-s}^{(\Lambda_{N_1})}(a)] \right\| \\
&= \int_0^t ds\, \Bigl\| \sum_{x \in \Lambda_{N_2} \backslash \Lambda_{N_1}} \sum_{X \ni x} [\Phi(X), \alpha_{t-s}^{(\Lambda_{N_1})}(a)] \Bigr\| \\
&\le \sum_{x \in \Lambda_{N_2} \backslash \Lambda_{N_1}} \sum_{X \ni x} \int_0^t ds\, \left\| [\Phi(X), \alpha_{t-s}^{(\Lambda_{N_1})}(a)] \right\|.
\end{aligned}\tag{9.46}$$

We now show that the left-hand side of the first line is a Cauchy sequence. Since

$$\alpha\_{t-s}^{(\Lambda\_{N\_1})}(a) = e^{i(t-s)\sum\_{Y \subseteq \Lambda\_{N\_1}} \Phi(Y)} a e^{-i(t-s)\sum\_{Y \subseteq \Lambda\_{N\_1}} \Phi(Y)} \in \mathcal{B}(H\_{\Lambda\_{N\_1}}),\tag{9.47}$$

which is finite-dimensional (as Λ*N*<sup>1</sup> is finite), we have a norm-convergent expansion

$$\alpha_t^{(\Lambda_{N_1})}(a) = a + it \sum_{Y_1 \subseteq \Lambda_{N_1}} [\Phi(Y_1), a] + \frac{(it)^2}{2!} \sum_{Y_1, Y_2 \subseteq \Lambda_{N_1}} [\Phi(Y_2), [\Phi(Y_1), a]] + \cdots \tag{9.48}$$

Let Λ<sup>(*r*)</sup> consist of all *y* ∈ Z<sup>*d*</sup> for which there is some *x* ∈ Λ for which |*x* − *y*| ≤ *r*. Then the zeroth term *a* in (9.48) is in *A*<sub>Λ</sub>, the first is in *A*<sub>Λ<sup>(*r*)</sup></sub>, . . . , the *n*'th is in *A*<sub>Λ<sup>(*nr*)</sup></sub>. Therefore, we can find *n* = *n*(*N*<sub>1</sub>,*N*<sub>2</sub>,3) such that the only terms in (9.48) that contribute to the commutator in (9.46) are the *n*'th and beyond. Taking Λ<sub>*N*<sub>1</sub></sub> and Λ<sub>*N*<sub>2</sub></sub> large enough, this tail can be made arbitrarily small, so that (α<sub>*t*</sub><sup>(Λ<sub>*N*</sub>)</sup>(*a*))<sub>*N*</sub> is a Cauchy sequence in *A*. This gives convergence of (9.45) for *a* ∈ *A*<sub>Λ</sub>, where Λ is arbitrary (but finite), yielding an automorphism α<sub>*t*</sub> of ∪<sub>Λ</sub> *A*<sub>Λ</sub>. Being an automorphism, α<sub>*t*</sub> is isometric, so that it extends to *A* by continuity. □

#### 9.4 Ground states of quantum systems

A ground state of a finite system *A*<sup>Λ</sup> = *B*(*H*<sup>Λ</sup> ) is an eigenstate of the local Hamiltonian *h*<sup>Λ</sup> with the lowest eigenvalue; because dim(*H*<sup>Λ</sup> ) < ∞, the spectrum of *h*<sup>Λ</sup> is discrete and hence local ground states exist. For infinite systems, no Hamiltonian is yet defined, so we need to define ground states in terms of the dynamics α*t*.

Definition 9.16. *Let A be a C\*-algebra with time evolution, i.e., a continuous homomorphism* α : R → Aut(*A*) *(which gives the dynamics of the underlying physical system). A* ground state *of* (*A*,α) *is a state* ω *on A such that:*

*1.* ω *is time-independent, i.e.,* α<sup>∗</sup><sub>*t*</sub>ω = ω *(or* ω(α<sub>*t*</sub>(*a*)) = ω(*a*) *for all a* ∈ *A) for all t* ∈ R*;*

*2. The generator h*<sub>ω</sub> *of the ensuing continuous unitary representation*

$$t \mapsto u_t = e^{ith_\omega} \tag{9.49}$$

*of* R *on H*<sub>ω</sub> *has positive spectrum, i.e.,* σ(*h*<sub>ω</sub>) ⊆ R<sub>+</sub>*, or, equivalently,*

$$\langle \Psi, h_\omega\Psi \rangle \ge 0 \quad (\Psi \in D(h_\omega)).\tag{9.50}$$

Note that the existence of the operator *h*<sup>ω</sup> is guaranteed by Corollary 9.12 and the arguments after (9.38). Since Corollary 9.12 yields

$$h_\omega \Omega_\omega = 0;\tag{9.51}$$

$$
\pi_\omega(\alpha_t(a)) = e^{ith_\omega} \pi_\omega(a) e^{-ith_\omega},\tag{9.52}
$$

it follows that *h*<sub>ω</sub> is a Hamiltonian in the usual sense, implementing the Heisenberg-picture time evolution (albeit in the representation π<sub>ω</sub>(*A*) rather than in *A* itself). Moreover, in view of (9.51) and the assumed positivity of σ(*h*<sub>ω</sub>), the unit vector Ω<sub>ω</sub> of the GNS-representation π<sub>ω</sub> induced by a ground state ω is a ground state for the Hamiltonian *h*<sub>ω</sub> in the usual sense. If ω is pure (see below for a discussion of this desirable possibility), then obviously exp(*ith*<sub>ω</sub>) ∈ π<sub>ω</sub>(*A*)″, since the latter equals *B*(*H*<sub>ω</sub>). A deep result states that this is always the case (*Borchers Theorem*):

Theorem 9.17. *If* ω *is a ground state on A, then* exp(*ith*<sub>ω</sub>) ∈ π<sub>ω</sub>(*A*)″ *for all t* ∈ R*.*

As we shall see, this contrasts with equilibrium states. The Heisenberg equation of motion for operators *a*(*t*) has a counterpart in the C\*-algebraic formalism, which requires a concept already encountered in §3.1, but repeated here for convenience:

Definition 9.18. *A* derivation *on a C\*-algebra A is a linear map* δ : *A* → *A with*

$$\delta(ab) = \delta(a)b + a\delta(b), \ (a, b \in A) \text{ (Leibniz rule)}.\tag{9.53}$$

*An* unbounded derivation *is a linear map* δ : Dom(δ) → *A, where the domain* Dom(δ) ⊂ *A of* δ *is a dense linear subspace of A, that satisfies the Leibniz rule.*

*An (unbounded) derivation* δ *is* symmetric *when* δ(*a*∗) = δ(*a*)∗ *for all a (in* Dom(δ)*, which must be self-adjoint in that a* ∈ Dom(δ) *iff a*<sup>∗</sup> ∈ Dom(δ)*).*

Bounded derivations are rare in classical physics; nonzero derivations of *A* =*C*0(R*d*) do not even exist, but it has plenty of *unbounded* derivations, viz. δ(*f*) = ξ *f* for some vector field ξ on R*d*. In quantum mechanics, *A* = *B*(*H* ) does have derivations, all given by δ(*a*) = *i*[*h*,*a*] for some bounded (self-adjoint) operator *h* on *H* .
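As a quick check of the last claim in one direction (a routine computation added for illustration), the map δ(*a*) = *i*[*h*,*a*] with *h* bounded and self-adjoint satisfies the Leibniz rule (9.53) and is symmetric:

$$
\begin{aligned}
\delta(ab) &= i(hab - abh) = i(ha - ah)b + ia(hb - bh) = \delta(a)b + a\delta(b),\\
\delta(a)^* &= \bigl(i(ha - ah)\bigr)^* = -i(a^*h - ha^*) = i[h, a^*] = \delta(a^*).
\end{aligned}
$$

The second line uses *h*<sup>∗</sup> = *h*; without self-adjointness the Leibniz rule still holds but symmetry generally fails.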

Proposition 9.19. *Any continuous homomorphism* α : R → Aut(*A*) *on any C\*-algebra A defines an unbounded symmetric derivation* δ *on A by the norm limit*

$$\delta(a) = \frac{d}{dt} \alpha_t(a)_{|t=0} \equiv \lim_{t \to 0} \frac{\alpha_t(a) - a}{t},\tag{9.54}$$

*where* Dom(δ) *consists of all a* ∈ *A for which this limit exists. Moreover, this domain is stable under* α*<sup>t</sup> in that if a* ∈ Dom(δ)*, then* α*t*(*a*) ∈ Dom(δ) *(t* ∈ R*).*

The proof is an elementary verification (cf. Theorem 5.73). On *H*<sup>ω</sup> we then have

$$
\pi_\omega(\delta(a)) = i[h_\omega, \pi_\omega(a)], \tag{9.55}
$$

which, then, is "Heisenberg's equation of motion revisited." One may also reformulate Definition 9.16 in terms of the derivation δ associated to α by (9.54):

Proposition 9.20. *A state* ω ∈ *S*(*A*) *is a ground state for given dynamics* α *iff*

$$-i\omega(a^*\delta(a)) \ge 0 \quad (a \in \text{Dom}(\delta)).\tag{9.56}$$

*Proof.* If ω is a ground state according to Definition 9.16, we may use (9.55), (C.196), (9.51), and finally (9.50) to compute

$$-i\omega(a^*\delta(a)) = -i\langle \Omega_\omega, \pi_\omega(a^*\delta(a))\Omega_\omega\rangle = \langle \Omega_\omega, \pi_\omega(a)^*[h_\omega, \pi_\omega(a)]\Omega_\omega\rangle$$

$$= \langle \pi_\omega(a)\Omega_\omega, h_\omega\pi_\omega(a)\Omega_\omega\rangle \ge 0. \tag{9.57}$$

Conversely, we first show that if ω satisfies (9.56), then it is α<sub>*t*</sub>-invariant. We initially assume *a* = *a*<sup>∗</sup>, so that δ(*a*)<sup>∗</sup> = δ(*a*<sup>∗</sup>) = δ(*a*), as δ is symmetric by construction. Since ω is a state, one has $\omega(b^*) = \overline{\omega(b)}$ for any *b* ∈ *A*, so taking *b* = δ(*a*)*a*, using (9.56) just in that ω(*a*<sup>∗</sup>δ(*a*)) ∈ *i*R, we obtain ω(δ(*a*)*a*) = −ω(*a*δ(*a*)). Hence

$$
\omega(\delta(a^2)) = 0,\tag{9.58}
$$

by (9.53), so also ω(δ(α<sub>*s*</sub>(*a*)<sup>2</sup>)) = 0, *s* ∈ R. With (9.54), we find

$$\begin{aligned} 0 &= \int_0^u ds \, \omega(\delta(\alpha_s(a)^2)) = \int_0^u ds \, \omega\left(\frac{d}{dt}\alpha_t(\alpha_s(a)^2)_{|t=0}\right) \\ &= \int_0^u ds \, \frac{d}{dt}\omega(\alpha_{t+s}(a)^2)_{|t=0} = \int_0^u ds \, \frac{d}{ds}\omega(\alpha_s(a)^2) = \omega(\alpha_u(a^2)) - \omega(a^2). \end{aligned}$$

Hence ω(α<sub>*u*</sub>(*a*<sup>2</sup>)) = ω(*a*<sup>2</sup>) for each *u* > 0 (and analogously for each *u* < 0), whenever *a*<sup>∗</sup> = *a*, i.e., ω(α<sub>*u*</sub>(*b*)) = ω(*b*) for each *b* ≥ 0. But any *b* ∈ *A* may be written as a linear combination of at most four positive elements, so ω ∘ α<sub>*u*</sub> = ω for all *u* ∈ R. We therefore have a Hamiltonian *h*<sub>ω</sub>, whose positivity follows from (9.57), run backwards. □
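A two-level system gives a simple sanity check of criterion (9.56). The setup is an illustrative assumption, not taken from the text: *A* = *M*<sub>2</sub>(C), *h* = diag(0, *E*) with *E* > 0, dynamics α<sub>*t*</sub>(*a*) = *e*<sup>*ith*</sup>*ae*<sup>−*ith*</sup>, hence δ(*a*) = *i*[*h*,*a*], and the two vector states ω<sub>*k*</sub>(*a*) = ⟨*e*<sub>*k*</sub>, *ae*<sub>*k*</sub>⟩ given by the standard basis vectors *e*<sub>1</sub>, *e*<sub>2</sub>. Writing *a* = (*a*<sub>*ij*</sub>), one finds

$$
\begin{aligned}
-i\omega_1(a^*\delta(a)) &= \omega_1(a^*[h,a]) = \langle a e_1, h\, a e_1\rangle - \langle a e_1, a\, h e_1\rangle = E\,|a_{21}|^2 \ge 0,\\
-i\omega_2(a^*\delta(a)) &= \langle a e_2, h\, a e_2\rangle - \langle a e_2, a\, h e_2\rangle = E\,|a_{22}|^2 - E\,(|a_{12}|^2 + |a_{22}|^2) = -E\,|a_{12}|^2 \le 0.
\end{aligned}
$$

So ω<sub>1</sub> (the lowest eigenvector of *h*) satisfies (9.56), whereas the time-invariant state ω<sub>2</sub> violates it as soon as *a*<sub>12</sub> ≠ 0, in line with the fact that invariance alone does not make a state a ground state.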

#### 9.5 Ground states and equilibrium states of classical spin systems

Thermal equilibrium states are arguably physically more relevant than ground states, as the latter rely on the idealization of temperature zero. Since in statistical mechanics infinite systems are used to approximate very large ones, it will be of particular interest to define equilibrium states in infinite volume. If only to highlight contrasts with quantum theory, we take a long run and start with the classical case.

Classical spin systems on a lattice are defined by a single-site configuration space <u>*n*</u> ≅ {0,1,...,*n*−1}, where *m* ∈ <u>*n*</u> may either be interpreted as some spin-like degree of freedom (as in the Ising model, where *n* = 2) or as the number of (structureless) particles occupying a given site (in which case one has a *lattice gas*). As in (C.310), for any finite sublattice Λ ⊂ Z<sup>*d*</sup>, the local algebra of observables is given by

$$A_{\Lambda}^{(c)} = C(\underline{n}^{\Lambda}),\tag{9.59}$$

where <u>*n*</u><sup>Λ</sup> = *C*(Λ, <u>*n*</u>) consists of all functions *s* : Λ → <u>*n*</u>. For finite Λ this is a finite set (of cardinality *n*<sup>|Λ|</sup>), so that all functions in question are continuous and hence *C*(<u>*n*</u><sup>Λ</sup>) just stands for the commutative C\*-algebra of *all* functions from <u>*n*</u><sup>Λ</sup> to C. If Λ<sup>(1)</sup> ⊆ Λ<sup>(2)</sup>, we have maps ι<sup>(*c*)</sup><sub>Λ<sup>(1)</sup>Λ<sup>(2)</sup></sub> : *A*<sup>(*c*)</sup><sub>Λ<sup>(1)</sup></sub> → *A*<sup>(*c*)</sup><sub>Λ<sup>(2)</sup></sub>, written *f*<sub>1</sub> ↦ *f*<sub>2</sub>, which are given by

$$f_2(s) = f_1(s_{|\Lambda^{(1)}}),\tag{9.60}$$

where *s* : Λ<sup>(2)</sup> → <u>*n*</u>. As these maps are injective, the ensuing inductive limit is simply

$$A^{(c)} = \overline{\cup_{\Lambda \subset \mathbb{Z}^d} A^{(c)}_{\Lambda}} \cong C\left(\underline{n}^{\mathbb{Z}^d}\right),\tag{9.61}$$

where the bar denotes norm closure and <u>*n*</u><sup>Z<sup>*d*</sup></sup> = ∏<sub>*x*∈Z<sup>*d*</sup></sub> <u>*n*</u> is endowed with the product topology and hence (by Tychonoff's theorem) is compact (for *n* = 2, *d* = 1 this is a model of the Cantor set).

As in the quantum case, local Hamiltonians are defined via an *interaction* Φ, which now is an assignment *X* ↦ Φ(*X*), where *X* ⊂ Z<sup>*d*</sup> is finite and Φ(*X*) ∈ *A*<sup>(*c*)</sup><sub>*X*</sub>. If *X* ⊂ *Y*, we regard Φ(*X*) as an element of *A*<sup>(*c*)</sup><sub>*Y*</sub> through the inclusion *A*<sup>(*c*)</sup><sub>*X*</sub> ⊂ *A*<sup>(*c*)</sup><sub>*Y*</sub>, indicating this explicitly by writing Φ(*X*)<sub>*Y*</sub> ∈ *A*<sup>(*c*)</sup><sub>*Y*</sub>. We then define *h*<sub>Λ</sub> ∈ *A*<sup>(*c*)</sup><sub>Λ</sub> by

$$h\_{\Lambda} = \sum\_{X \subset \Lambda} \Phi(X)\_{\Lambda},\tag{9.62}$$

where the sum is over all subsets *X* of Λ. For example, the Ising Hamiltonian

$$h\_{\Lambda}(\mathbf{s}) = -J \sum\_{\langle ij \rangle\_{\Lambda}} s\_i s\_j - B \sum\_{i \in \Lambda} s\_i,\tag{9.63}$$

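where the sum is over nearest neighbours in Λ, and we assume <u>2</u> = {−1,1} (rather than the usual c-bit {0,1}), comes from the following potential:

$$\Phi(\{i\})(s) = -B s_i; \qquad \Phi(\{i,j\})(s) = -J s_i s_j \ \text{ if } i, j \text{ are nearest neighbours}; \qquad \Phi(X) = 0 \ \text{ otherwise}.$$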


As in (9.41), the prescription (9.62) has free boundary conditions, in that it only involves spins inside Λ. Another possibility is to fix a "boundary" spin configuration *b* ∈ <u>*n*</u><sup>Z<sup>*d*</sup></sup>, and define *h*<sup>*b*</sup><sub>Λ</sub> ∈ *A*<sup>(*c*)</sup><sub>Λ</sub> by

$$h\_{\Lambda}^{b} = \sum\_{X \subset \mathbb{Z}^d, |X| < \infty, X \cap \Lambda \neq \emptyset} \Phi(X)\_{\Lambda}^{b} \,. \tag{9.64}$$

This involves some new notation Φ(*X*)<sup>*b*</sup><sub>Λ</sub>, which means the following. In principle, Φ(*X*) ∈ *A*<sup>(*c*)</sup><sub>*X*</sub> is a function on <u>*n*</u><sup>*X*</sup>. We now turn Φ(*X*) into a function Φ(*X*)<sup>*b*</sup><sub>Λ</sub> on <u>*n*</u><sup>Λ</sup> (so that *h*<sup>*b*</sup><sub>Λ</sub> is a function on <u>*n*</u><sup>Λ</sup> as required): for given *s* : Λ → <u>*n*</u> and given *b* : Z<sup>*d*</sup> → <u>*n*</u> we define *s*′ : *X* → <u>*n*</u> by putting *s*′ = *s* on *X* ∩ Λ and *s*′ = *b* on the remainder of *X* (which is *X* ∩ Λ<sup>*c*</sup>, with Λ<sup>*c*</sup> = Z<sup>*d*</sup>\Λ). Then

$$
\Phi(X)^b\_\Lambda(s) = \Phi(X)(s'). \tag{9.65}
$$

Physically, this simply means that those spins outside Λ that interact with spins inside Λ are set at a fixed value determined by the boundary condition *b*. For example, consider the Ising model in *d* = 1. If we take Λ = {2,3}, then from (9.62) we obtain *h*<sub>Λ</sub> = −*Js*<sub>2</sub>*s*<sub>3</sub> − *B*(*s*<sub>2</sub> + *s*<sub>3</sub>); spins outside Λ do not contribute. From (9.64), on the other hand, we obtain *h*<sup>*b*</sup><sub>Λ</sub> = *h*<sub>Λ</sub> − *J*(*b*<sub>1</sub>*s*<sub>2</sub> + *s*<sub>3</sub>*b*<sub>4</sub>). Although the boundary condition *b* is arbitrary, one may think of simple choices like *b*<sub>*i*</sub> = 1 or −1 for each *i*.
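The one-dimensional example with Λ = {2,3} can be checked by brute force. The following is a minimal sketch under stated assumptions: Λ is an interval of consecutive integers, the coupling values *J* = *B* = 1 and the boundary spins *b*<sub>1</sub> = 1, *b*<sub>4</sub> = −1 are arbitrary illustrative choices, and `h_free` and `h_boundary` are hypothetical helper names for (9.62) and (9.64) specialised to the Ising chain.

```python
from itertools import product

J, B = 1, 1  # illustrative coupling and field

def h_free(s):
    """Free boundary conditions, (9.62)/(9.63): only bonds and fields inside Lambda."""
    bonds = sum(s[i] * s[i + 1] for i in s if i + 1 in s)
    return -J * bonds - B * sum(s.values())

def h_boundary(s, b):
    """Fixed boundary condition, (9.64): every bond {i, i+1} meeting Lambda counts,
    and spins outside Lambda are read off from b."""
    def spin(i):
        return s[i] if i in s else b[i]
    bonds = sum(spin(i) * spin(i + 1) for i in range(min(s) - 1, max(s) + 1))
    return -J * bonds - B * sum(s.values())

LAMBDA = (2, 3)
b = {1: 1, 4: -1}  # only the boundary spins next to Lambda matter here

checks = []
for values in product((-1, 1), repeat=len(LAMBDA)):
    s = dict(zip(LAMBDA, values))
    expected = h_free(s) - J * (b[1] * s[2] + s[3] * b[4])
    checks.append(h_boundary(s, b) == expected)

formula_holds = all(checks)
```

Only the two boundary bonds {1,2} and {3,4} distinguish the two prescriptions, which is why `b` needs values at sites 1 and 4 only.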

We may actually rewrite (9.64) as a difference between Hamiltonians with free boundary conditions. To do so, for given finite Λ we pick some finite Λ′ ⊃ Λ large enough that it contains all spins outside Λ that interact with spins inside Λ (provided this is possible). With the conventional notation *h*<sub>Λ</sub>(*s*|*b*) ≡ *h*<sup>*b*</sup><sub>Λ</sub>(*s*), this yields

$$h\_{\Lambda}(\mathbf{s}|\mathbf{b}) = h\_{\Lambda'}(\mathbf{s}, \mathbf{b}) - h\_{\Lambda' \backslash \Lambda}(\mathbf{b}) = \sum\_{X' \subset \Lambda'} \Phi(X')\_{\Lambda'}(\mathbf{s}, \mathbf{b}) - \sum\_{Y \subset \Lambda' \backslash \Lambda} \Phi(Y)\_{\Lambda' \backslash \Lambda}(\mathbf{b}).$$

Analogous to (9.65), the notation Φ(*X*′)<sub>Λ′</sub>(*s*,*b*) here means Φ(*X*′)<sub>Λ′</sub>(*s*′), for the function *s*′ : Λ′ → <u>*n*</u> that on Λ ⊂ Λ′ coincides with *s* : Λ → <u>*n*</u>, whilst on (Λ′\Λ) ⊂ Λ′ it coincides with the restriction of *b* to Λ′\Λ. Thus we may also write

$$h\_{\Lambda}(s|b) = \lim\_{\Lambda' \uparrow \mathbb{Z}^d} (h\_{\Lambda'}(s,b) - h\_{\Lambda' \backslash \Lambda}(b)),\tag{9.66}$$

although neither *h*<sub>Z<sup>*d*</sup></sub>(*s*,*b*) nor *h*<sub>Z<sup>*d*</sup>\Λ</sub>(*b*) makes sense by itself. *Periodic* boundary conditions for local Hamiltonians may be defined for arbitrary interactions Φ and special lattices. For example, the Ising chain in *d* = 1 has local Hamiltonians

$$h_{\{1,2,\dots,n\}}^{pbc}(s) = -J\left(s_1 s_n + \sum_{i=1}^{n-1} s_i s_{i+1}\right) - B \sum_{i=1}^n s_i. \tag{9.67}$$

Naively, a *ground state* of a *finite* classical spin system, i.e., a system of the above kind defined on a *fixed* finite lattice Λ ⊂ Z<sup>*d*</sup>, is a spin configuration *s*<sub>0</sub> ∈ <u>*n*</u><sup>Λ</sup> that minimizes the local Hamiltonian *h*<sub>Λ</sub> (9.62), or its counterpart (9.64), that is,

$$h\_{\Lambda}(\mathbf{s}\_0) \le h\_{\Lambda}(\mathbf{s}),\tag{9.68}$$

for all *s* ∈ <u>*n*</u><sup>Λ</sup>. For example, if Λ is a hypercube Λ<sub>*N*</sub>, then the Ising model (9.63) has a unique ground state for *B* > 0, namely *s*<sub>0</sub>(*x*) = 1 for all *x* ∈ Λ, whereas it has two ground states *s*<sup>±</sup><sub>0</sub> for *B* = 0, given by *s*<sup>±</sup><sub>0</sub>(*x*) = ±1 for all *x*. Ground states of finite classical systems always exist (since the space on which *h*<sub>Λ</sub> is defined is finite), but they are not necessarily unique; we just gave a counterexample! The same is true for quantum theory, since for *B* = 0 also the quantum Ising model (9.42) has two degenerate symmetry-breaking ground states. Nonetheless, this case is special, since for nonzero small values of *B* the ground state of the quantum Ising model is unique for finite Λ, whereas on the infinite lattice Z<sup>*d*</sup> it is degenerate (cf. §10.7).
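The statements about the finite classical Ising model can be verified by exhaustive search on a small lattice. The sketch below assumes a 3×3 square in Z<sup>2</sup> with free boundary conditions and the illustrative values *J* = 1 and *B* ∈ {0, 1}; `ising_energy` and `ground_states` are hypothetical helper names, and brute-force enumeration stands in for any cleverer minimisation.

```python
from itertools import product

def ising_energy(s, J, B):
    """h_Lambda(s) of (9.63) with free boundary conditions; s maps sites (x, y) to +-1."""
    energy = 0
    for (x, y), spin in s.items():
        energy -= B * spin
        # count each nearest-neighbour bond once, via the right and upper neighbour
        for neighbour in ((x + 1, y), (x, y + 1)):
            if neighbour in s:
                energy -= J * spin * s[neighbour]
    return energy

def ground_states(sites, J, B):
    """Return the minimal energy and the list of all configurations attaining it."""
    best, minimisers = None, []
    for values in product((-1, 1), repeat=len(sites)):
        s = dict(zip(sites, values))
        e = ising_energy(s, J, B)
        if best is None or e < best:
            best, minimisers = e, [s]
        elif e == best:
            minimisers.append(s)
    return best, minimisers

square = [(x, y) for x in range(3) for y in range(3)]  # 9 sites, 12 bonds

e_field, gs_field = ground_states(square, J=1, B=1)  # unique: all spins up
e_zero, gs_zero = ground_states(square, J=1, B=0)    # two: all up or all down
```

With 9 sites there are only 2<sup>9</sup> = 512 configurations, so the enumeration is immediate; the cost doubles with every added site.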

The definition of ground states of *infinite* classical spin systems is just slightly more involved: for local Hamiltonians *h*<sub>Λ</sub> with free boundary conditions defined by an interaction Φ à la (9.62), a ground state is a point *s*<sub>0</sub> ∈ <u>*n*</u><sup>Z<sup>*d*</sup></sup> for which

$$h\_{\Lambda}(\mathbf{s}\_{0|\Lambda}) \le h\_{\Lambda}(\mathbf{s}\_{|\Lambda}),\tag{9.69}$$

for any finite Λ ⊂ Z<sup>*d*</sup> and any spin configuration *s* ∈ <u>*n*</u><sup>Z<sup>*d*</sup></sup>. Alternatively, one may ask

$$h_\Lambda^{s_0}(s_0) \le h_\Lambda^{s_0}(s),\tag{9.70}$$

for all finite Λ ⊂ Z<sup>*d*</sup> and all spin configurations *s* ∈ <u>*n*</u><sup>Z<sup>*d*</sup></sup> *that coincide with s*<sub>0</sub> *outside* Λ, where *h*<sup>*s*<sub>0</sub></sup><sub>Λ</sub> stands for (9.64) with *b* = *s*<sub>0</sub>. In other words, *s*<sub>0</sub> provides a boundary condition *b*, which is fixed for all *s* that compete with *s*<sub>0</sub> in minimizing the local Hamiltonian *h*<sup>*b*=*s*<sub>0</sub></sup><sub>Λ</sub>. Both definitions give the usual two ground states for the Ising model with *B* = 0 (in which all spins are either "up" or "down"), but the second one also opens the possibility of *domain walls*, where infinite chains of "spin up" alternate with infinite chains of "spin down", and similarly in higher *d*.
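The difference between the two definitions can be seen on a single domain wall in *d* = 1. The sketch below is illustrative only: it takes *B* = 0, *J* = 1, the configuration with spin up at sites *i* ≤ 0 and spin down at *i* > 0, and one test interval Λ = {−2,...,3}, so it checks (9.69) and (9.70) for that Λ rather than for all finite Λ. The function names are hypothetical.

```python
from itertools import product

J = 1  # ferromagnetic coupling, with B = 0 throughout

def wall(i):
    """Domain-wall configuration s0 on Z: spin up for i <= 0, spin down for i > 0."""
    return 1 if i <= 0 else -1

def h_fixed(s, b):
    """(9.64) on an interval Lambda (the keys of s); outside spins are given by b."""
    def spin(i):
        return s[i] if i in s else b(i)
    return -J * sum(spin(i) * spin(i + 1) for i in range(min(s) - 1, max(s) + 1))

def h_free(s):
    """(9.62) on an interval Lambda: only bonds inside Lambda."""
    return -J * sum(s[i] * s[i + 1] for i in range(min(s), max(s)))

interval = list(range(-2, 4))  # Lambda = {-2, ..., 3} contains the wall
s0 = {i: wall(i) for i in interval}
configs = [dict(zip(interval, v)) for v in product((-1, 1), repeat=len(interval))]

# (9.70): s0 minimises the energy among configurations agreeing with s0 outside Lambda
minimal_fixed = all(h_fixed(s0, wall) <= h_fixed(s, wall) for s in configs)
# (9.69): with free boundary conditions the wall costs energy and is not minimal
minimal_free = all(h_free(s0) <= h_free(s) for s in configs)
```

With the boundary spins pinned to opposite signs, every competing configuration must contain at least one broken bond, so the wall is optimal in the sense of (9.70) but not of (9.69).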

If different ground states in the above ("pure") sense exist, we may reinterpret such states *s*<sub>0</sub> as Dirac measures δ<sub>*s*<sub>0</sub></sub> on the space <u>*n*</u><sup>Λ</sup> of all spin configurations on Λ, and may also allow convex combinations of ground states as ground states. This, as well as the analogy with Definition 9.16 (in which no purity condition is imposed) inspires a more liberal definition of a ground state, which is predicated on Boltzmann's idea that a state of a classical system of the kind we consider is a probability measure μ<sup>0</sup><sub>Λ</sub> on <u>*n*</u><sup>Λ</sup>, and likewise for <u>*n*</u><sup>Z<sup>*d*</sup></sup>. In the C\*-algebraic formalism we use, this follows from (9.61) and the identification of states on *C*(*X*) with completely regular probability measures on *X* (assumed to be a compact Hausdorff space, cf. §B.5). A state μ on *C*(<u>*n*</u><sup>Z<sup>*d*</sup></sup>), i.e., a probability measure on <u>*n*</u><sup>Z<sup>*d*</sup></sup>, induces a state on each local algebra *C*(<u>*n*</u><sup>Λ</sup>), i.e., a probability measure μ<sub>Λ</sub> on <u>*n*</u><sup>Λ</sup> simply by restriction, since

$$C(\underline{n}^{\Lambda}) \subset C(\underline{n}^{\mathbb{Z}^d}) \tag{9.71}$$

through the injection (9.60), according to which *f*<sub>Λ</sub> ∈ *C*(<u>*n*</u><sup>Λ</sup>) has image *f* ∈ *C*(<u>*n*</u><sup>Z<sup>*d*</sup></sup>) defined by *f*(*s*) = *f*<sub>Λ</sub>(*s*<sub>|Λ</sub>). The measure μ<sub>Λ</sub>, then, is given in terms of μ by

$$
\mu\_{\Lambda}(f\_{\Lambda}) = \mu(f);\tag{9.72}
$$

the corresponding probability distribution *p*<sup>Λ</sup> (i.e., *p*<sup>Λ</sup> (*s*) = μΛ ({*s*})) is given by

$$p\_{\Lambda}(\mathbf{s}) = \mu\left(\{\mathbf{s}' \in \underline{\mathbf{n}}^{\mathbb{Z}^d} \mid \mathbf{s}'\_{|\Lambda} = \mathbf{s}\}\right), \ s \in \underline{\mathbf{n}}^{\Lambda}.\tag{9.73}$$

The family of probability measures (μ<sub>Λ</sub>) defined by μ is *consistent* in that if Λ<sup>(1)</sup> ⊂ Λ<sup>(2)</sup> and *f*<sub>1</sub> ∈ *C*(<u>*n*</u><sup>Λ<sup>(1)</sup></sup>) and *f*<sub>2</sub> ∈ *C*(<u>*n*</u><sup>Λ<sup>(2)</sup></sup>) are related as in (9.60), then

$$
\mu\_{\Lambda^{(1)}}(f\_1) = \mu\_{\Lambda^{(2)}}(f\_2). \tag{9.74}
$$

Conversely, a consistent family of probability measures (μ<sub>Λ</sub>) defines a unique probability measure μ on <u>*n*</u><sup>Z<sup>*d*</sup></sup> which induces the given family through (9.72).
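The restriction formula (9.73) and the consistency condition (9.74) are easy to check numerically once the infinite lattice is replaced by a finite one. In the sketch below, the three-site set `SITES` is a finite stand-in for Z<sup>*d*</sup> (an assumption made purely so that the measure can be enumerated), the weights defining `mu` are arbitrary, and `marginal` and `expectation` are hypothetical helper names. Exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
from itertools import product

SITES = (0, 1, 2)   # finite stand-in for Z^d (assumption)
SPINS = (-1, 1)

# An arbitrary probability measure mu on the configurations of SITES
configs = list(product(SPINS, repeat=len(SITES)))
weights = [Fraction(k + 1) for k in range(len(configs))]
mu = {c: w / sum(weights) for c, w in zip(configs, weights)}

def marginal(measure, region):
    """p_Lambda(s) = mu({s' : s' restricted to Lambda equals s}), cf. (9.73)."""
    p = {}
    for config, prob in measure.items():
        restricted = tuple(config[SITES.index(x)] for x in region)
        p[restricted] = p.get(restricted, Fraction(0)) + prob
    return p

def expectation(p, f):
    """Integral of f against the probability distribution p."""
    return sum(prob * f(s) for s, prob in p.items())

p1 = marginal(mu, (0,))     # Lambda^(1) = {0}
p2 = marginal(mu, (0, 1))   # Lambda^(2) = {0, 1}

f1 = lambda s: 3 * s[0] + 1   # some function on configurations of Lambda^(1)
f2 = lambda s: f1(s[:1])      # its image under (9.60): f2(s) = f1(s restricted to Lambda^(1))

consistent = expectation(p1, f1) == expectation(p2, f2)   # (9.74)
```

The converse direction (reconstructing μ from a consistent family) has no finite analogue worth coding, since on a finite lattice the largest marginal already is μ.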

Definition 9.21. *For given finite* Λ ⊂ Z<sup>*d*</sup>*, a probability measure* μ<sup>0</sup><sub>Λ</sub> *on* <u>*n*</u><sup>Λ</sup> *is a* ground state *of a local Hamiltonian h*<sub>Λ</sub> *(with free boundary conditions) if, in terms of the probabilities p*<sup>0</sup><sub>Λ</sub>(*s*) = μ<sup>0</sup><sub>Λ</sub>({*s*})*, for any probability measure* μ<sub>Λ</sub> *on* <u>*n*</u><sup>Λ</sup>*,*

$$\sum_{s \in \underline{n}^{\Lambda}} p_{\Lambda}^{0}(s) \, h_{\Lambda}(s) \le \sum_{s \in \underline{n}^{\Lambda}} p_{\Lambda}(s) \, h_{\Lambda}(s). \tag{9.75}$$

*A probability measure* μ<sup>0</sup> *on* <u>*n*</u><sup>Z<sup>*d*</sup></sup> *is a* ground state *for some interaction* Φ *if* (9.75) *holds for any probability measure* μ *on* <u>*n*</u><sup>Z<sup>*d*</sup></sup> *and any finite subset* Λ ⊂ Z<sup>*d*</sup>*, where this time p*<sup>0</sup><sub>Λ</sub> *(and analogously p*<sub>Λ</sub>*) is defined by* (9.73)*.*

In particular, convex sums of pure ground states are ground states in this more general sense, so that, if all pure ground states break some symmetry (as is the case for the $\mathbb{Z}_2$-symmetry $s \mapsto -s$ of the Ising model at $B = 0$), symmetric convex sums will restore the symmetry. The set of all ground states of a given interaction $\Phi$ is a convex set, whose extreme points are the pure ground states (at least, under suitable hypotheses on $\Phi$). This leads to a discussion of SSB similar to the quantum case.

In the following discussion of equilibrium states, we use the notation

$$\Pr(X) \cong S(C(X))\tag{9.76}$$

for the compact convex set of all completely regular probability measures on $X$, which as above will either be the finite set $\underline{n}^{\Lambda}$ (with discrete topology), on which of course any probability measure is completely regular, or the compact space $\underline{n}^{\mathbb{Z}^d}$. In the first case we may as well use probability *distributions* $p_\Lambda$ (instead of probability measures) on $\underline{n}^{\Lambda}$. In the second, we could also use Baire measures.

Given an interaction $\Phi$ and the ensuing family (9.62) of local Hamiltonians $h_\Lambda$, we define the local *energy* for each finite $\Lambda \subset \mathbb{Z}^d$ as a function $\mathcal{E}_\Lambda : \Pr(\underline{n}^{\Lambda}) \to \mathbb{R}$ by

$$\mathcal{E}_{\Lambda}(p_{\Lambda}) = \sum_{s \in \underline{n}^{\Lambda}} p_{\Lambda}(s) h_{\Lambda}(s). \tag{9.77}$$

Of course, this is just the expectation value of the Hamiltonian in the state $p_\Lambda$. The local *entropy* $S_\Lambda : \Pr(\underline{n}^{\Lambda}) \to \mathbb{R}$ is a more subtle concept; rather than the expectation value of some (local) observable, it specifies a property of the probability distribution itself. With Boltzmann's constant $k_B$, we have

$$S_{\Lambda}(p_{\Lambda}) = -k_{B} \sum_{s \in \underline{n}^{\Lambda}} p_{\Lambda}(s) \ln(p_{\Lambda}(s)). \tag{9.78}$$

Note that $S_\Lambda(p_\Lambda) \ge 0$, with equality iff $p_\Lambda$ is a pure state (i.e., $p_\Lambda$ is supported at a single spin configuration). The local *free energy* $\mathcal{F}^\beta_\Lambda : \Pr(\underline{n}^{\Lambda}) \to \mathbb{R}$ is defined as

$$\mathcal{F}_{\Lambda}^{\beta} = \mathcal{E}_{\Lambda} - T S_{\Lambda},\tag{9.79}$$

where $\beta = 1/k_B T$. A *local equilibrium state*, then, is a probability distribution $p^\beta_\Lambda$ that minimizes the free energy (for fixed temperature $T$).

Theorem 9.22. *For each T* > 0*, there is a* unique *local equilibrium state, given by the* Boltzmann distribution *(and associated* partition function*)*

$$p_{\Lambda}^{\beta}(s) = (Z_{\Lambda}^{\beta})^{-1} e^{-\beta h_{\Lambda}(s)};\tag{9.80}$$

$$Z_{\Lambda}^{\beta} = \sum_{s' \in \underline{n}^{\Lambda}} e^{-\beta h_{\Lambda}(s')}.\tag{9.81}$$

The associated *free energy in equilibrium* is then given by

$$F_{\Lambda}^{\beta} = \mathcal{F}_{\Lambda}^{\beta}(p_{\Lambda}^{\beta}) = -\beta^{-1} \ln Z_{\Lambda}^{\beta}. \tag{9.82}$$

*Proof.* The claim follows from the fact that any $p_\Lambda \in \Pr(\underline{n}^{\Lambda})$ satisfies the inequality

$$
\mathcal{F}_{\Lambda}^{\beta}(p_{\Lambda}) \ge -\beta^{-1}\ln Z_{\Lambda}^{\beta},\tag{9.83}
$$

with equality iff $p_\Lambda = p^\beta_\Lambda$. Writing $p$ for $p_\Lambda$ and using (9.79), (9.77), and (9.78), we need to show that

$$\sum_{s \in \underline{n}^{\Lambda}} p(s)\left(h_{\Lambda}(s) + \beta^{-1} \ln p(s)\right) + \beta^{-1} \ln Z_{\Lambda}^{\beta} \ge 0. \tag{9.84}$$

Using (9.80), for each $s \in \underline{n}^{\Lambda}$ we obtain

$$-\beta h_{\Lambda}(s) = \ln Z_{\Lambda}^{\beta} + \ln p_{\Lambda}^{\beta}(s). \tag{9.85}$$

Substituting this in (9.84), using $\sum_s p(s) = 1$, omitting the ensuing prefactor $\beta^{-1}$, and noting that $p^\beta_\Lambda(s) > 0$ for all $s$, the inequality (9.84) to be proved becomes

$$\sum_{s \in \underline{n}^{\Lambda}} p(s) \ln \left( \frac{p(s)}{p_{\Lambda}^{\beta}(s)} \right) \ge 0. \tag{9.86}$$

Hence we need to prove the inequality

$$\sum_{s \in \underline{n}^{\Lambda}} p_{\Lambda}^{\beta}(s) \cdot \left( \frac{p(s)}{p_{\Lambda}^{\beta}(s)} \right) \ln \left( \frac{p(s)}{p_{\Lambda}^{\beta}(s)} \right) \ge 0,\tag{9.87}$$

with equality iff $p(s) = p^\beta_\Lambda(s)$ for all $s$. Let us note that the function $f(x) = x \ln x$ (with $f(0) = 0$) is strictly convex on $x \ge 0$, so that for any finite set of numbers $p'(s) \in (0,1)$ with $\sum_s p'(s) = 1$ and any set of real numbers $x_s \ge 0$, we have

$$\sum_{s} p'(s)f(x_s) \ge f\left(\sum_{s} p'(s)x_s\right),\tag{9.88}$$

with equality iff all numbers $x_s$ are the same. Applying this with $p'(s) = p^\beta_\Lambda(s)$ and $x_s = p(s)/p^\beta_\Lambda(s)$, so that $p'(s)x_s = p(s)$ and hence $\sum_s p'(s)x_s = \sum_s p(s) = 1$, which makes the right-hand side of (9.88) vanish since $\ln(1) = 0$, finally leads to (9.87). Equality arises iff $p(s)/p^\beta_\Lambda(s)$ equals the same number $c$ for all $s$; summing over all $s$ forces $c = 1$, so that one has equality iff $p(s) = p^\beta_\Lambda(s)$ for all $s$, as desired. $\square$
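Theorem 9.22 and the key identity behind its proof can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the text's argument: it sets $k_B = 1$, and the three-spin open Ising chain used for `h`, the value of `beta`, and the random comparison distribution are all hypothetical choices.

```python
import math
import random
from itertools import product

def boltzmann(h, configs, beta):
    """Return (p_beta, Z) for the Boltzmann distribution (9.80)-(9.81)."""
    weights = {s: math.exp(-beta * h(s)) for s in configs}
    Z = sum(weights.values())
    return {s: w / Z for s, w in weights.items()}, Z

def free_energy(p, h, beta):
    """F(p) = E(p) - T S(p) as in (9.77)-(9.79), in units with k_B = 1."""
    energy = sum(q * h(s) for s, q in p.items())
    entropy = -sum(q * math.log(q) for q in p.values() if q > 0)
    return energy - entropy / beta

def relative_entropy(p, q):
    """Sum_s p(s) ln(p(s)/q(s)), i.e. the left-hand side of (9.86)."""
    return sum(ps * math.log(ps / q[s]) for s, ps in p.items() if ps > 0)

# Hypothetical local Hamiltonian: open Ising chain of 3 spins, J = 1, B = 0.
def h(s):
    return -sum(s[i] * s[i + 1] for i in range(len(s) - 1))

configs = list(product((-1, 1), repeat=3))
beta = 0.7
p_beta, Z = boltzmann(h, configs, beta)
F_eq = free_energy(p_beta, h, beta)

# Compare with a randomly chosen distribution on the same configurations.
rng = random.Random(0)
raw = [rng.random() for _ in configs]
p_rand = {s: r / sum(raw) for s, r in zip(configs, raw)}
F_rand = free_energy(p_rand, h, beta)
```

In this sketch `F_eq` coincides with `-math.log(Z) / beta`, as in (9.82), and the excess `F_rand - F_eq` equals `relative_entropy(p_rand, p_beta) / beta`, which is the content of the step from (9.84) to (9.86).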

Neither the local Hamiltonians (9.62) nor the local partition functions (9.81) have a limit as $\Lambda \uparrow \mathbb{Z}^d$. A precise definition of equilibrium states of infinite classical systems was given in 1968 by Dobrushin and by Lanford and Ruelle (DLR).

Definition 9.23. *For fixed inverse temperature* $\beta \in (0,\infty)$ *and fixed interaction* $\Phi$*, a* Gibbs measure $\mu^\beta$ *is a (Baire = regular Borel) probability measure on* $\underline{n}^{\mathbb{Z}^d}$ *such that for each finite* $\Lambda \subset \mathbb{Z}^d$ *and each pair* $(s,b)$ *of a spin configuration* $s : \Lambda \to \underline{n}$ *plus boundary condition* $b : \Lambda^c \to \underline{n}$*, the conditional probability* $\mu^\beta(\mathbf{s}|\mathbf{b})$ *for the events*

$$\mathbf{s} = \{ s' \in \underline{n}^{\mathbb{Z}^d} \mid s'_{|\Lambda} = s \} \subset \underline{n}^{\mathbb{Z}^d};\tag{9.89}$$

$$\mathbf{b} = \{ s'' \in \underline{n}^{\mathbb{Z}^d} \mid s''_{|\Lambda^c} = b \} \subset \underline{n}^{\mathbb{Z}^d},\tag{9.90}$$

*is given in terms of the local Hamiltonian* $h_\Lambda(s|b)$ *as defined by* (9.66) *by*

$$
\mu^{\beta}(\mathbf{s}|\mathbf{b}) = (Z_{\Lambda}^{\beta}(b))^{-1} e^{-\beta h_{\Lambda}(s|b)},\tag{9.91}
$$

$$Z_{\Lambda}^{\beta}(b) = \sum_{s \in \underline{n}^{\Lambda}} e^{-\beta h_{\Lambda}(s|b)}. \tag{9.92}$$

Recall that $\mu^\beta(\mathbf{s}|\mathbf{b}) = \mu^\beta(\mathbf{s}\cap\mathbf{b})/\mu^\beta(\mathbf{b})$, where $\mathbf{s}\cap\mathbf{b} = \{sb\}$ consists of the single spin configuration $sb : \mathbb{Z}^d \to \underline{n}$ that coincides with $s$ on $\Lambda$ and coincides with $b$ on $\Lambda^c$. Thus we may write $\mu^\beta(\mathbf{s}|\mathbf{b}) = p^\beta(sb)/\mu^\beta(\mathbf{b})$, where $p^\beta(s) = \mu^\beta(\{s\})$ as usual.
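The conditional-probability formula (9.91)-(9.92) can be tried out on a finite stand-in for the infinite system. The sketch below rests on assumptions that are not taken from the text: a four-site nearest-neighbour Ising chain replaces $\mathbb{Z}^d$, $\Lambda$ consists of the two inner sites, and the function `h_cond` is a guess at what $h_\Lambda(s|b)$ from (9.66) would look like for such an interaction (all bonds touching $\Lambda$, with boundary spins fixed). The values of `J` and `beta` are arbitrary.

```python
import math
from itertools import product

J, beta = 1.0, 0.5  # hypothetical coupling and inverse temperature

def h_cond(s, b):
    """Assumed h_Lambda(s|b) for Lambda = {1, 2}, boundary spins b = (b0, b3)."""
    return -J * (b[0] * s[0] + s[0] * s[1] + s[1] * b[1])

def gibbs_conditional(b):
    """Conditional probabilities (9.91) with partition function (9.92)."""
    weights = {s: math.exp(-beta * h_cond(s, b))
               for s in product((-1, 1), repeat=2)}
    Z_b = sum(weights.values())
    return {s: w / Z_b for s, w in weights.items()}

# Finite stand-in for the infinite system: Boltzmann measure on sites 0..3.
def h_full(c):
    return -J * sum(c[i] * c[i + 1] for i in range(3))

full = {c: math.exp(-beta * h_full(c)) for c in product((-1, 1), repeat=4)}
Z_full = sum(full.values())
mu = {c: w / Z_full for c, w in full.items()}

def conditional_from_mu(s, b):
    """mu(s and b) / mu(b), computed directly from the measure mu."""
    joint = mu[(b[0], s[0], s[1], b[1])]
    mu_b = sum(mu[(b[0], t[0], t[1], b[1])]
               for t in product((-1, 1), repeat=2))
    return joint / mu_b
```

For every boundary condition `b` and every `s`, `gibbs_conditional(b)[s]` agrees with `conditional_from_mu(s, b)` up to rounding, so the finite Boltzmann measure satisfies the analogue of the DLR condition for this choice of $\Lambda$.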

It was initially unclear how to generalize this highly fruitful definition of equilibrium states in classical statistical mechanics to the quantum case, where conditional probabilities are not well defined (this was eventually resolved, however, through Definition 10.9 below). Thus a different (equally fruitful) approach to equilibrium states of (infinite) quantum systems was developed, to which we now turn.

#### 9.6 Equilibrium (KMS) states of quantum systems

For *finite* quantum spin systems we have expressions for the energy $\hat{\mathcal{E}}_\Lambda$, the entropy $\hat{S}_\Lambda$, and the free energy $\hat{\mathcal{F}}^\beta_\Lambda$ that are analogous to their classical counterparts (9.77), (9.78), and (9.79). In particular, these quantities are functions on the state space $S(A_\Lambda)$. Since $A_\Lambda = B(H_\Lambda)$, where we assume that $H$ and hence $H_\Lambda$ is finite-dimensional, each state $\omega_\Lambda \in S(A_\Lambda)$ is given by a density operator $\rho_\Lambda$, so that

$$
\hat{\mathcal{E}}_{\Lambda}(\omega_{\Lambda}) = \omega_{\Lambda}(h_{\Lambda}) = \mathrm{Tr}(\rho_{\Lambda}h_{\Lambda});\tag{9.93}
$$

$$
\hat{S}_{\Lambda}(\omega_{\Lambda}) = -k_{B}\, \mathrm{Tr}(\rho_{\Lambda} \ln \rho_{\Lambda});\tag{9.94}
$$

$$
\hat{\mathcal{F}}_{\Lambda}^{\beta} = \hat{\mathcal{E}}_{\Lambda} - T\hat{S}_{\Lambda}.\tag{9.95}
$$

Defining a local equilibrium state as a density matrix $\rho^\beta_\Lambda$ that minimizes the free energy (for fixed $T$), we have the following quantum analogue of Theorem 9.22:

Theorem 9.24. *For each* $T > 0$*, there is a* unique *local equilibrium state* $\omega^\beta_\Lambda$*, viz.*

$$\omega_{\Lambda}^{\beta}(a) = \mathrm{Tr}\left(\rho_{\Lambda}^{\beta} a\right);\tag{9.96}$$

$$
\rho_{\Lambda}^{\beta} = (\hat{Z}_{\Lambda}^{\beta})^{-1} e^{-\beta h_{\Lambda}};\tag{9.97}
$$

$$\hat{Z}_{\Lambda}^{\beta} = \mathrm{Tr}\left(e^{-\beta h_{\Lambda}}\right). \tag{9.98}$$

Accordingly, the free energy $F^\beta_\Lambda$ in equilibrium is given by

$$F_{\Lambda}^{\beta} = \hat{\mathcal{F}}_{\Lambda}^{\beta}(\rho_{\Lambda}^{\beta}) = -\beta^{-1} \ln \hat{Z}_{\Lambda}^{\beta}. \tag{9.99}$$

*Proof.* One proof is analogous to the classical case, in that for all $\rho_\Lambda \in \mathcal{D}(B(H_\Lambda))$,

$$
\hat{\mathcal{F}}_{\Lambda}^{\beta}(\rho_{\Lambda}) \ge -\beta^{-1} \ln \hat{Z}_{\Lambda}^{\beta}, \tag{9.100}
$$

with equality iff $\rho_\Lambda = \rho^\beta_\Lambda$. This, in turn, follows from the inequality

$$\operatorname{Tr}\left(a(\ln b - \ln a)\right) \le \operatorname{Tr}(b - a),\tag{9.101}$$

with equality iff $b = a$, which is valid for matrices $a, b$ for which $a \ge 0$ (in the usual sense that $\lambda \ge 0$ for each $\lambda \in \sigma(a)$) and $b > 0$ in that $\lambda > 0$ for each $\lambda \in \sigma(b)$. The case $a = \rho_\Lambda$ and $b = \rho^\beta_\Lambda$ immediately gives the claim. $\square$
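For readers who want the last step of this proof written out, here is a short sketch that uses only (9.97), (9.98), and the normalization $\mathrm{Tr}(\rho_\Lambda) = \mathrm{Tr}(\rho^\beta_\Lambda) = 1$. Since $\ln \rho^\beta_\Lambda = -\beta h_\Lambda - \ln \hat{Z}^\beta_\Lambda \cdot 1$ and $\mathrm{Tr}(\rho^\beta_\Lambda - \rho_\Lambda) = 0$, the inequality (9.101) with $a = \rho_\Lambda$ and $b = \rho^\beta_\Lambda$ reads

$$
0 \ge \mathrm{Tr}\left(\rho_\Lambda(\ln \rho^\beta_\Lambda - \ln \rho_\Lambda)\right) = -\beta\, \mathrm{Tr}(\rho_\Lambda h_\Lambda) - \ln \hat{Z}^\beta_\Lambda - \mathrm{Tr}(\rho_\Lambda \ln \rho_\Lambda).
$$

Dividing by $\beta > 0$ and using $k_B T = \beta^{-1}$ in (9.93)-(9.95) gives

$$
\hat{\mathcal{F}}^\beta_\Lambda(\rho_\Lambda) = \mathrm{Tr}(\rho_\Lambda h_\Lambda) + \beta^{-1}\, \mathrm{Tr}(\rho_\Lambda \ln \rho_\Lambda) \ge -\beta^{-1} \ln \hat{Z}^\beta_\Lambda,
$$

which is (9.100), with equality iff $\rho_\Lambda = \rho^\beta_\Lambda$ by the equality case of (9.101).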

What remains to be done, however, is to define equilibrium states for infinite systems. This is achieved through the so-called KMS*-condition*, which is based on the observation that for any $a, b \in A_\Lambda$, in terms of (9.40) the state (9.96) satisfies

$$\omega_{\Lambda}^{\beta}(\alpha_{t}^{(\Lambda)}(a)b) = \omega_{\Lambda}^{\beta}(b\,\alpha_{t+i\beta}^{(\Lambda)}(a)) \quad (t \in \mathbb{R}).\tag{9.102}$$

Moreover, in finite systems this condition (even at $t = 0$) fully characterizes $\omega^\beta_\Lambda$:

Proposition 9.25. *Let* $h$ *be a self-adjoint operator on a finite-dimensional Hilbert space* $H$*, with associated density operator* $\rho$ *and (complex) time-evolution given by*

$$\rho = \frac{e^{-h}}{\mathrm{Tr}\,(e^{-h})};\tag{9.103}$$

$$\alpha_{z}(a) = e^{izh} a e^{-izh}, \quad z \in \mathbb{C},\ a \in B(H), \tag{9.104}$$

*respectively (the exponentials being defined by a norm-convergent power series). Then the associated two-point functions defined by* $\omega(a) = \mathrm{Tr}(\rho a)$ *satisfy*

$$\omega(ab) = \omega(b\,\alpha_{i}(a)) \quad (a, b \in B(H)).\tag{9.105}$$

*Conversely, any state for which* (9.105) *holds for given* $h$ *and* $\alpha_z$ *is given by* (9.103)*.*

*Proof.* Eq. (9.105) follows from (9.103)-(9.104) and cyclicity of the trace, i.e., (A.78). Similarly, given non-degeneracy of the Hilbert-Schmidt inner product (B.495) on $B(H)$, eq. (9.105) is equivalent to the condition

$$
\rho a = e^{-h}ae^{h} \rho,\tag{9.106}
$$

for each $a \in B(H)$. Multiplying with $\exp(h)$ shows that $\exp(h)\rho$ commutes with every $a \in B(H)$. Since $B(H)' = \mathbb{C} \cdot 1_H$, we obtain $\exp(h)\rho = \lambda \cdot 1_H$. Since $\exp(h)$ is invertible with inverse $\exp(-h)$, we obtain $\rho = \lambda \cdot \exp(-h)$, upon which the normalization condition $\mathrm{Tr}(\rho) = 1$ yields (9.103). $\square$
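Proposition 9.25 lends itself to a quick numerical check. The following sketch works in an eigenbasis of $h$, so that $h$ is diagonal and $\alpha_z$ acts entrywise; the spectrum `energies`, the test matrices `a` and `b`, and the helper names are hypothetical. The comparison state `uniform` (the normalized trace) is included only to show that a different time-invariant state violates (9.105).

```python
import cmath
import math

# Work in an eigenbasis of h, so h = diag(energies); matrices are nested lists.
energies = [0.0, 0.8, 1.5]   # hypothetical spectrum of h
n = len(energies)

def matmul(x, y):
    """Product of two n x n matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def alpha(z, a):
    """alpha_z(a) = e^{izh} a e^{-izh} for diagonal h, cf. (9.104)."""
    return [[cmath.exp(1j * z * (energies[i] - energies[j])) * a[i][j]
             for j in range(n)] for i in range(n)]

def state(weights):
    """State omega(a) = Tr(rho a) for rho diagonal with the given weights."""
    total = sum(weights)
    probs = [w / total for w in weights]
    return lambda a: sum(probs[i] * a[i][i] for i in range(n))

def kms_defect(omega, a, b):
    """|omega(ab) - omega(b alpha_i(a))|; zero when (9.105) holds for a, b."""
    return abs(omega(matmul(a, b)) - omega(matmul(b, alpha(1j, a))))

a = [[1, 2 - 1j, 0.5], [0.3j, -1, 4], [2, 1 + 1j, 0.7]]
b = [[0.2, 1j, 3], [1, 0.5, -2j], [-1, 2, 1 - 1j]]

gibbs = state([math.exp(-e) for e in energies])   # rho as in (9.103)
uniform = state([1.0] * n)                        # normalized trace
```

With these inputs `kms_defect(gibbs, a, b)` vanishes up to rounding, whereas `kms_defect(uniform, a, b)` is of order one, in line with the uniqueness statement of the proposition.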

For arbitrary C\*-algebras $A$ with time-evolution $t \mapsto \alpha_t$, expressions like $\alpha_{t+i\beta}(a)$ may not be defined, so one has to proceed more carefully, but the idea is the same.

Definition 9.26. *Let* $A$ *be a C\*-algebra with an automorphism group* $\alpha : \mathbb{R} \to \mathrm{Aut}(A)$*. A* KMS state *at "inverse temperature"* $\beta \in \mathbb{R}$ *is a state* $\omega$ *on* $A$ *with the following property:*

*1. For any* $a, b \in A$*, the function* $F_{a,b} : t \mapsto \omega(b\,\alpha_t(a))$ *from* $\mathbb{R}$ *to* $\mathbb{C}$ *has an analytic continuation to the strip*

$$\mathcal{S}_{\beta} = \{ z \in \mathbb{C} \mid 0 \le \mathrm{Im}(z) \le \beta \},\tag{9.107}$$

*where it is holomorphic in the interior and continuous on the boundary*

$$
\partial \mathcal{S}_{\beta} = \mathbb{R} \cup (\mathbb{R} + i \beta). \tag{9.108}
$$

*2. The boundary values of* $F_{a,b}$ *are related, for all* $t \in \mathbb{R}$*, by*

$$F_{a,b}(t) = \omega(b\,\alpha_t(a));\tag{9.109}$$

$$F_{a,b}(t+i\beta) = \omega(\alpha_t(a)b). \tag{9.110}$$

*If this is the case,* $\omega$ *satisfies the* KMS-condition *at (inverse temperature)* $\beta$*.*

It is easy to show that $A$ has a dense subset $A_\alpha$ such that for any $a \in A_\alpha$ the function $t \mapsto \alpha_t(a)$ from $\mathbb{R}$ to $A$ extends to an entire $A$-valued analytic function, written $z \mapsto \alpha_z(a)$ (i.e., for each $\varphi \in A^*$ the function $z \mapsto \varphi(\alpha_z(a))$ from $\mathbb{C}$ to $\mathbb{C}$ is entire analytic). Namely, for any $a \in A$ and $\varepsilon > 0$, define

$$a_{\varepsilon} = \int_{-\infty}^{\infty} \frac{dt}{\sqrt{2\pi\varepsilon}}\, e^{-t^2/2\varepsilon} \alpha_{t}(a),\tag{9.111}$$

which satisfies $a_\varepsilon \in A_\alpha$ and $\lim_{\varepsilon \downarrow 0} a_\varepsilon = a$. If $A = B(H)$ with $\dim(H) < \infty$, we even have $B(H)_\alpha = B(H)$, since (9.104) is entire analytic in $z$ for any $a \in B(H)$. For any $A$, the KMS-condition on $\omega$ is then equivalent to the simpler requirement

$$
\omega(ab) = \omega(b\,\alpha_{i\beta}(a)) \quad (a \in A_{\alpha},\ b \in A). \tag{9.112}
$$
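To see what the smoothing (9.111) does in the simplest setting, consider as an illustration (not needed for the argument, since no smoothing is required in finite dimension) the case $A = B(H)$ with $\dim(H) < \infty$ and $\alpha_t(a) = e^{ith} a e^{-ith}$. In an orthonormal eigenbasis $(e_j)$ of $h$ with eigenvalues $\lambda_j$, writing $a_{jk} = \langle e_j, a e_k \rangle$, one has $\alpha_t(a)_{jk} = e^{it(\lambda_j - \lambda_k)} a_{jk}$, and the Gaussian integral can be done explicitly:

$$
(a_\varepsilon)_{jk} = a_{jk} \int_{-\infty}^{\infty} \frac{dt}{\sqrt{2\pi\varepsilon}}\, e^{-t^2/2\varepsilon}\, e^{it(\lambda_j - \lambda_k)} = e^{-\varepsilon(\lambda_j - \lambda_k)^2/2}\, a_{jk}.
$$

Thus the smoothing damps matrix elements between widely separated energy levels, which is what makes the continuation $z \mapsto \alpha_z(a_\varepsilon)$ well behaved, and clearly $a_\varepsilon \to a$ as $\varepsilon \downarrow 0$.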

Corollary 9.27. *If A* = *B*(*H* ) *with* dim(*H* ) < ∞*, then* KMS *states (at fixed* β*) are necessarily given by the equilibrium states of Theorem 9.24 and hence are unique.*

Although initially the characterization of equilibrium states of infinite systems by the KMS condition was tentative, in the 1970s and '80s it became clear that it was spot on, being equivalent to local and global thermodynamic stability (against perturbations of the dynamics), the (local) maximum entropy principle, etc. Also:

Proposition 9.28. *A* KMS *state at* $\beta \in \mathbb{R}\setminus\{0\}$ *is time-independent.*

*Proof.* We just sketch the proof if $A$ is unital. Taking $b = 1_A$, for fixed $a \in A_\alpha$ the function $F_{a,1_A} \equiv F$ defined by $F(z) = \omega(\alpha_z(a))$ is entire analytic on $\mathbb{C}$. Writing $z = t + is$ (with $s, t \in \mathbb{R}$), we have $\alpha_z = \alpha_t \circ \alpha_{is}$ and hence (since each $\alpha_t$ is an automorphism and hence an isometry), $|F(t+is)| \le \|\alpha_{is}(a)\|$. Also, (9.112) yields $F(t + i(s+\beta)) = F(t+is)$. Hence $F(t+is)$ is bounded in $t$ and periodic in $s$; by the latter property its supremum on $\mathbb{C}$ may be computed by its supremum on the strip $\mathcal{S}_\beta$, and by the former property this supremum is finite. Therefore, $F$ is bounded, and so by Liouville's Theorem it must be constant, in particular for $z = t \in \mathbb{R}$. Hence $\alpha_t^*\omega(a) = \omega(a)$ for each $a \in A_\alpha$, and since this is a dense set, $\alpha_t^*\omega = \omega$. $\square$

By the argument for ground states following Definition 9.16, the automorphism group $t \mapsto \alpha_t$ is unitarily implemented in the GNS-representation $\pi_\omega$ induced by a KMS state $\omega$, such that (9.51)-(9.52) hold. However, the operator $h_\omega$ in this construction should not be confused with the Hamiltonian of the system. For example, suppose $A = B(H)$ for some (not necessarily finite-dimensional) Hilbert space $H$, so that (9.39) holds for some (not necessarily bounded) Hamiltonian $h$ with discrete spectrum, such that $\exp(-\beta h) \in B_1(H)$. If we now define the density operator

$$\rho = \frac{e^{-\beta h}}{\text{Tr}\left(e^{-\beta h}\right)},\tag{9.113}$$

then the corresponding state ω satisfies the KMS-condition at β. Generalizing the computations around (2.66) in §2.4, we then find (up to unitary equivalence):

$$H_{\omega} = B_2(H);\tag{9.114}$$

$$
\pi_{\omega}(a)b = ab;\tag{9.115}
$$

$$
\Omega_{\omega} = \rho^{1/2};\tag{9.116}
$$

$$e^{ith_{\omega}} = \pi_{\omega} \left( e^{ith} \right) \pi_{\omega}' \left( e^{-ith} \right), \tag{9.117}$$

where for any $a \in B(H)$, the operator $\pi'_\omega(a)$ on $B_2(H)$ is defined by

$$
\pi_{\omega}'(a)b = ba.\tag{9.118}
$$

Note that (9.115) is well defined, since $\rho \ge 0$ and $\rho \in B_1(H)$, whence $\rho^{1/2} \in B_2(H)$, and hence also $ab \in B_2(H)$ and $ba \in B_2(H)$, since $B_2(H)$ is a two-sided ideal in $B(H)$. If $h$ happens to be bounded, we may therefore write

$$h_{\omega} = \pi_{\omega}(h) - \pi_{\omega}'(h). \tag{9.119}$$

Note that the $\pi'_\omega$ term in (9.117) is not needed for (9.52), since $[\pi_\omega(a), \pi'_\omega(b)] = 0$ for any $a, b \in B(H)$, but it *is* necessary to secure (9.51). Another feature of this example is that the vector $\Omega_\omega$ is not only *cyclic* for $\pi_\omega(B(H))$, which it has to be by virtue of the GNS-construction, but also *separating*, i.e., $\pi_\omega(a)\Omega_\omega = 0$ implies $\pi_\omega(a) = 0$. In other words, one has $\omega(a^*a) = 0$ iff $a = 0$ (which is by no means the case for ground states). If $\dim(H) < \infty$, this is obvious, because $\pi_\omega(a)\Omega_\omega = a\rho^{1/2}$ and $\rho^{1/2}$ is invertible. In general, for arbitrary C\*-algebras $A$ we have:

Proposition 9.29. *Let* $\omega$ *be a* KMS *state on* $A$ *at* $\beta \in \mathbb{R}$*. Then* $\Omega_\omega$ *is both cyclic and separating for* $\pi_\omega(A)$ *and hence also for* $\pi_\omega(A)''$ *(as well as for* $\pi_\omega(A)'$*).*

*Proof.* Since $\omega(a^*a) = \|\pi_\omega(a)\Omega_\omega\|^2$, we have $\omega(a^*a) = 0$ iff $\pi_\omega(a)\Omega_\omega = 0$, so that

$$\omega(a^*\alpha_t(a)) = \langle \pi_{\omega}(a)\Omega_{\omega}, \pi_{\omega}(\alpha_t(a))\Omega_{\omega} \rangle = 0 \quad (t \in \mathbb{R})$$

if $\omega(a^*a) = 0$, and hence $F_{a^*,a}(t) = 0$, cf. (9.109). The "edge of the wedge" theorem then gives $F_{a^*,a}(z) = 0$ for all $z \in \mathcal{S}_\beta$, upon which the KMS-condition gives

$$
\omega(aa^*) = F_{a^*,a}(i\beta) = 0.
$$

This means that $\omega(a^*a) = 0$ iff $\omega(aa^*) = 0$, or $\pi_\omega(a)\Omega_\omega = 0$ iff $\pi_\omega(a)^*\Omega_\omega = 0$, and hence $\pi_\omega(b^*)\pi_\omega(a)\Omega_\omega = 0$ iff $\pi_\omega(a^*)\pi_\omega(b)\Omega_\omega = 0$. Since $\Omega_\omega$ is cyclic for $\pi_\omega(A)$, the assumption $\pi_\omega(a)\Omega_\omega = 0$ therefore implies that the bounded operator $\pi_\omega(a^*)$ vanishes on a dense domain in $H_\omega$ and hence vanishes. Since $\pi_\omega(a) = (\pi_\omega(a^*))^*$, it follows that $\pi_\omega(a) = 0$. The extension to $\pi_\omega(A)''$ (and $\pi_\omega(A)'$) is obvious. $\square$

Corollary 9.30. *If* $\omega$ *is a* KMS *state on a quasi-local algebra* $A$*, i.e., given by* (8.130) *with* $\dim(H) < \infty$*, then* $\omega(a^*a) = 0$ *iff* $a = 0$ *and hence the* GNS*-representation* $\pi_\omega : A \to B(H_\omega)$ *is injective.*

*Proof.* By the previous proof, the closed left-ideal (C.204) is actually a two-sided ideal, which must be zero, since $A$ is simple (as is easily shown from the simplicity of $B(H)$ for finite-dimensional $H$, cf. §8.5). $\square$

Proposition 9.29 shows that the von Neumann algebra $\pi_\omega(A)''$ is in standard form (see Definition C.158), so that the KMS condition brings us into the realm of the Tomita–Takesaki theory. In particular, Theorem C.159 provides us with another time-evolution, namely the one given by the modular group. In the situation of Theorem C.159, we take $a \in M_\alpha$ and $b \in M$, and compute

$$
\begin{split}
\langle \Omega, b\,\alpha_{-i}(a)\Omega \rangle &= \langle \Omega, b\Delta a \Delta^{-1} \Omega \rangle = \langle \Omega, b\Delta a \Omega \rangle \\ &= \langle \Delta^{1/2} b^* \Omega, \Delta^{1/2} a \Omega \rangle = \langle J \Delta^{1/2} a \Omega, J \Delta^{1/2} b^* \Omega \rangle \\ &= \langle S a \Omega, S b^* \Omega \rangle = \langle a^* \Omega, b\Omega \rangle \\ &= \langle \Omega, ab \Omega \rangle,
\end{split}
\tag{9.120}
$$

where we used the property $\Delta^{1/2}\Omega = \Omega$ as well as anti-unitarity of $J$, which implies $\langle J\psi, J\varphi \rangle = \langle \varphi, \psi \rangle$; these facts follow from the definitions of $\Delta$ and $J$ via $S$. Therefore, the state $\omega$ on $M$ defined by $\omega(a) = \langle \Omega, a\Omega \rangle$ ($a \in M$) satisfies the KMS-condition for the modular group at $\beta = -1$. If, on the other hand, we start with a $\beta$-KMS state $\omega$ on a C\*-algebra $A$ with respect to some given time-evolution $\alpha_t$, and take $H = H_\omega$, $M = \pi_\omega(A)''$, and $\Omega = \Omega_\omega$, the normal extension of $\omega$ to $\pi_\omega(A)''$ given by $\langle \Omega_\omega, \cdot\, \Omega_\omega \rangle$ still satisfies the KMS condition with respect to the time-evolution on $\pi_\omega(A)''$ given by conjugation with $\exp(ith_\omega)$, as in (9.52). Comparing the latter with the time-evolution on $M$ defined by conjugation with $\Delta^{it}$ (cf. Theorem C.159) gives

$$e^{ith_{\omega}} = \Delta^{-it/\beta},\tag{9.122}$$

since both one-parameter groups of unitary operators satisfy the KMS-condition at $\beta$, and a time-evolution $\alpha_t$ that satisfies the KMS-condition relative to a given state $\omega$ and inverse temperature $\beta$ is unique. To see this (barring technicalities about unbounded operators that are easily dealt with), take $\beta = -1$ for simplicity, assume $\alpha_t$ is conjugation by $\Delta^{it} = \exp(ith)$ (i.e., $\Delta = \exp(h)$), and rewrite (9.112) as

$$
\langle \Omega, ab\, \Omega \rangle = \langle b^* \Omega, \Delta a \Omega \rangle. \tag{9.123}
$$

This determines $\langle \varphi, \Delta\psi \rangle$ between a dense set of vectors $\varphi, \psi$, and hence fixes $\Delta$.

The operators $J$ and $\Delta$ from the Tomita–Takesaki theory can explicitly be computed in the example (9.113); the antilinear operator $J : B_2(H) \to B_2(H)$ reads

$$Jb = b^*,\tag{9.124}$$

so that the (antilinear) isomorphism $a \mapsto JaJ$ between $\pi_\omega(A)'' = B(H)$ (where $B(H)$ acts on $B_2(H)$ by left multiplication) and its commutant $\pi_\omega(A)' = B(H)$ (which copy of $B(H)$ now acts on $B_2(H)$ by right multiplication) is given by $JaJb = ba^*$. Furthermore, the (generally unbounded) linear operator $\Delta : B_2(H) \to B_2(H)$ is given by

$$
\Delta b = \rho b \rho^{-1},
\tag{9.125}
$$

which strictly speaking is defined as the closure of the expression (9.125) on the domain of all $b \in B_2(H)$ for which $b\rho^{-1/2} \in B(H)$.
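As a consistency check on (9.124) and (9.125), one may verify the defining relation $S\,\pi_\omega(a)\Omega_\omega = \pi_\omega(a)^*\Omega_\omega$ of the Tomita operator $S = J\Delta^{1/2}$ in this example. The sketch below assumes $\dim(H) < \infty$, so that all domain questions disappear. With $\Omega_\omega = \rho^{1/2}$ and $\Delta^{1/2} b = \rho^{1/2} b \rho^{-1/2}$, one finds for any $a \in B(H)$:

$$
J\Delta^{1/2}\left(a\rho^{1/2}\right) = J\left(\rho^{1/2} a \rho^{1/2} \rho^{-1/2}\right) = J\left(\rho^{1/2} a\right) = a^* \rho^{1/2} = \pi_\omega(a^*)\Omega_\omega.
$$

Similarly, $\Delta^{it} b = \rho^{it} b \rho^{-it}$, so for $\rho$ given by (9.113) one gets $\Delta^{-it/\beta} b = e^{ith} b e^{-ith}$, which matches (9.117) and (9.122).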

Theorem 9.31. *For given unital C\*-algebra* $A$*, dynamics* $\alpha : \mathbb{R} \to \mathrm{Aut}(A)$*, and inverse temperature* $\beta \in \mathbb{R}$*, let* $S_\beta(A)$ *be the compact convex set of* KMS *states. Then*

$$
\partial_{e} S_{\beta}(A) = S_{\beta}(A) \cap S_{p}(A), \tag{9.126}
$$

*where* $S_p(A)$ *is the set of primary states on* $A$ *(cf. Definition 8.17). Consequently, extreme* KMS *states at fixed inverse temperature* $\beta$ *are either equal or disjoint.*

This suggests that extreme KMS states define *pure thermodynamic phases*.

*Proof.* We enlarge $S_\beta(A)$ to the set $\hat{K}_\beta(A) \subset A^*$ of all continuous linear functionals on $A$ that satisfy the $\beta$-KMS condition (so that $S_\beta(A)$ consists of all positive elements in $\hat{K}_\beta(A)$ of unit norm). The key to the proof is a bijection between the set $S(\omega)$ of functionals $\rho \in \hat{K}_\beta(A)$ for which $0 \le \rho \le \omega$, where $\omega \in S_\beta(A)$ is fixed, and the set $T(\omega)$ of operators $c \in \pi_\omega(A)' \cap \pi_\omega(A)''$ such that $0 \le c \le 1_{H_\omega}$, given by

$$
\rho(a) = \langle \Omega_{\omega}, c\,\pi_{\omega}(a)\Omega_{\omega}\rangle. \tag{9.127}
$$

This implies the claim, since $\omega \in \partial_e S_\beta$ iff any $\rho \in S(\omega)$ takes the form $\rho = t\omega$ for some $t \in [0,1]$ (cf. Lemma C.17), which in turn is the case iff $c = t \cdot 1_{H_\omega}$.

First, for any state $\omega \in S(A)$ there is a bijection between the set of linear functionals $\rho \in A^*$ for which $0 \le \rho \le \omega$ and the set of operators $c \in \pi_\omega(A)'$ such that $0 \le c \le 1_{H_\omega}$, given by (9.127). Indeed, in one direction, given $a = b^*b \ge 0$, we have

$$(\omega - \rho)(a) = \langle \pi_{\omega}(b)\Omega_{\omega}, (1_{H_{\omega}} - c)\pi_{\omega}(b)\Omega_{\omega}\rangle \ge 0,\tag{9.128}$$

for if $0 \le c \le 1_{H_\omega}$, then $0 \le (1_{H_\omega} - c) \le 1_{H_\omega}$. Hence $\rho \le \omega$, whilst from (9.127) we similarly find $\rho \ge 0$. Conversely, $\rho$ induces a quadratic form $R$ on $H_\omega$, defined initially on the dense domain $\pi_\omega(A)\Omega_\omega$ by the formula

$$R(\pi_{\omega}(a)\Omega_{\omega}, \pi_{\omega}(b)\Omega_{\omega}) = \rho(a^*b),\tag{9.129}$$

which is easily seen to be well defined, positive, and bounded, and so Proposition B.79 supplies the operator $c$, which a simple computation shows to be in $\pi_\omega(A)'$.

For the bijection $S(\omega) \cong T(\omega)$, where $\omega$ is a $\beta$-KMS state as above, we therefore need the additional property $c \in \pi_\omega(A)''$. Putting $\beta = -1$ for convenience and using the notation of Theorem C.159, we first show that $\Delta^{-it}c\Delta^{it} = c$ for any $t \in \mathbb{R}$: indeed, since $\rho$ satisfies the KMS condition, it is time-translation invariant, so that

$$
\begin{split}
\langle \pi_{\omega}(a^*)\Omega_{\omega},\Delta^{-it}c\Delta^{it}\pi_{\omega}(b)\Omega_{\omega}\rangle &= \langle \Omega_{\omega},c\,\Delta^{it}\pi_{\omega}(a)\Delta^{-it}\Delta^{it}\pi_{\omega}(b)\Delta^{-it}\Omega_{\omega}\rangle \\&= \langle \Omega_{\omega},c\,\pi_{\omega}(\alpha_{t}(ab))\Omega_{\omega}\rangle \\&= \rho(\alpha_{t}(ab)) = \rho(ab) \\&= \langle \pi_{\omega}(a^*)\Omega_{\omega},c\,\pi_{\omega}(b)\Omega_{\omega}\rangle,
\end{split}
$$

so that $\Delta^{-it}c\Delta^{it} = c$ between a dense set of vectors, and hence this is valid as an operator equation. This also implies that $c$ commutes with any power of $\Delta$. Define $c' = JcJ$, which by Theorem C.159 is an element of $\pi_\omega(A)''$, and compute

$$
\begin{split}
\langle \Omega_{\omega}, \pi_{\omega}(a)c'\Omega_{\omega}\rangle &= \langle \Omega_{\omega}, \pi_{\omega}(a)Jc\Delta^{1/2}\Omega_{\omega}\rangle = \langle \Omega_{\omega}, \pi_{\omega}(a)J\Delta^{1/2}c\,\Omega_{\omega}\rangle \\ &= \langle \Omega_{\omega}, \pi_{\omega}(a)Sc\,\Omega_{\omega}\rangle = \langle \Omega_{\omega}, \pi_{\omega}(a)c^{*}\Omega_{\omega}\rangle \\ &= \langle \Omega_{\omega}, \pi_{\omega}(a)c\,\Omega_{\omega}\rangle \\ &= \rho(a),
\end{split}
\tag{9.130}
$$

where we used the properties $J\Omega_\omega = \Omega_\omega$, $\Delta^{1/2}\Omega_\omega = \Omega_\omega$, $c\Delta^{1/2} = \Delta^{1/2}c$ as just mentioned, $S = J\Delta^{1/2}$, and $c^* = c$ (since $c \ge 0$). Finally, it follows from the KMS condition (applied to the normal extension of the state $\omega$ to $\pi_\omega(A)''$ given by $\langle \Omega_\omega, \cdot\, \Omega_\omega \rangle$ as well as to the normal extension of $\rho$ to $\pi_\omega(A)''$ given by $\langle \Omega_\omega, \cdot\, c'\Omega_\omega \rangle$ just computed) that $c' \in \pi_\omega(A)'$, since for arbitrary $a, b, d \in A_\alpha$ we have

$$\begin{aligned} \omega(ac'bd) &= \omega(\alpha_i(bd)ac') = \rho(\alpha_i(bd)a) = \rho(\alpha_i(b)\alpha_i(d)a) \\ &= \rho(\alpha_i(d)ab) = \omega(\alpha_i(d)abc') = \omega(abc'd). \end{aligned}$$

In other words, for any $a, b, d \in A$ we have

$$
\langle \pi_{\omega}(a^*)\Omega_{\omega}, c' \pi_{\omega}(b)\pi_{\omega}(d)\Omega_{\omega} \rangle = \langle \pi_{\omega}(a^*)\Omega_{\omega}, \pi_{\omega}(b)c'\pi_{\omega}(d) \Omega_{\omega} \rangle,\tag{9.131}
$$

so that $c'\pi_\omega(b) = \pi_\omega(b)c'$ between vectors in a dense domain, so that this is an operator equality. Hence $c' \in \pi_\omega(A)'$, and in view of this we may rewrite (9.130) as $\rho(a) = \langle \Omega_\omega, c'\pi_\omega(a)\Omega_\omega \rangle$. Since the operator $c \in \pi_\omega(A)'$ in (9.127) is uniquely determined by $\rho$, this shows that $c' = c$. Since we already had $c' \in \pi_\omega(A)''$, it follows that $c \in \pi_\omega(A)' \cap \pi_\omega(A)''$. $\square$

It can also be shown that $S_\beta(A)$ is a (Choquet) *simplex*, which is a property rather more typical of the state space of a *commutative* unital C\*-algebra; this makes it especially remarkable for the set of $\beta$-KMS states on a highly *non-commutative* C\*-algebra like the infinite tensor product of $B = M_n(\mathbb{C})$. In the physically relevant case where $S_\beta(A)$ is metrizable, this implies that for any given KMS state $\omega \in S_\beta(A)$ there is a *unique* probability measure $\mu$ on $\partial_e S_\beta(A)$, such that for each $a \in A$,

$$\omega(a) = \int_{\partial_e S_{\beta}(A)} d\mu(\omega') \,\omega'(a). \tag{9.132}$$

Conversely, any probability measure $\mu$ on $\partial_e S_\beta(A)$ *defines* a $\beta$-KMS state by reading this equality from right to left. Towards the next chapter, suppose for example that there is a $G$-action on $A$, i.e., a continuous homomorphism $\gamma : G \to \mathrm{Aut}(A)$ (where $G$ is a locally compact group). Then $G$ also acts on $S(A)$ via the dual maps $\gamma_g^*(\omega) = \omega \circ \gamma_{g^{-1}}$, and if $G$ is a symmetry of the dynamics in that $\alpha_t \circ \gamma_g = \gamma_g \circ \alpha_t$ for each $t \in \mathbb{R}$ and $g \in G$, then this dual action maps both $S_\beta(A)$ and $\partial_e S_\beta(A)$ into themselves. If $G$ is compact with normalized Haar measure $\mu$, then for any fixed extremal KMS state $\omega_0 \in \partial_e S_\beta(A)$, by (left) invariance of $\mu$ one obtains a $G$-invariant state by

$$
\omega = \int_G d\mu(g) \, \gamma_{g}^* \omega_0. \tag{9.133}
$$
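The invariance claimed for (9.133) can be made explicit in one line; the following sketch uses only the definition $\gamma_g^*(\omega) = \omega \circ \gamma_{g^{-1}}$, which gives $\gamma_h^* \gamma_g^* = \gamma_{hg}^*$, and left invariance of the Haar measure. For any $h \in G$,

$$
\gamma_h^* \omega = \int_G d\mu(g)\, \gamma_h^* \gamma_g^* \omega_0 = \int_G d\mu(g)\, \gamma_{hg}^* \omega_0 = \int_G d\mu(g')\, \gamma_{g'}^* \omega_0 = \omega,
$$

where $g' = hg$. Since each $\gamma_g^* \omega_0$ is again a $\beta$-KMS state when $G$ is a symmetry of the dynamics, the average $\omega$ is a $G$-invariant $\beta$-KMS state, though in general no longer an extremal one.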

#### Notes

## §9.1. Symmetries of C\*-algebras and Hamhalter's Theorem

Theorem 9.4 is due to Hamhalter (2011). Our proof, taken almost *verbatim* from Landsman & Lindenhovius (2016), roughly follows his, but adds various details and also takes some different turns. The main differences with the original proof by Hamhalter are the following. Firstly, we give an order-theoretic characterization of u.s.c. decompositions of the form $\pi_K$ (and hence of the commutative algebras in $\mathcal{C}(C(X))$ that are the unitization of some ideal) by the three axioms stated in Lemma 3.1.1 in Firby (1973), whereas Hamhalter uses Proposition 7 in Mendivil (1999), which gives a different characterization of unitizations of ideals. Furthermore, Hamhalter only treats Lemma 9.5 in full generality, whereas in our opinion it is very instructive to take the case of finite sets first, where many of the key ideas already appear in a setting where they are not overshadowed by topological complications. Finally, our proof of Lemma 9.6.2 differs from Hamhalter's proof. The topology of partitions may be found in Willard (1970), especially Theorem 9.9.

Theorem 9.7 is due to Hamhalter (2015). Corollary 9.9 has a long history, starting with Jacobson & Rickart (1950) and ending with Thomsen (1982).

## §9.2. Unitary implementability of symmetries

See Bratteli & Robinson (1987), §4.3.

## §9.3. Motion in space and in time

For a far more detailed study of asymptotic abelianness see Bratteli & Robinson (1987), §4.3.2 and Bratteli & Robinson (1997), §5.4.1. Results like Theorem 9.14 may also be found in Sewell (2002). Theorem 9.14 is also valid for *ergodic states* with respect to the given Z<sup>d</sup>-action, where we say that a state on a C\*-algebra *A* with *G*-action is ergodic if it is an element of ∂<sub>e</sub>(*S*(*A*)<sup>G</sup>), i.e., extreme in the convex set of *G*-invariant states on *A*. Also Theorem 9.15 holds (with a more complicated proof, of course) under weaker conditions on Φ, typically exponential decay in *X*.

Theorem 9.15 is the simplest result in this direction; for similar results under weaker assumptions on the interaction Φ, see Bratteli & Robinson (1997), §6.2.1.

## §9.4. Ground states of quantum systems

The idea of a ground state of a quantum system may be attributed to Bohr (1913), who postulated that an atom has a state of lowest energy (which he called a "permanent state"). See e.g. Pais (1986), p. 199. In this section, which merely presents some key points treated in far more detail in Bratteli & Robinson (1997), §5.3.3 and §6.2.7, we have just scratched the surface of the topic, which is basic to physics.

#### §9.5. Ground states and equilibrium states of classical spin systems

Basic references for the mathematical physics of classical spin systems on a lattice are Israel (1979), Simon (1993), van Enter, Fernandez, & Sokal (1993), and Georgii (2011). One may now define pure thermodynamic phases as extreme elements of the compact convex set of all Gibbs measures (or of the set of all translation-invariant Gibbs measures, as in Simon, 1993, §III.5), but there is no identification of pure thermodynamic phases with primary equilibrium states (as in the quantum case), because a state on a commutative C\*-algebra like *C*(*n*<sup>Z<sup>d</sup></sup>) is primary iff it is pure. Fortunately, the specific measure-theoretic setting of classical statistical mechanics provides its own resources. For any Λ ⊂ Z<sup>d</sup>, let Σ<sub>Λ</sub> be the smallest σ-algebra (within the Borel σ-algebra for *n*<sup>Z<sup>d</sup></sup>) for which each *f* ∈ *C*(*n*<sup>Λ</sup>) is measurable, and let

$$
\Sigma\_{\infty} = \bigcap\_{\Lambda} \Sigma\_{\Lambda},\tag{9.134}
$$

where each Λ is *finite*, be the σ*-algebra at infinity*, with associated commutative C\*-algebra B<sub>∞</sub>(*n*<sup>Z<sup>d</sup></sup>) of all bounded measurable functions on *n*<sup>Z<sup>d</sup></sup> that are Σ<sub>∞</sub>-measurable. This is the home of the macroscopic observables, defined as averages analogously to the quantum case. The role of primary states (or rather of states whose algebra of observables is trivial at infinity, as in Theorem 8.23) is now played by *states that are trivial at infinity*, that is, probability measures μ on *n*<sup>Z<sup>d</sup></sup> for which either μ(*X*) = 0 or μ(*X*) = 1 for *X* ∈ Σ<sub>∞</sub> (cf. the Kolmogorov 0-1 law of probability theory). Indeed there is a classical version of Theorem 8.23, making exactly the same claim *mutatis mutandis*, see Theorem III.1.6 in Simon (1993). The main result (cf. Theorem 7.7 in Georgii, 2011) is that a state is extreme in the compact convex set of all Gibbs measures (at fixed temperature and potential, of course) iff it is a Gibbs measure that is trivial at infinity. It follows that two distinct extreme Gibbs measures are mutually singular on Σ<sub>∞</sub> (which is the pertinent classical version of disjointness of primary states).

#### §9.6. Equilibrium (KMS) states of quantum systems

The KMS condition was introduced by Haag, Hugenholtz, and Winnink (1967), in the following equivalent form:

$$\int\_{-\infty}^{\infty} dt \, f(t - i\beta) \, \omega(a\alpha\_t(b)) = \int\_{-\infty}^{\infty} dt \, f(t) \, \omega(\alpha\_t(b)a),\tag{9.135}$$

for each *a*,*b* ∈ *A* and each Schwartz function *f* ∈ D(R). The name KMS derives from the earlier observation (9.102) of Kubo (1957) and independently Martin & Schwinger (1957). See also Haag (1992), Simon (1993), Borchers (2000), Sewell (2002), Thirring (2002), Emch (2007), and perhaps also, at a heuristic level, Landsman & van Weert (1987), especially for applications of the KMS condition to quantum field theory at finite temperature and the quark-gluon plasma (this, incidentally, was the MSc thesis as well as the first major published paper by the author).

The KMS condition also plays a major role in operator algebras and noncommutative geometry; see Connes (1994) and Connes & Marcolli (2008).

For a proof of (9.101) see Bratteli & Robinson (1997, Lemma 6.2.21); this book is the bible about the KMS condition and its application to quantum spin systems.

The proof of Proposition 9.25 is taken from Simon (1993), Lemma IV.4.1 and Proposition IV.4.2. The terminology of pure thermodynamical phases for primary KMS states (introduced after Theorem 9.31) is not completely standard; also ergodic states are sometimes called 'pure phases'.

## Chapter 10 Spontaneous Symmetry Breaking

As we shall see, the undeniable natural phenomenon of *spontaneous symmetry breaking* (SSB) seems to indicate a serious mismatch between theory and reality. This mismatch is well expressed by what is sometimes called *Earman's Principle*:

'While idealizations are useful and, perhaps, even essential to progress in physics, a sound principle of interpretation would seem to be that no effect can be counted as a genuine physical effect if it disappears when the idealizations are removed.' (Earman, 2004, p. 191)

To describe the various examples apparently violating Earman's Principle (and hence the link between theory and reality) in a general way (so general even that it will encapsulate the measurement problem), it is convenient to install a definition:

#### Definition 10.1. Asymptotic emergence *is the conjunction of three conditions:*


In connection with SSB (as item 3.) we will look at the following pairs (H,L):

	- L is quantum mechanics (on the pertinent Hilbert space *L*2(R));
	- The limiting relationship between the two theories is as described in §7.1 (notably by the continuous bundle of C\*-algebras (7.17) - (7.19) for *n* = 1).
	- L is statistical mechanics of a quantum spin system on a *finite* lattice;
	- Their limiting relationship is as described in §8.6 (cf. Theorem 8.4).
	- L is statistical mechanics of a quantum spin system on a *finite* lattice;
	- The limiting relationship between H and L is given in §8.6 (cf. Theorem 8.8).

Of course, there are many other interesting examples of (apparent) asymptotic emergence not treated in this book, such as geometric optics (as H) versus wave optics (as L), where the new feature of H would be the *absence* of interference of light rays (foreshadowing the measurement problem of quantum mechanics!), or hydrodynamics (as H) versus molecular dynamics (as L), where the new feature is irreversibility. Perhaps space-time asymptotically emerges from quantum gravity.

The "unexplained" features of H mentioned in the third part of Definition 10.1 are often called *emergent*, although this term has to be used with great care. Its meaning here reflects the original use of the term by the so-called "British Emergentists" (whose pioneer was J.S. Mill), as expressed in 1925 by C.D. Broad:

'The characteristic behaviour of the whole *could* not, even in theory, be deduced from the most complete knowledge of the behaviour of its components, taken separately or in other combinations, and of their proportions and arrangements in this whole. This is what I understand by the 'Theory of Emergence'. I cannot give a conclusive example of it, since it is a matter of controversy whether it actually applies to anything.' (Broad, 1925, p. 59)

In quotations like these, the notion "emergence" is meant to be the very opposite of the idea of "reduction" (or "mechanicism", as Broad called it); in fact, for many authors this opposition seems to be the principal attraction of emergence. In principle, two rather different notions of reduction then lead (contrapositively) to two different kinds of emergence, which are sometimes mixed up but should be distinguished:


In older literature concerned with the reduction of biology to chemistry (challenged by Mill) and of chemistry to physics (still contested by Broad), the first notion also referred to wholes consisting of a *small* number of particles. That notion of emergence seems a lost cause, since, as noted by Hempel,

'the properties of hydrogen include that of forming, if suitably combined with oxygen, a compound which is liquid, transparent, etc.' (Hempel, 1965, p. 260)

A similar comment applies to e.g. the tertiary structure of proteins, but also to cases of emergence such as ant hills, slime mold, and even large cities (Johnson, 2001), all of which are actually fascinating success stories for *reductionism*.

More recently, the apparent possibility that *very large* assemblies of parts might give rise to emergent properties of the corresponding wholes has become increasingly popular, both in physics and in the philosophy of mind (where consciousness has been proposed as an emergent property of the brain). In physics, the modern discussion on emergence was initiated by P.W. Anderson, who in a famous essay from 1972 called 'More is different' emphasized the possibility of emergence in very large systems (surprisingly, Anderson actually avoids the term 'emergence', instead speaking of 'new laws' and 'a whole new conceptual structure'). In particular, Anderson claimed SSB to be an example (if not *the* example) of emergence, duly adding that one really had to take the *N* → ∞ limit. Thus at least in physics, the interesting case for emergence in the first (i.e. whole-part) sense arises if the 'whole' is strictly infinite, as in the thermodynamic limit of quantum statistical mechanics. This example confirms that 1. and 2. often go together, but they do not always do so: the classical limit of quantum mechanics is a case of pure theory reduction.

A clear description of emergence has also been given by Jaegwon Kim:


Similarly, Silberstein (2002) states (paraphrased) that a higher-level theory H:

'bears predictive/explanatory emergence with respect to some lower-level theory L if L cannot *replace* H, if H cannot be *derived* from L [i.e., L cannot reductively explain H], or if L cannot be shown to be *isomorphic* to H.'

A key point here is Kim's no. 1: not even "emergentists" deny that the whole consists of its parts, or, in asymptotic emergence, that the higher-level theory H in fact originates from the lower-level theory L. The essence of emergence, then, would be that H nonetheless has "acquired" properties *not* reducible to L. One possibility for this to happen could be that the (allegedly) emergent property of H refers to some concept that does not even make sense in L, such as the experience of pain, which is hard to make sense of at a neural level, but another possibility, which is indeed the one relevant to physics and especially to SSB, is that some particular concept possessed by H (such as SSB) is admittedly *defined* within L, but *banned*.

In describing the relationship between H and L we have to be clear about the difference between *approximations* and *idealizations*. Following Norton (2012):


Thus idealizations also provide approximations, but as systems they stand on their own and are defined independently of the target system. In our cases, the target system is a real physical system such as a ferromagnet or a quantum particle, which is supposed to be described exactly by theory L, i.e., the lower-level theory. In fact, L is a family of theories parametrized by 1/*N* (*N* ∈ N) or *h*¯ ∈ (0,1], and our real material relates to some very small value of this parameter (which may also be seen as a certain regime of L, seen as a single, unparametrized theory).

The pertinent theory H is an idealization in the above sense, through which one approximates very large systems by infinite ones and highly semi-classical ones (where ħ is very small) by classical ones (where ħ = 0). It is in this setting that asymptotic emergence would violate Earman's Principle and hence would blast the relationship between theory and reality: the abstract point (made concrete for SSB earlier on) is that if some real property of a real system is described by H but is not approximated in any sense by L in any regime (as is the threat with SSB), although H is supposed to be a limit of L, then the latter theory L fails to describe the real system it is supposed to describe, whereas this system *is* described by the theory H, which portrays fictitious systems. This marks a difference with other cases of emergence, where H (including some "whole") is not an idealization but a real system itself (as might be the case with consciousness and other examples from neuroscience and the philosophy of mind). Thus our discussion does not apply to such cases.

The tension between SSB and Earman's Principle has not quite gone unnoticed in the philosophy of physics literature. For example, Liu and Emch (2005) first write that it is a mistake to regard idealizations as acts of '*neglecting the negligible*' (p. 155, which already appears to deny Earman's Principle), and continue by:

"The broken symmetry in question is *not reducible* to the configurations of the microscopic parts of any *finite* systems; but it should *supervene* on them in the sense that for any two systems that have the exactly (sic) duplicates of parts and configurations, both will have the same spontaneous symmetry breaking in them because both will behave identically in the limit. In other words, the result of the macroscopic limit is determined by the non-relational properties of parts of the finite system in question." (Liu & Emch, 2005, p. 156)

It is not easy to make sense of this, but the authors genuinely seem to believe in asymptotic emergence and hence they (again) appear to deny Earman's Principle. Another suggestion, made by Ruetsche, is to modify Earman's Principle to:

'No effect predicted by a non-final theory can be counted as a genuine physical effect if it disappears from that theory's successors.' (Ruetsche, 2011, p. 336)

For example, the theory L explaining SSB should not be quantum statistical mechanics but quantum field theory (which has an infinite number of ultraviolet degrees of freedom even in finite volume, and hence in principle allows SSB). This does make sense within physics, but, as Ruetsche herself notices, her principle 'has the pragmatic shortcoming that we can't apply it until we know what (all) successors to our present theories are.' With due respect, we will describe a rather different way out, based on unexpectedly implementing *Butterfield's Principle*, which is a corollary to Earman's Principle that removes the reduction-emergence opposition:

'there is a weaker, yet still vivid, novel and robust behaviour that occurs before we get to the limit, i.e. for finite *N*. And it is this weaker behaviour which is physically real.' (Butterfield, 2011, p. 1065)

To do so, we now turn our attention to specific (classes of) models of SSB.

#### 10.1 Spontaneous symmetry breaking: The double well

The simplest example of SSB is undoubtedly the equation *<sup>x</sup>*<sup>2</sup> <sup>=</sup> 1 (where *<sup>x</sup>* <sup>∈</sup> <sup>C</sup>), which is invariant under a Z<sup>2</sup> symmetry given by *x* → −*x*. Its solutions *x* = ±1, then, do not share this symmetry; instead Z<sup>2</sup> acts nontrivially on the solution space.

Another example that is simple at least compared to quantum spin systems is provided by elementary quantum mechanics. Thus we are now in the context of the first of the three pairs (H,L) listed in the preamble to this chapter, where, in detail:


At the level of states, the passage to the classical limit ħ → 0 of any ħ-dependent wave-function ψ<sub>ħ</sub> ∈ *L*<sup>2</sup>(R), if it exists, is described via the associated probability measure μ<sub>ψ<sub>ħ</sub></sub> on R<sup>2</sup>, which is defined by (7.31); in other words,

$$
\mu\_{\psi\_{\hbar}}(\Delta) = \int\_{\Delta} \frac{d^n p \, d^n q}{2\pi \hbar} |\langle \phi\_{\hbar}^{(p,q)}, \psi\_{\hbar} \rangle|^2 \ (\Delta \subset \mathbb{R}^{2n}), \tag{10.1}
$$

where the (Schrödinger) *coherent states* φ<sub>ħ</sub><sup>(p,q)</sup> ∈ *L*<sup>2</sup>(R) are given by (7.27), i.e.,

$$\phi\_{\hbar}^{(p,q)}(\mathbf{x}) = (\pi\hbar)^{-n/4} e^{-ipq/2\hbar} e^{ip\mathbf{x}/\hbar} e^{-(\mathbf{x}-q)^2/2\hbar}.\tag{10.2}$$

In terms of the associated vector states ω<sub>ψ<sub>ħ</sub></sub> on the C\*-algebra *B*<sub>0</sub>(*L*<sup>2</sup>(R)), one has

$$\omega\_{\psi\_{\hbar}}(Q\_{\hbar}^{B}(f)) = \langle \psi\_{\hbar}, Q\_{\hbar}^{B}(f)\psi\_{\hbar} \rangle = \int\_{\mathbb{R}^{2n}} d\mu\_{\psi\_{\hbar}}(p, q) \, f(p, q), \tag{10.3}$$

where *f* ∈ *C*<sub>0</sub>(R<sup>2</sup>). We then say that the wave-functions ψ<sub>ħ</sub> have a classical limit if

$$\lim\_{\hbar \to 0} \int\_{\mathbb{R}^{2n}} d\mu\_{\psi\_{\hbar}} f = \int\_{\mathbb{R}^{2n}} d\mu\_0 f,\tag{10.4}$$

for any *f* ∈ *C*<sub>0</sub>(R<sup>2</sup>), where μ<sub>0</sub> is some probability measure on R<sup>2</sup>. Seen as a state ω<sub>0</sub> on the classical C\*-algebra of observables *C*<sub>0</sub>(R<sup>2</sup>), the probability measure μ<sub>0</sub> is regarded as the classical limit of the family ω<sub>ψ<sub>ħ</sub></sub> of states on the C\*-algebra *B*<sub>0</sub>(*L*<sup>2</sup>(R)) of quantum-mechanical observables. This family is continuous in the sense that the function ħ → ω<sub>ψ<sub>ħ</sub></sub>(σ(ħ)) from [0,1] to C is continuous for every continuous cross-section σ of the given bundle of C\*-algebras. An example of such a continuous cross-section is σ(0) = *f* and σ(ħ) = *Q*<sup>B</sup><sub>ħ</sub>(*f*), for any *f* ∈ *C*<sub>0</sub>(R<sup>2</sup>), cf. (C.550) - (C.551), and indeed this example reproduces (10.4), which after all is just

$$\lim\_{\hbar \to 0} \omega\_{\psi\_{\hbar}}(Q\_{\hbar}^{B}(f)) = \omega\_{0}(f) \ (f \in C\_0(\mathbb{R}^2)). \tag{10.5}$$

First, let us illustrate this formalism for the ground state of the one-dimensional harmonic oscillator. Taking *m* = 1/2 and *V*(*x*) = ½ω<sup>2</sup>*x*<sup>2</sup> in the usual Hamiltonian

$$h\_{\hbar} = -\hbar^2 \frac{d^2}{d\mathbf{x}^2} + V(\mathbf{x}),\tag{10.6}$$

it is well known that the ground state is unique and that its wave-function, i.e.,

$$
\psi\_{\hbar}^{(0)}(x) = \left(\frac{\omega}{2\pi\hbar}\right)^{1/4} e^{-\omega x^2/4\hbar},\tag{10.7}
$$

is a Gaussian, peaked above *x* = 0. As *h*¯ → 0, this ground state has a classical limit, namely the Dirac measure μ<sup>0</sup> concentrated at the origin (*p* = 0,*q* = 0), i.e.,

$$\lim\_{\hbar \to 0} \int\_{\mathbb{R}^{2n}} d\mu\_{\psi\_{\hbar}} f = f(0,0) \ (f \in C\_0(\mathbb{R}^2)). \tag{10.8}$$

This is just the unique ground state of the corresponding classical Hamiltonian

$$h\_0(p,q) = p^2 + V(q),\tag{10.9}$$

seen as a point in the phase space R<sup>2</sup> minimizing *h*<sub>0</sub>, reinterpreted as a probability measure on phase space as explained in the context of Theorem 3.3. Note that we kept the mass fixed at *m* = 1/2, but we could instead have kept ħ fixed and taken the limit *m* → ∞ rather than ħ → 0; cf. the preamble to Chapter 7.
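The concentration expressed by (10.8) can be checked numerically. The following is a minimal sketch in Python with NumPy (not part of the original text), which evaluates the integrand of (10.1) for *n* = 1 by direct quadrature, using the coherent states (10.2) and the Gaussian (10.7). The function names, the grid, and the choice ω = 2 are illustrative assumptions; for ω = 2 the Gaussian (10.7) coincides with the coherent state at the origin, so that the density should equal exp(−(*p*² + *q*²)/2ħ)/(2πħ), which provides an independent check.

```python
import numpy as np

def coherent_state(x, p, q, hbar):
    """Coherent state (10.2) for n = 1, sampled on the grid x."""
    return ((np.pi * hbar) ** -0.25
            * np.exp(-1j * p * q / (2 * hbar))
            * np.exp(1j * p * x / hbar)
            * np.exp(-(x - q) ** 2 / (2 * hbar)))

def oscillator_ground_state(x, omega, hbar):
    """Gaussian ground state (10.7)."""
    return (omega / (2 * np.pi * hbar)) ** 0.25 * np.exp(-omega * x ** 2 / (4 * hbar))

def husimi_density(psi, x, p, q, hbar):
    """Integrand of (10.1): |<phi^(p,q), psi>|^2 / (2 pi hbar), via a Riemann sum."""
    dx = x[1] - x[0]
    overlap = np.sum(np.conj(coherent_state(x, p, q, hbar)) * psi) * dx
    return abs(overlap) ** 2 / (2 * np.pi * hbar)

# Illustrative grid (an assumption): wide and fine enough for hbar >= 0.01.
x = np.linspace(-8.0, 8.0, 8001)

def relative_density(hbar, omega=2.0, radius=0.5):
    """Density at the point (p, q) = (0, radius) divided by the density at the origin."""
    psi = oscillator_ground_state(x, omega, hbar)
    return husimi_density(psi, x, 0.0, radius, hbar) / husimi_density(psi, x, 0.0, 0.0, hbar)

for hbar in (0.5, 0.1, 0.01):
    # The ratio shrinks as hbar decreases: the measure piles up at the origin.
    print(hbar, relative_density(hbar))
```

The ratio printed in the loop compares the density away from the origin with its peak value; its decay as ħ decreases is the numerical counterpart of convergence to the Dirac measure in (10.8).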

The same features hold for the *an*harmonic oscillator (with small λ > 0), i.e.,

$$V(x) = \frac{1}{2}\omega^2 x^2 + \frac{1}{4}\lambda x^4. \tag{10.10}$$

However, a new situation arises for the symmetric double-well potential

$$V(x) = -\frac{1}{2}\omega^2 x^2 + \frac{1}{4}\lambda x^4 + \frac{1}{4}\omega^4/\lambda = \frac{1}{4}\lambda(x^2 - a^2)^2,\tag{10.11}$$

where *a* = ω/√λ > 0 (assuming ω > 0 as well as λ > 0). This time, the ground state of the classical Hamiltonian is doubly degenerate, being given by the points (*p* = 0, *q* = ±*a*) ∈ R<sup>2</sup>, with ensuing Dirac measures μ<sub>0</sub><sup>±</sup> given by

$$\int\_{\mathbb{R}^{2n}} d\mu\_0^{\pm} \, f = f(0, \pm a). \tag{10.12}$$

But it is a deep and counterintuitive fact of quantum theory that the corresponding quantum Hamiltonian (10.6) with (10.11) has a unique ground state. Indeed:

Theorem 10.2. *Let V* ∈ *L*<sup>2</sup><sub>loc</sub>(R<sup>m</sup>) *be positive and suppose that* lim<sub>|x|→∞</sub> *V*(*x*) = ∞*. Then* −Δ + *V has a nondegenerate (and strictly positive) ground state.*

Roughly speaking, the proof is based on an infinite-dimensional version of the Perron–Frobenius Theorem in linear algebra (applied to exp(−*t h*<sub>ħ</sub>) rather than to the Hamiltonian *h*<sub>ħ</sub> itself, so that the largest eigenvalue of the former corresponds to the smallest eigenvalue of the latter, i.e., the energy of the ground state).

And yet there are two quantum-mechanical shadows of the classical degeneracy:


In what follows, we will be especially interested in the first excited state ψ<sub>ħ</sub><sup>(1)</sup>, which like ψ<sub>ħ</sub><sup>(0)</sup> is real, but has one peak *above x* = *a* and another peak *below x* = −*a*. See Figure 10.1. The eigenvalue splitting (or "gap") vanishes exponentially in −1/ħ like

$$\Delta\_{\hbar} \equiv E\_{\hbar}^{(1)} - E\_{\hbar}^{(0)} \sim (\hbar \omega / \sqrt{\tfrac{1}{2} e \pi}) \cdot e^{-d\_V/\hbar} \ (\hbar \to 0),\tag{10.13}$$

where the typical WKB-factor is given by

$$d\_V = \int\_{-a}^{a} d\mathbf{x} \sqrt{V(\mathbf{x})}.\tag{10.14}$$
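
For the quartic double well (10.11) the integral (10.14) can be done in closed form. As a short worked check, using only (10.11) and the fact that *x*<sup>2</sup> ≤ *a*<sup>2</sup> on the integration interval (so that the square root of *V* equals ½√λ (*a*<sup>2</sup> − *x*<sup>2</sup>) there):

$$
d\_V = \int\_{-a}^{a} dx \, \tfrac{1}{2}\sqrt{\lambda}\,(a^2 - x^2) = \tfrac{1}{2}\sqrt{\lambda}\left(2a^3 - \tfrac{2}{3}a^3\right) = \tfrac{2}{3}\sqrt{\lambda}\, a^3 = \frac{2\omega^3}{3\lambda}.
$$

For instance, with the illustrative values ω = λ = 1 one finds *d*<sub>V</sub> = 2/3, so that the exponential factor in (10.13) is exp(−2/(3ħ)).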

Also, the probability density of each of the wave-functions ψ<sub>ħ</sub><sup>(0)</sup> or ψ<sub>ħ</sub><sup>(1)</sup> contains approximate δ-function peaks above *both* classical minima ±*a*. See Figure 10.2, displayed just for ψ<sub>ħ</sub><sup>(0)</sup>, the other being analogous. We can make the correspondence between the *nondegenerate* pair (ψ<sub>ħ</sub><sup>(0)</sup>, ψ<sub>ħ</sub><sup>(1)</sup>) of low-lying quantum-mechanical wave-functions and the pair (μ<sub>0</sub><sup>+</sup>, μ<sub>0</sub><sup>−</sup>) of *degenerate* classical ground states more transparent by invoking the above notion of a classical limit of states. Indeed, in terms of the corresponding algebraic states ω<sub>ψ<sub>ħ</sub><sup>(0)</sup></sub> and ω<sub>ψ<sub>ħ</sub><sup>(1)</sup></sub>, one has

$$\lim\_{\hbar \to 0} \Psi\_{\hbar}^{(0)} = \lim\_{\hbar \to 0} \Psi\_{\hbar}^{(1)} = \mu\_0^{(0)},\tag{10.15}$$

$$
\mu\_0^{(0)} \equiv \frac{1}{2} (\mu\_0^+ + \mu\_0^-), \tag{10.16}
$$

where μ<sub>0</sub><sup>±</sup> are the pure classical ground states (10.12) of the double-well Hamiltonian. To see this, one may consider numerically computed Husimi functions, as shown in Figure 10.3 (just for ψ<sub>ħ</sub><sup>(0)</sup>, as before). From this, it is clear that the *pure* (algebraic) quantum ground state ψ<sub>ħ</sub><sup>(0)</sup> converges to the *mixed* classical state (10.16).

In contrast, the localized (but now time-dependent) wave-functions

$$
\Psi\_{\hbar}^{\pm} = \frac{\Psi\_{\hbar}^{(0)} \pm \Psi\_{\hbar}^{(1)}}{\sqrt{2}},\tag{10.17}
$$

which of course define pure states as well, converge to *pure* classical states, i.e.,

$$\lim\_{\hbar \to 0} \psi\_{\hbar}^{\pm} = \mu\_0^{\pm}. \tag{10.18}$$

In conclusion, one has SSB in H, but at first sight the underlying theory L seems to forbid it. Yet we will now show that (10.17) - (10.18) will save Earman's Principle.

Fig. 10.1 Double-well potential with ground state ψ<sup>(0)</sup><sub>ħ=0.5</sub> and first excited state ψ<sup>(1)</sup><sub>ħ=0.5</sub>.

Fig. 10.2 Probability densities for ψ<sup>(0)</sup><sub>ħ=0.5</sub> (left) and ψ<sup>(0)</sup><sub>ħ=0.01</sub> (right).

Fig. 10.3 Husimi functions for ψ<sup>(0)</sup><sub>ħ=0.5</sub> (left) and ψ<sup>(0)</sup><sub>ħ=0.01</sub> (right).
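
The qualitative picture of this section (a nondegenerate but doubly peaked ground state, a tiny gap, and localized combinations (10.17)) can be reproduced with a few lines of numerics. The following is a minimal sketch in Python with NumPy, not the method used for the figures: it discretizes the Hamiltonian (10.6) with the potential (10.11) by finite differences. The values ω = λ = 1 (so *a* = 1), ħ = 0.05, the box [−2.5, 2.5] and the grid size are illustrative assumptions.

```python
import numpy as np

def double_well(x, omega=1.0, lam=1.0):
    """Symmetric double-well potential (10.11), with minima at x = +/- omega / sqrt(lam)."""
    a = omega / np.sqrt(lam)
    return 0.25 * lam * (x ** 2 - a ** 2) ** 2

def low_lying_states(hbar, potential, half_width=2.5, n_points=500):
    """Diagonalize -hbar^2 d^2/dx^2 + V on a symmetric grid with Dirichlet boundaries.

    Returns the grid, the eigenvalues in ascending order, and the eigenvectors
    as columns of unit Euclidean norm.
    """
    dx = 2 * half_width / n_points
    x = (np.arange(n_points) - (n_points - 1) / 2) * dx  # exactly symmetric about 0
    kinetic = (hbar / dx) ** 2 * (2 * np.eye(n_points)
                                  - np.eye(n_points, k=1)
                                  - np.eye(n_points, k=-1))
    energies, states = np.linalg.eigh(kinetic + np.diag(potential(x)))
    return x, energies, states

def weight_right(x, state):
    """Probability of finding the particle at x > 0."""
    return float(np.sum(state[x > 0] ** 2))

hbar = 0.05
x, energies, states = low_lying_states(hbar, double_well)
psi0, psi1 = states[:, 0], states[:, 1]

# eigh fixes eigenvectors only up to sign; make both positive near x = +a = 1.
i_a = np.argmin(np.abs(x - 1.0))
psi0 = psi0 * np.sign(psi0[i_a])
psi1 = psi1 * np.sign(psi1[i_a])
psi_plus = (psi0 + psi1) / np.sqrt(2)   # cf. (10.17)
psi_minus = (psi0 - psi1) / np.sqrt(2)

gap = energies[1] - energies[0]
print("gap:", gap, " next level spacing:", energies[2] - energies[1])
print("weight at x > 0: ground state", weight_right(x, psi0),
      " plus", weight_right(x, psi_plus), " minus", weight_right(x, psi_minus))
```

With these choices the ground state should carry about half of its weight on either side of the barrier, whereas the combinations (10.17) sit almost entirely in one well; the gap is expected to be far smaller than the spacing to the next level, in line with the exponential factor in (10.13).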

#### 10.2 Spontaneous symmetry breaking: The flea

Regarding the doubly-peaked ground state ψ<sub>ħ</sub><sup>(0)</sup> of the symmetric double well as the quantum-mechanical counterpart of a hung parliament, the analogue of a small party that decides which coalition is formed is a tiny *asymmetric* perturbation δ*V* of the potential. Indeed, the following spectacular phenomenon in the theory of Schrödinger operators was discovered in 1981 by Jona-Lasinio, Martinelli and Scoppola. In view of the extensive (and very complicated) ensuing mathematical literature, we just take it as our goal to explain the main idea in a heuristic way.

Replace *V* in (10.6) by *V* +δ*V*, where δ*V* (i.e., the "flea") is assumed to:


$$d\_V(\mathbf{y}, \mathbf{z}) = \left| \int\_{\mathbf{y}}^{\mathbf{z}} d\mathbf{x} \sqrt{V(\mathbf{x})} \right|;\tag{10.19}$$

$$d\_V(\mathbf{y}, A) = \inf \{ d\_V(\mathbf{y}, z), z \in A \}. \tag{10.20}$$

Second, we introduce the symbols

$$d\_V' = 2 \cdot \min\{d\_V(-a, \text{supp } \delta V), d\_V(a, \text{supp } \delta V)\};\tag{10.21}$$

$$d\_V^{\prime\prime} = 2 \cdot \max\{d\_V(-a, \text{supp } \delta V), d\_V(a, \text{supp } \delta V)\}. \tag{10.22}$$

The localization assumption on δ*V* is that one of the following conditions holds:

$$d\_V^{\prime} < d\_V < d\_V^{\prime\prime};\tag{10.23}$$

$$d\_V' < d\_V'' < d\_V \,. \tag{10.24}$$

In the first case, the perturbation is typically localized either on the left or on the right edge of the double well, whereas in the second it resides on the middle bump (symmetric perturbations are excluded by 3, as these would satisfy *d*′<sub>V</sub> = *d*″<sub>V</sub>).

Under these assumptions, the ground state wave-function ψ<sub>ħ</sub><sup>(δ)</sup> of the perturbed Hamiltonian (which had two peaks for δ*V* = 0!) localizes as ħ → 0, in a direction which, *given that localization happens*, may be understood from energetic considerations. For example, if δ*V* is positive and is localized to the right, then the relative energy in the left-hand part of the double well is lowered, so that localization will be to the left. See Figures 10.4 - 10.6. Eqs. (10.17) - (10.18) then yield Butterfield's Principle (with 1/ħ playing the role of *N*), so that also Earman's Principle is saved: the essence of the argument is that (at least in the presence of a flea-perturbation) SSB is already foreshadowed in quantum mechanics *for small yet positive* ħ, if only approximately.

Fig. 10.4 Flea perturbation of ground state ψ<sup>(δ)</sup><sub>ħ=0.5</sub> with corresponding Husimi function. For such relatively large values of ħ, little (but some) localization takes place.

Fig. 10.5 Same at *h*¯ = 0.01. For such small values of *h*¯, localization is almost total.

Fig. 10.6 First excited state for *h*¯ = 0.01. Note the opposite localization area.

In more detail, for the perturbed ground state we have (subject to assumptions 1–3):

$$\frac{\psi\_{\hbar}^{(\delta)}(a)}{\psi\_{\hbar}^{(\delta)}(-a)} \sim e^{\mp d\_{V}/\hbar} \ (\pm \delta V > 0, \ \text{supp}(\delta V) \subset \mathbb{R}^{+});\tag{10.25}$$

$$\frac{\psi\_{\hbar}^{(\delta)}(a)}{\psi\_{\hbar}^{(\delta)}(-a)} \sim e^{\pm d\_{V}/\hbar} \ (\pm \delta V > 0, \ \text{supp}(\delta V) \subset \mathbb{R}^{-}),\tag{10.26}$$

with the opposite localization for the perturbed first excited state (so as to remain orthogonal to the ground state). A more precise version of the energetics used above is as follows. The ground state tries to minimize its energy according to the rules:


In any case, these results only depend on the support of δ*V*, but not on its size: this means that the tiniest of perturbations may cause collapse in the classical limit.
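
The localization described above can be observed in a simple finite-difference computation. The following Python/NumPy sketch is an illustration under stated assumptions, not the computation behind Figures 10.4 - 10.6: it takes ω = λ = 1 in (10.11), so *a* = 1, and adds a hypothetical flea in the form of a smooth positive bump supported on [1.2, 1.8], i.e., to the right of the minimum at *a*, for which *d*′<sub>V</sub> < *d*<sub>V</sub> < *d*″<sub>V</sub> as in (10.23). The bump shape, its size 0.02, and the grid are arbitrary choices made for this example.

```python
import numpy as np

def double_well(x, omega=1.0, lam=1.0):
    """Symmetric double-well potential (10.11)."""
    a = omega / np.sqrt(lam)
    return 0.25 * lam * (x ** 2 - a ** 2) ** 2

def flea(x, size=0.02, centre=1.5, half_width=0.3):
    """Hypothetical flea: smooth bump, zero outside [centre - half_width, centre + half_width]."""
    u = (x - centre) / half_width
    bump = np.zeros_like(x)
    inside = np.abs(u) < 1
    bump[inside] = np.exp(-1.0 / (1.0 - u[inside] ** 2))
    return size * bump

def ground_state(hbar, potential, half_width=2.5, n_points=500):
    """Lowest eigenvector of -hbar^2 d^2/dx^2 + V on a symmetric grid (Dirichlet boundaries)."""
    dx = 2 * half_width / n_points
    x = (np.arange(n_points) - (n_points - 1) / 2) * dx
    kinetic = (hbar / dx) ** 2 * (2 * np.eye(n_points)
                                  - np.eye(n_points, k=1)
                                  - np.eye(n_points, k=-1))
    _, states = np.linalg.eigh(kinetic + np.diag(potential(x)))
    return x, states[:, 0]

def weight_left(x, state):
    """Probability of finding the particle at x < 0."""
    return float(np.sum(state[x < 0] ** 2))

weights = {}
for hbar in (0.2, 0.05):
    x, symmetric = ground_state(hbar, double_well)
    _, perturbed = ground_state(hbar, lambda y: double_well(y) + flea(y))
    # Pair: (weight at x < 0 without flea, weight at x < 0 with flea).
    weights[hbar] = (weight_left(x, symmetric), weight_left(x, perturbed))
    print(hbar, weights[hbar])
```

Since the flea is positive and sits on the right, the perturbed ground state should move to the left well, only slightly for the larger value of ħ and almost completely for the smaller one, while the unperturbed ground state keeps weight close to 1/2 on each side.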

Although the collapse of the perturbed ground state for small *h*¯ is a mathematical theorem, it remains enigmatic. Indeed, despite the fact that in quantum theory the localizing effect of the flea is enhanced for small *h*¯, the corresponding classical system has no analogue of it. Trivially, a classical particle residing at one of the two minima of the double well at zero (or small) velocity, i.e., in one of its degenerate ground states, will not even notice the flea; the ground states are unchanged. But even under a stochastic perturbation, which leads to a nonzero probability for the particle to be driven from one ground state to the other in finite time (as some form of classical "tunneling", where in this case the necessary fluctuations come from Brownian motion), the flea plays a negligible role. For example, in the case at hand the standard *Eyring–Kramers formula* for the mean transition time reads

$$\langle \tau \rangle \cong \frac{2\pi}{\sqrt{V''(a)\,|V''(0)|}} e^{V(0)/\varepsilon},\tag{10.27}$$

where ε is the parameter in the Langevin equation *dx*<sub>t</sub> = −∇*V*(*x*<sub>t</sub>)*dt* + √(2ε) *dW*<sub>t</sub>, in which *W*<sub>t</sub> is standard Brownian motion. Clearly, this expression only contains the height of the potential at its maximum and its curvature at its critical points; most perturbations satisfying assumptions 1–3 above do not affect these quantities.
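
To make the last remark concrete, here is a minimal Python/NumPy sketch that evaluates the right-hand side of (10.27) for the double well (10.11), once without and once with a flea. The curvatures are obtained by central differences; since *V*″(0) is negative at the barrier top, its absolute value is used under the square root. The parameter values ω = λ = 1, ε = 0.05 and the bump used as a flea (supported on [1.2, 1.8]) are illustrative assumptions.

```python
import numpy as np

def double_well(x, omega=1.0, lam=1.0):
    """Symmetric double-well potential (10.11)."""
    a = omega / np.sqrt(lam)
    return 0.25 * lam * (x ** 2 - a ** 2) ** 2

def flea(x, size=0.02, centre=1.5, half_width=0.3):
    """Hypothetical flea: smooth bump, zero outside [centre - half_width, centre + half_width]."""
    u = (x - centre) / half_width
    bump = np.zeros_like(x)
    inside = np.abs(u) < 1
    bump[inside] = np.exp(-1.0 / (1.0 - u[inside] ** 2))
    return size * bump

def second_derivative(f, x0, h=1e-3):
    """Central-difference approximation of f''(x0)."""
    f_minus, f_mid, f_plus = f(np.array([x0 - h, x0, x0 + h]))
    return (f_minus - 2 * f_mid + f_plus) / h ** 2

def kramers_time(potential, a, eps):
    """Right-hand side of (10.27), with the barrier top at x = 0 and a minimum at x = a."""
    curvature_min = second_derivative(potential, a)
    curvature_top = second_derivative(potential, 0.0)
    barrier = float(potential(np.array([0.0]))[0])
    return 2 * np.pi / np.sqrt(curvature_min * abs(curvature_top)) * np.exp(barrier / eps)

def perturbed(x):
    """Double well plus flea."""
    return double_well(x) + flea(x)

tau_plain = kramers_time(double_well, 1.0, 0.05)
tau_flea = kramers_time(perturbed, 1.0, 0.05)
print(tau_plain, tau_flea)
```

For (10.11) one has *V*″(±*a*) = 2ω², *V*″(0) = −ω² and *V*(0) = ω⁴/4λ, so the expression reduces to (2π/√2 ω²) exp(ω⁴/4λε). Because this flea vanishes near *x* = 0 and *x* = *a*, both evaluations use identical input data, in contrast with the quantum-mechanical situation.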

The instability of the ground state of the double-well potential under "flea" perturbations as ħ → 0 is easy to understand (at least heuristically) if one truncates the infinite-dimensional Hilbert space *L*<sup>2</sup>(R) to a two-level system. This simplification is accomplished by keeping only the lowest energy states ψ<sub>ħ</sub><sup>(0)</sup> and ψ<sub>ħ</sub><sup>(1)</sup>, in which case the full Hamiltonian (10.6) with (10.11) is reduced to the 2×2 matrix

$$H\_0 = \frac{1}{2} \begin{pmatrix} 0 & -\Delta \\ -\Delta & 0 \end{pmatrix},\tag{10.28}$$

with Δ > 0 given by (10.13). Dropping *h*¯, the eigenstates of *H*<sup>0</sup> are given by

$$
\varphi\_0^{(0)} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \ \varphi\_0^{(1)} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \tag{10.29}
$$

with energies *E*<sub>0</sub> = −½Δ and *E*<sub>1</sub> = ½Δ, respectively; in particular, *E*<sub>1</sub> − *E*<sub>0</sub> = Δ. If

$$
\varphi\_0^{\pm} = \frac{\varphi\_0^{(0)} \pm \varphi\_0^{(1)}}{\sqrt{2}},\tag{10.30}
$$

as in (10.17), then

$$
\varphi\_0^+ = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \ \varphi\_0^- = \begin{pmatrix} 1 \\ 0 \end{pmatrix}. \tag{10.31}
$$

Hence in this approximation ϕ<sub>0</sub><sup>+</sup> and ϕ<sub>0</sub><sup>−</sup> play the role of wave-functions (10.17) localized above the classical minima *x* = +*a* and *x* = −*a*, respectively, with classical limits μ<sub>0</sub><sup>±</sup>. The "flea" is introduced as follows. If its support is in R<sup>+</sup>, we put

$$
\delta\_+ V = \begin{pmatrix} 0 & 0 \\ 0 & \delta \end{pmatrix},\tag{10.32}
$$

where δ ∈ R is a constant. A perturbation with support in R<sup>−</sup> is approximated by

$$
\delta\_- V = \begin{pmatrix} \delta & 0 \\ 0 & 0 \end{pmatrix}. \tag{10.33}
$$

Without loss of generality, take the latter (a change of sign of δ leads to the former). The eigenvalues of *H*<sup>(δ)</sup> = *H*<sub>0</sub> + δ<sub>−</sub>*V* are *E*<sub>0</sub> = *E*<sub>−</sub> and *E*<sub>1</sub> = *E*<sub>+</sub>, with energies

$$E\_{\pm} = \frac{1}{2} (\delta \pm \sqrt{\delta^2 + \Delta^2}),\tag{10.34}$$

and normalized eigenvectors

$$\varphi\_{\delta}^{(0)} = \frac{1}{\sqrt{2}} \left( \delta^2 + \Delta^2 + \delta \sqrt{\delta^2 + \Delta^2} \right)^{-1/2} \begin{pmatrix} \Delta \\ \delta + \sqrt{\delta^2 + \Delta^2} \end{pmatrix}; \tag{10.35}$$

$$\varphi\_{\delta}^{(1)} = \frac{1}{\sqrt{2}} \left( \delta^2 + \Delta^2 - \delta \sqrt{\delta^2 + \Delta^2} \right)^{-1/2} \begin{pmatrix} \Delta \\ \delta - \sqrt{\delta^2 + \Delta^2} \end{pmatrix}. \tag{10.36}$$

Note that lim<sub>δ→0</sub> ϕ<sub>δ</sub><sup>(i)</sup> = ϕ<sub>0</sub><sup>(i)</sup> for *i* = 0,1. Now, if ħ → 0 at fixed δ ≠ 0, then Δ vanishes exponentially by (10.13), so that |δ| ≫ Δ, in which case ϕ<sub>δ</sub><sup>(0)</sup> → ϕ<sub>0</sub><sup>±</sup> for ±δ > 0 (and starting from (10.32) instead of (10.33) would have given the opposite case, i.e., ϕ<sub>δ</sub><sup>(0)</sup> → ϕ<sub>0</sub><sup>∓</sup> for ±δ > 0). Thus the ground state localizes as ħ → 0, which resembles the situation (10.25) - (10.26) for the full double-well.
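
The two-level model is small enough to check directly. The following Python/NumPy sketch (an illustration, with function names chosen here) diagonalizes *H*<sub>0</sub> + δ<sub>−</sub>*V* from (10.28) and (10.33), compares the result with (10.34) and (10.35), and prints the weights of the ground state on the two basis vectors listed in (10.31). The numerical values of δ and Δ are arbitrary; in the double well Δ would be given by (10.13).

```python
import numpy as np

def two_level(delta, gap):
    """Eigenvalues (ascending) and eigenvectors (columns) of H_0 + delta_- V, cf. (10.28), (10.33)."""
    h = 0.5 * np.array([[0.0, -gap], [-gap, 0.0]]) + np.array([[delta, 0.0], [0.0, 0.0]])
    return np.linalg.eigh(h)

def ground_state_formula(delta, gap):
    """Normalized ground state according to (10.35)."""
    root = np.sqrt(delta ** 2 + gap ** 2)
    v = np.array([gap, delta + root])
    return v / np.linalg.norm(v)

def ground_state_weights(delta, gap):
    """Squared components of the ground state: weight on (1, 0) and on (0, 1), cf. (10.31)."""
    _, vectors = two_level(delta, gap)
    return vectors[:, 0] ** 2

# Energies agree with (10.34) for a generic choice of parameters.
delta, gap = 0.3, 0.4
energies, vectors = two_level(delta, gap)
root = np.sqrt(delta ** 2 + gap ** 2)
print(energies, 0.5 * (delta - root), 0.5 * (delta + root))

# Fixed flea strength, shrinking gap (mimicking hbar -> 0 in (10.13)).
for gap in (1.0, 1e-2, 1e-6):
    print(gap, ground_state_weights(0.01, gap), ground_state_weights(-0.01, gap))
```

For δ > 0 the weight should accumulate on the second component as Δ decreases, and for δ < 0 on the first, whereas for δ = 0 both weights equal 1/2 for any Δ > 0.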

In conclusion, in the (practically unavoidable) presence of asymmetric "flea" perturbations, *explicit* (rather than spontaneous) symmetry breaking already takes place for positive ħ, so that Butterfield's Principle holds, and hence also Earman's.

#### 10.3 Spontaneous symmetry breaking in quantum spin systems

Before discussing SSB in quantum spin systems, we return to ground states and KMS states as discussed in the generality of §§9.4–9.6. Starting with the former, it is natural to ask whether ground states are pure, as would be expected on physical grounds; indeed, this question goes to the heart of SSB. Proposition 9.20 implies that ground states (for given dynamics) form a compact convex subset *S*<sub>∞</sub>(*A*) of the total state space *S*(*A*); the notation *S*<sub>∞</sub>(*A*) (rather than e.g. *S*<sub>0</sub>(*A*)) will be motivated shortly by the analogy with equilibrium states. It would be desirable that

$$
\partial\_e S\_{\infty}(A) = S\_{\infty}(A) \cap \partial\_e S(A), \tag{10.37}
$$

in which case extreme ground states are necessarily pure. This will indeed be the case in the simple models we study in this book, but it is provably the case in general only under additional assumptions, such as *weak asymptotic abelianness* of the dynamics, i.e., lim<sub>*t*→∞</sub> ω([α<sub>*t*</sub>(*a*), *b*]) = 0 for all *a*, *b* ∈ *A*. A weaker sufficient condition for (10.37) is that the commutant π<sub>ω</sub>(*A*)′ be commutative (which is the case if ω is pure).

We are now in a position to define SSB, at least in the context of ground states.

Definition 10.3. *Suppose we have a (topological) group G and a (continuous) homomorphism* γ : *G* → Aut(*A*)*, which is a symmetry of the dynamics in that*

$$
\alpha\_t \circ \gamma\_g = \gamma\_g \circ \alpha\_t \quad (g \in G, t \in \mathbb{R}).\tag{10.38}
$$

*The G-symmetry is said to be* spontaneously broken *(at temperature T* = 0*) if*

$$(\partial\_e S\_{\infty}(A))^{G} = \emptyset,\tag{10.39}$$

*and* weakly broken *if* (∂<sub>*e*</sub>*S*<sub>∞</sub>(*A*))<sup>*G*</sup> ≠ ∂<sub>*e*</sub>*S*<sub>∞</sub>(*A*)*, i.e., there is* at least one ω ∈ ∂<sub>*e*</sub>*S*<sub>∞</sub>(*A*) *that fails to be G-invariant (although invariant extreme ground states may exist).*

Here 𝒮<sup>*G*</sup> = {ω ∈ 𝒮 | ω ∘ γ<sub>*g*</sub> = ω ∀*g* ∈ *G*}, defined for any subset 𝒮 ⊂ *S*(*A*), is the set of *G*-invariant states in 𝒮. Assuming (10.37), eq. (10.39) means that there are no *pure G*-invariant ground states. This by no means implies that there are no *G*-invariant ground states at all, quite to the contrary: for compact, or, more generally, amenable groups *G*, one can always construct *G*-invariant ground states by averaging over *G*, exploiting the fact that if *G* is a symmetry of the dynamics, then each affine homeomorphism γ<sub>*g*</sub><sup>∗</sup> of *S*(*A*) (defined by γ<sub>*g*</sub><sup>∗</sup>(ω) = ω ∘ γ<sub>*g*</sub>) maps *S*<sub>∞</sub>(*A*) to itself. Definition 10.3 therefore implies that if SSB occurs, then one has a dichotomy:

• *Pure ground states are not invariant, whilst invariant ground states are not pure*.
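The averaging construction and the resulting dichotomy can be illustrated in a toy setting. The sketch below assumes a classical system with finitely many configurations, so that states are probability vectors, pure states are point masses, and the group acts by permutations of configurations; the helper names are hypothetical and the example is not one of the models treated in the text.

```python
from fractions import Fraction

def pull_back(state, perm):
    """gamma_g^*(omega) for a permutation action: component i is read off at perm[i]."""
    return tuple(state[perm[i]] for i in range(len(state)))

def average_over_group(state, group):
    """Uniform average of the pulled-back states over a finite group of permutations."""
    size = Fraction(len(group))
    return tuple(
        sum(pull_back(state, g)[i] for g in group) / size
        for i in range(len(state))
    )

def is_invariant(state, group):
    """True if the state is fixed by every group element."""
    return all(pull_back(state, g) == tuple(state) for g in group)

def is_pure(state):
    """Point masses are exactly the extreme points of the simplex of probability vectors."""
    return sorted(state) == [0] * (len(state) - 1) + [1]

Z2 = [(0, 1), (1, 0)]             # identity and the flip of two configurations
up = (Fraction(1), Fraction(0))   # point mass on the configuration "up"
mixed = average_over_group(up, Z2)
```

Here `up` is pure but not invariant, whereas `mixed` equals (1/2, 1/2) and is invariant but not pure, mirroring the dichotomy stated above.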

Definition 10.4. *We call a G-symmetry* spontaneously broken *at inverse temperature* β ∈ (0,∞) *if there are no G-invariant extreme* β*-*KMS *states, i.e.,*

$$(\partial\_e S\_{\beta}(A))^{G} = \emptyset,\tag{10.40}$$

*and* weakly broken *if there is* at least one *non-G-invariant extreme* KMS *state.*

By Theorem 9.31 we may replace *extreme* β-KMS states by *primary* β-KMS states, so that, similarly to ground states, SSB at nonzero temperature means that:

• *Primary* KMS *states are not invariant, whilst invariant* KMS *states are not primary.*

For the next result, please recall Definition 9.10 and Theorem 9.11.

Proposition 10.5. *Let A be a quasi-local C\*-algebra of the kind* (8.130) *and suppose the given G-action* γ *commutes not only with time translations* α<sub>*t*</sub> *but also with space translations* τ<sub>*x*</sub>*. If* γ<sub>*g*</sub><sup>∗</sup>ω ≠ ω *for some* ω ∈ ∂<sub>*e*</sub>*S*<sub>β</sub>(*A*) *and g* ∈ *G, then the automorphism* γ<sub>*g*</sub> *cannot be unitarily implemented in the* GNS*-representation* π<sub>ω</sub>*.*

This is true also at β = ∞, i.e., for ground states.

*Proof.* This is an obvious corollary of Proposition 9.13 and Theorems 9.14 and 9.31: if γ<sub>*g*</sub> were implementable by a unitary *u*<sub>*g*</sub>, then *u*<sub>*g*</sub>Ω<sub>ω</sub> ≠ Ω<sub>ω</sub> (not even up to a phase), since γ<sub>*g*</sub><sup>∗</sup>ω ≠ ω. But in that case, since τ<sub>*x*</sub> ∘ γ<sub>*g*</sub> = γ<sub>*g*</sub> ∘ τ<sub>*x*</sub> for each *x* ∈ Z<sup>*d*</sup>, we would have *u*<sub>*x*</sub>*u*<sub>*g*</sub> = *u*<sub>*g*</sub>*u*<sub>*x*</sub> and hence *u*<sub>*x*</sub>(*u*<sub>*g*</sub>Ω<sub>ω</sub>) = *u*<sub>*g*</sub>Ω<sub>ω</sub>. Thus *u*<sub>*g*</sub>Ω<sub>ω</sub> would be another translation-invariant ground state, contradicting Theorem 9.14. □

This result is worth mentioning, since some authors *define* SSB through the conclusion of this proposition, that is, they call a symmetry γ<sub>*g*</sub> (spontaneously) broken by some state ω iff γ<sub>*g*</sub> cannot be unitarily implemented in π<sub>ω</sub>. This definition seems physically dubious, however, because quantum spin systems may have ground states ω that are not *G*-invariant but in which nonetheless all of *G* is unitarily implementable (in such states translation invariance has to be broken, of course). For example, the Ising model in *d* = 1 with ferromagnetic nearest-neighbour interaction and vanishing external magnetic field (where *G* = Z<sub>2</sub>) has an infinite number of such ground states, in which a "domain wall" separates infinitely many "spins up" to the left from infinitely many "spins down" to the right. Although this model has a unique KMS state at any nonzero temperature, such ground states (and perhaps analogous states at β ≠ ∞ in different models, so far understood only heuristically) seem far from pathological and play a major role in modern condensed matter physics. Hence we trust this alternative definition only if the states it singles out also satisfy Definition 10.3 or 10.4, for which Proposition 10.5 gives a sufficient condition: for translation-invariant states and symmetries on quasi-local algebras, our definition of SSB through (10.40) is compatible with the one based on unitary implementability.

This is fortunate, since the physicist's notion of an *order parameter*, through which at least *weak* SSB may be detected, is tailored to translation-invariant states:

Definition 10.6. *Let A be a quasi-local C\*-algebra as in* (8.130)*, with symmetry group G. A* (strong) order parameter *in A is an n-tuple* φ = (φ<sub>1</sub>, ..., φ<sub>*n*</sub>) ∈ *A*<sup>*n*</sup> *for which* ω(φ) = 0 *if (and only if)* ω *is G-invariant, for any* Z<sup>*d*</sup>*-invariant state* ω *on A.*

An order parameter defines an accompanying vector field *x* ↦ φ(*x*) by φ<sub>*i*</sub>(*x*) = τ<sub>*x*</sub>(φ<sub>*i*</sub>). Since ω is translation-invariant, ω(φ) = 0 is equivalent to ω(φ(*x*)) = 0 for all *x*. In the Ising model, with *G* = Z<sub>2</sub>, σ<sub>3</sub>(0) is an order parameter, which can be extended to a strong one φ = (σ<sub>2</sub>(0), σ<sub>3</sub>(0)). In the Heisenberg model, where *G* = *SO*(3), the triple (σ<sub>1</sub>(0), σ<sub>2</sub>(0), σ<sub>3</sub>(0)) provides a strong order parameter.

Theorem 10.7. *Suppose that* φ *is a (strong) order parameter, as in Definition 10.6. Then a G-invariant and translation-invariant* KMS *state* ω ∈ *S*<sub>β</sub>(*A*)<sup>*G*</sup> *(including* β = ∞*, i.e., a ground state) displays weak* SSB*, in the sense that at least one of the components in its extremal decomposition fails to be G-invariant, if (and only if) the associated two-point function exhibits* long-range order*, in that*

$$\lim\_{x \to \infty} \omega\left(\sum\_{i=1}^{n} \phi\_i(0)^\* \phi\_i(x)\right) > 0. \tag{10.41}$$

*Proof.* The "if" part of the theorem is equivalent to the vanishing of the limit in question in the *absence* of SSB. Let (9.132) be the extremal decomposition of ω. If (almost) each extreme state ω′ in this decomposition is invariant, then ω′(φ<sub>*i*</sub>(*x*)) = 0 for all *i* by definition of an order parameter, and similarly ω′(φ<sub>*i*</sub>(*x*)<sup>∗</sup>) = 0, this being the complex conjugate of ω′(φ<sub>*i*</sub>(*x*)). Interchanging lim<sub>*x*→∞</sub> with the integral over ∂*S*<sub>β</sub>(*A*) (which is allowed because μ is a probability measure), and using (9.30) then shows that the left-hand side of (10.41) vanishes.

To avoid difficult measure-theoretic aspects of the extremal decomposition theory, and also for pedagogical purposes, we prove the "only if" part only in the case

$$\omega = \int\_{G} dg \, \omega'\_{g},\tag{10.42}$$

weakly, where ω′ ∈ ∂*S*<sub>β</sub>(*A*) and ω′<sub>*g*</sub> = γ<sub>*g*</sub><sup>∗</sup>ω′. Since the expression

$$\omega'\_{g}\left(\sum\_{i=1}^{n} \phi\_{i}(0)^\* \phi\_{i}(x)\right)$$

is independent of *g* ∈ *G* (by definition of an order parameter), we may replace ω′<sub>*g*</sub> by ω′ in the expression for ω; the term ∫<sub>*G*</sub> *dg* then factors out and is equal to unity. Thus we may replace ω in (10.41) by ω′. Since ω′ is a primary state, we may now use (9.30) once again, so that the left-hand side of (10.41) becomes ∑<sub>*i*=1</sub><sup>*n*</sup> |ω′(φ<sub>*i*</sub>)|<sup>2</sup>. By assumption, ω′ is not *G*-invariant, so that (by definition of a strong order parameter) at least one of the terms |ω′(φ<sub>*i*</sub>)| is nonzero. □
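The mechanism of this proof can be made concrete in a toy computation. The sketch below assumes an Ising-type situation with φ = σ<sub>3</sub> and extreme states that are product states with single-site magnetization ±*m*; the value of *m* and the function names are illustrative assumptions. For product states the two-point function factorizes exactly, which plays the role that (9.30) plays for primary states.

```python
def one_point(mixture):
    """omega(sigma_3(x)) for a convex mixture [(weight, magnetization), ...] of product states."""
    return sum(w * m for w, m in mixture)

def two_point(mixture, x):
    """omega(sigma_3(0) sigma_3(x)) for the same mixture.

    For x != 0 each product state factorizes: omega'(s(0) s(x)) = m * m.
    For x == 0 one has sigma_3(0)**2 = 1, so the value is the total weight 1.
    """
    if x == 0:
        return sum(w for w, _ in mixture)
    return sum(w * m * m for w, m in mixture)

m = 0.8
broken = [(0.5, m), (0.5, -m)]   # invariant mixture of two non-invariant extreme states
symmetric = [(1.0, 0.0)]         # a single invariant product state
```

For `broken` the order parameter averages to zero while the two-point function stays at *m*² for arbitrarily large *x* (long-range order); for `symmetric` both vanish.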

If *G* is compact, for any C\*-algebra *A*, invariant KMS states (including ground states) can always be constructed via (9.133), provided, of course, KMS states (or ground states) exist in the first place. Fortunately, existence can be shown in the following way. Let *A* be a quasi-local C\*-algebra à la (8.130), in which:


In that case, by Corollary 9.27 each C\*-algebra *A*<sub>Λ</sub> has a unique β-KMS state ω<sub>Λ</sub><sup>β</sup>, given by the local Gibbs state (9.96). However, if Λ<sup>(1)</sup> ⊂ Λ<sup>(2)</sup>, then the restriction of the β-KMS state ω<sub>Λ<sup>(2)</sup></sub><sup>β</sup> to *A*<sub>Λ<sup>(1)</sup></sub> ⊂ *A*<sub>Λ<sup>(2)</sup></sub> is not given, as one might naively expect, by the β-KMS state ω<sub>Λ<sup>(1)</sup></sub><sup>β</sup>, because the former involves boundary terms.

Fortunately, this complication may be overcome, since at least *for models with short-range forces* (cf. Theorem 9.15) one may put

$$\omega\_G^\beta(a) = \lim\_{N \to \infty} \omega\_{\Lambda\_N}^\beta(a),\tag{10.43}$$

where Λ<sub>*N*</sub> is defined in (8.153). This limit exists for *a* ∈ ∪<sub>Λ</sub> *A*<sub>Λ</sub>, from which ω<sub>*G*</sub><sup>β</sup> extends by continuity to all of *A*, on which it is a β-KMS state (cf. Theorem 10.10).

Alternatively, by the Hahn–Banach Theorem (in the form of Corollary B.41) combined with Lemma C.4 (which guarantees that any Hahn–Banach extension of a state remains a state), each local Gibbs state ω<sub>Λ</sub><sup>β</sup> on *A*<sub>Λ</sub> ⊂ *A* extends, in a nonunique way, to a state ω̂<sub>Λ</sub><sup>β</sup> on *A*. This gives a net of states (ω̂<sub>Λ</sub><sup>β</sup>) on *A* indexed by the finite subsets Λ of Z<sup>*d*</sup>; one may also work with sequences (ω̂<sub>Λ<sub>*N*</sub></sub><sup>β</sup>). Since *A* has a unit, its state space *S*(*A*) is a *compact* convex set, so the above net (or sequence) has at least one limit point, or, equivalently, has at least one convergent subnet (or subsequence), which (despite its potential lack of uniqueness in two respects, i.e. the choice of the extensions ω̂<sub>Λ</sub><sup>β</sup> and the choice of a limit point) one might write as

$$
\hat{\omega}^{\beta} = \lim\_{\Lambda \nearrow \mathbb{Z}^d} \hat{\omega}^{\beta}\_{\Lambda}. \tag{10.44}
$$

Without proof, we quote the relevant technical result (assuming 1–2 above):

Proposition 10.8. *Each limit state* ω̂<sup>β</sup> *is a* β*-*KMS *state (i.e. for the dynamics* α*).*

Anticipating the existence of SSB in models, one should now feel a little uneasy:


#### 10.4 Spontaneous symmetry breaking for short-range forces

We continue our discussion of SSB in quantum spin systems, especially of the construction of global KMS states in the previous section, see (10.44) and preceding text. Recall that each finite system *A*<sub>Λ</sub> has a unique β-KMS state ω<sub>Λ</sub><sup>β</sup>, namely the local Gibbs state (9.96), but that these states are incompatible for different Λ's, in that, if Λ<sup>(1)</sup> ⊂ Λ<sup>(2)</sup>, then the restriction of ω<sub>Λ<sup>(2)</sup></sub><sup>β</sup> to *A*<sub>Λ<sup>(1)</sup></sub> ⊂ *A*<sub>Λ<sup>(2)</sup></sub> is not given by ω<sub>Λ<sup>(1)</sup></sub><sup>β</sup> because of boundary terms. To correct for this, one introduces the *surface energy*

$$b\_{\Lambda^{(1)},\Lambda^{(2)}} = \sum\_{X \subseteq \Lambda^{(2)} : X \cap \Lambda^{(1)} \neq \emptyset, X \cap (\Lambda^{(1)})^c \neq \emptyset} \Phi(X),\tag{10.45}$$

with ensuing *interaction energy*

$$b\_{\Lambda} = \lim\_{\Lambda^{(2)} \nearrow \mathbb{Z}^d} b\_{\Lambda, \Lambda^{(2)}} = \sum\_{X \cap \Lambda \ne \emptyset, X \cap \Lambda^c \ne \emptyset} \Phi(X), \tag{10.46}$$

provided this limit exists (which it does for short-range forces). Now perturb ω<sub>Λ<sup>(2)</sup></sub><sup>β</sup> by replacing *h*<sub>Λ<sup>(2)</sup></sub> in (9.96) - (9.98) (with Λ replaced by Λ<sup>(2)</sup>) by *h*<sub>Λ<sup>(2)</sup></sub> − *b*<sub>Λ<sup>(1)</sup>,Λ<sup>(2)</sup></sub>. Denoting this modification of ω<sub>Λ<sup>(2)</sup></sub><sup>β</sup> by ω<sub>Λ<sup>(1)</sup>,Λ<sup>(2)</sup></sub><sup>β</sup>, we obtain (10.47), which implies (10.48):

$$\omega\_{\Lambda^{(1)},\Lambda^{(2)}}^{\beta} = \omega\_{\Lambda^{(1)}}^{\beta} \otimes \omega\_{\Lambda^{(2)} \setminus \Lambda^{(1)}}^{\beta};\tag{10.47}$$

$$(\omega\_{\Lambda^{(1)},\Lambda^{(2)}}^{\beta})\_{|A\_{\Lambda^{(1)}}} = \omega\_{\Lambda^{(1)}}^{\beta}.\tag{10.48}$$
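To see which terms enter the surface energy (10.45), here is a minimal sketch assuming a nearest-neighbour interaction on Z, so that Φ(*X*) is nonzero only for *X* = {*x*, *x* + 1}; the function name `surface_bonds` and the sample regions are illustrative.

```python
def surface_bonds(inner, outer):
    """Bonds X = {x, x+1} inside `outer` that meet both `inner` and its complement.

    These are the sets X contributing to the surface energy (10.45) when the
    interaction is nearest-neighbour on the integers.
    """
    inner, outer = set(inner), set(outer)
    if not inner <= outer:
        raise ValueError("inner region must be contained in outer region")
    bonds = []
    for x in sorted(outer):
        if x + 1 not in outer:
            continue  # the bond would leave the outer region
        touches_inner = x in inner or x + 1 in inner
        touches_complement = x not in inner or x + 1 not in inner
        if touches_inner and touches_complement:
            bonds.append((x, x + 1))
    return bonds
```

For example, `surface_bonds(range(-1, 1), range(-3, 3))` returns the two bonds straddling the ends of the inner region, and enlarging the outer region does not change the result. For finite-range interactions this stabilization is what makes the limit (10.46) exist.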

If (10.46) exists, we may likewise perturb any α<sub>*t*</sub>-invariant state ω on *A* to ω̃<sub>Λ</sub>, i.e.,

$$\tilde{\omega}\_{\Lambda}(a) = \frac{\langle e^{-\beta(h\_{\omega} - \pi\_{\omega}(b\_{\Lambda}))/2} \Omega\_{\omega}, \pi\_{\omega}(a) e^{-\beta(h\_{\omega} - \pi\_{\omega}(b\_{\Lambda}))/2} \Omega\_{\omega} \rangle}{\| e^{-\beta(h\_{\omega} - \pi\_{\omega}(b\_{\Lambda}))/2} \Omega\_{\omega} \|^{2}},\tag{10.49}$$

where Λ ⊂ Z<sup>*d*</sup> is finite, *h*<sub>ω</sub> is defined as in (9.51) - (9.52), and Ω<sub>ω</sub> is in the domain of the unbounded operator exp(−β(*h*<sub>ω</sub> − π<sub>ω</sub>(*b*<sub>Λ</sub>))/2); the reason is that π<sub>ω</sub>(*b*<sub>Λ</sub>) is bounded, whereas exp(−β*h*<sub>ω</sub>/2)Ω<sub>ω</sub> = Ω<sub>ω</sub> (since *h*<sub>ω</sub>Ω<sub>ω</sub> = 0). For example,

$$(\tilde{\omega}\_{\Lambda^{(2)}}^{\beta})\_{\Lambda^{(1)}} = \omega\_{\Lambda^{(1)},\Lambda^{(2)}}^{\beta},\tag{10.50}$$

where ω = ω<sub>Λ<sup>(2)</sup></sub><sup>β</sup> is a Gibbs state on *A* = *A*<sub>Λ<sup>(2)</sup></sub>, as in Theorem 9.24 (with Λ replaced by Λ<sup>(2)</sup>). Indeed, using (9.114) - (9.117) and the relation *h*<sub>ω</sub> = *h*<sub>Λ<sup>(2)</sup></sub> − *Jh*<sub>Λ<sup>(2)</sup></sub>*J*, where the operator *J* is defined in (9.124), we compute the numerator in (10.49) as

$$\begin{split} &\operatorname{Tr}\left(\left(e^{-\beta(h\_{\Lambda^{(2)}}-Jh\_{\Lambda^{(2)}}J-b\_{\Lambda})/2}e^{-\beta h\_{\Lambda^{(2)}}/2}\right)^{\*}a\,e^{-\beta(h\_{\Lambda^{(2)}}-Jh\_{\Lambda^{(2)}}J-b\_{\Lambda})/2}e^{-\beta h\_{\Lambda^{(2)}}/2}\right) \\ &=\operatorname{Tr}\left(e^{-\beta(h\_{\Lambda^{(2)}}-b\_{\Lambda})}a\right), \end{split} \tag{10.51}$$

since *Jh*<sub>Λ<sup>(2)</sup></sub>*J* commutes with *h*<sub>Λ<sup>(2)</sup></sub> − *b*<sub>Λ</sub>. This subsequently gives

$$e^{-\beta(h\_{\Lambda^{(2)}} - Jh\_{\Lambda^{(2)}}J - b\_{\Lambda})/2} = e^{-\beta(h\_{\Lambda^{(2)}} - b\_{\Lambda})/2} e^{\beta Jh\_{\Lambda^{(2)}}J/2};$$

$$e^{\beta Jh\_{\Lambda^{(2)}}J/2} e^{-\beta h\_{\Lambda^{(2)}}/2} = e^{-\beta h\_{\Lambda^{(2)}}/2} e^{\beta h\_{\Lambda^{(2)}}/2} = 1\_H. \tag{10.52}$$

Likewise, the denominator in (10.49) equals Tr(exp(−β(*h*<sub>Λ<sup>(2)</sup></sub> − *b*<sub>Λ</sub>))).

Eqs. (10.50) and (10.48) suggest that if ω = ω<sup>β</sup> is a β-KMS state, then although ω<sup>β</sup> itself does not localize to a Gibbs state ω<sub>Λ</sub><sup>β</sup> on *A*<sub>Λ</sub>, its perturbed version ω̃<sub>Λ</sub><sup>β</sup> does. Under assumptions 1–2 stated in §10.3, i.e., in the situation of Theorem 9.15 with dim(*H*) < ∞, this motivates the following quantum analogue of the DLR approach to classical equilibrium states, i.e., of Definition 9.23:

Definition 10.9. *For fixed inverse temperature* β ∈ R\{0} *and fixed interaction* Φ*, a* Gibbs state ω<sup>β</sup> *on a quasi-local algebra A with dynamics given by some potential* Φ *is an* α<sub>*t*</sub>*-invariant state such that for each finite region* Λ ⊂ Z<sup>*d*</sup> *one has*

$$\tilde{\omega}\_{\Lambda}^{\beta} = \omega\_{\Lambda}^{\beta} \otimes \omega\_{\Lambda^c}^{\prime},\tag{10.53}$$

*where* ω<sub>Λ</sub><sup>β</sup> *is the local Gibbs state* (9.96) *on A*<sub>Λ</sub> *and* ω′<sub>Λ<sup>*c*</sup></sub> *is some state on A*<sub>Λ<sup>*c*</sup></sub>*.*

Theorem 10.10. *Under assumptions 1–2 in* §*10.3, and if in addition the subspace D* = ∪<sub>Λ</sub> *A*<sub>Λ</sub> ⊂ *A is a core for the derivation* (9.54) *(i.e., the closure of* δ *defined on D is* δ *as defined in Proposition 9.19), then Gibbs states coincide with* KMS *states.*

The proof is rather technical and so we omit it. It follows that if ω<sup>β</sup> ∈ *S*<sub>β</sub>(*A*), then

$$(\tilde{\omega}\_{\Lambda}^{\beta})\_{|A\_{\Lambda}} = \omega\_{\Lambda}^{\beta}. \tag{10.54}$$

Even so, we still need to define in precisely which sense the net ((ω̃<sub>Λ</sub><sup>β</sup>)<sub>|*A*<sub>Λ</sub></sub>)<sub>Λ</sub> converges to ω<sup>β</sup> (or when perhaps even the net (ω<sub>Λ</sub><sup>β</sup>) converges to ω<sup>β</sup>); for simplicity we take Λ = Λ<sub>*N*</sub> as in (8.153), and just consider sequences indexed by *N* (rather than nets). To this end, let (ω<sub>1/*N*</sub>)<sub>*N*</sub> be a sequence of states with ω<sub>1/*N*</sub> ∈ *S*(*A*<sub>Λ<sub>*N*</sub></sub>). As in Definition 8.24, given some ω<sub>0</sub> ∈ *S*(*A*) (if it exists), we say that

$$\lim\_{N \to \infty} \omega\_{1/N} = \omega\_{0} \tag{10.55}$$

iff for any sequence (*a*<sub>1/*N*</sub>)<sub>*N*</sub> in *A* with *a*<sub>1/*N*</sub> ∈ *A*<sub>Λ<sub>*N*</sub></sub> ⊂ *A* that converges to *a* ∈ *A* one has

$$\lim\_{N \to \infty} \omega\_{1/N}(a\_{1/N}) = \omega\_0(a). \tag{10.56}$$

For example, if we take ω<sub>0</sub> ∈ *S*(*A*) and define ω<sub>1/*N*</sub> as the restriction of ω<sub>0</sub> to *A*<sub>Λ<sub>*N*</sub></sub>, then (10.55) holds by continuity of ω<sub>0</sub> (as ‖ω<sub>0</sub>‖ = 1), which implies that lim<sub>*N*→∞</sub> ω<sub>0</sub>(*a*<sub>1/*N*</sub>) = ω<sub>0</sub>(*a*).

It follows from the comments preceding Definition 8.24 that the above notion (10.55) - (10.56) of convergence is the same as the one given by (8.164), so that it is similar to the convergence of states we defined for the other two classes of examples listed earlier, viz. classical mechanics (cf. §10.1) and thermodynamics.

We denote the restriction of some global KMS state ω<sup>β</sup> (defined on *A*) to *A*<sub>Λ<sub>*N*</sub></sub> ⊂ *A* by ω<sub>1/*N*</sub><sup>β</sup>, whereas as usual we write ω<sub>Λ<sub>*N*</sub></sub><sup>β</sup> for the unique local Gibbs state on *A*<sub>Λ<sub>*N*</sub></sub>. Keeping Definition 8.24 and Proposition 8.25 in mind, the situation is as follows:


The first claim follows from the argument given after (10.55). The second is the contrapositive to (10.54) and has been explained in §10.3: although the states ω<sub>1/*N*</sub><sup>β</sup> and ω<sub>Λ<sub>*N*</sub></sub><sup>β</sup> are both of local Gibbs type, their Hamiltonians differ from *h*<sub>Λ<sub>*N*</sub></sub> by the boundary term *b*<sub>Λ</sub>. The third claim cannot be proved in general, but in models with short-range forces it holds in both forms (10.43) and (10.55) - (10.56). In such models the *G*-symmetry is local, i.e., *G* acts on each *A*<sub>Λ</sub> through unitaries

$$
u\_{g}^{(\Lambda)} = \otimes\_{x \in \Lambda} u\_{g}(x);\tag{10.57}
$$

$$\gamma\_{g}^{(\Lambda)}(a\_{\Lambda}) = u\_{g}^{(\Lambda)} a\_{\Lambda} (u\_{g}^{(\Lambda)})^\* \ (a\_{\Lambda} \in A\_{\Lambda}, g \in G), \tag{10.58}$$

where *u*<sub>*g*</sub>(*x*) ∈ *B*(*H*<sub>*x*</sub>), leaving each local Hamiltonian *h*<sub>Λ</sub> and hence each local Gibbs state ω<sub>Λ<sub>*N*</sub></sub><sup>β</sup> invariant. If *a* ∈ *A* is local, i.e., *a* ∈ ∪<sub>Λ</sub> *A*<sub>Λ</sub>, then

$$\gamma\_{g}(a) = \lim\_{N \to \infty} \gamma\_{g}^{(\Lambda\_N)}(a\_N), \tag{10.59}$$

followed by continuous extension to *a* ∈ *A*, so that, assuming (10.55),

$$\omega\_0(\gamma\_{g}(a)) = \lim\_{N \to \infty} \omega\_{1/N}(\gamma\_{g}(a\_N)) = \lim\_{N \to \infty} \omega\_{1/N}(\gamma\_{g}^{(\Lambda\_N)}(a\_N)) = \lim\_{N \to \infty} \omega\_{1/N}(a\_N) = \omega\_0(a),$$

since ω<sub>1/*N*</sub> ∘ γ<sub>*g*</sub><sup>(Λ<sub>*N*</sub>)</sup> = ω<sub>1/*N*</sub> by assumption. Thus the global Gibbs state ω<sub>*G*</sub><sup>β</sup> inherits the *G*-invariance of its local approximants ω<sub>Λ<sub>*N*</sub></sub><sup>β</sup>. In case of SSB, the restrictions ω<sub>1/*N*</sub><sup>β</sup> of some non-invariant extreme KMS state ω<sup>β</sup> determine ω<sup>β</sup>, so that in principle SSB is detectable through the local states ω<sub>1/*N*</sub><sup>β</sup>. It would be question-begging to construct the latter from the global states ω<sup>β</sup>, though, so Butterfield's Principle (and hence in its wake Earman's Principle) holds only if we can show how and why the states of sufficiently large yet *finite* systems *A*<sub>Λ<sub>*N*</sub></sub> tend to ω<sub>1/*N*</sub><sup>β</sup> rather than to ω<sub>Λ<sub>*N*</sub></sub><sup>β</sup>.

Unfortunately, showing any of this in specific models at finite (inverse) temperature 0 < β < ∞ is pretty complicated. For example, in the quantum Ising model (9.42) in *d* = 1, KMS states are unique for any *B*, so that for SSB one must go to *d* ≥ 2. In that case, it can be shown from Theorem 10.7 that for *B* = 0, below some critical temperature (i.e. for β > β<sub>*c*</sub>) the Z<sub>2</sub> symmetry defined in (10.68) below is broken, but this takes considerable effort and is beyond the scope of this book.

#### 10.5 Ground state(s) of the quantum Ising chain

It is much simpler to put β = ∞ and hence turn to the *ground state(s)* of the quantum Ising model (9.42) in *d* = 1, which is manageable. The interesting case is *B* > 0, with *J* = 1 and free boundary conditions, so that for Λ = Λ<sub>*N*</sub> (with *N* even), we have

$$h\_N = -\sum\_{x \in \Lambda\_N} \left( \sigma\_3(x) \sigma\_3(x + 1) + B \sigma\_1(x) \right);\tag{10.60}$$

$$\Lambda\_N = \{-\tfrac{1}{2}N, \dots, \tfrac{1}{2}N - 1\};\tag{10.61}$$

$$H\_{\Lambda\_N} = H\_N = \otimes\_{x \in \Lambda\_N} H\_{x};\tag{10.62}$$

$$H\_{x} = \mathbb{C}^2 \ (x \in \Lambda\_{N}),\tag{10.63}$$

where the operator σ<sub>*i*</sub>(*x*) acts as the Pauli matrix σ<sub>*i*</sub> on *H*<sub>*x*</sub> and as the unit matrix 1<sub>2</sub> elsewhere. This model describes a chain of *N* immobile spin-½ particles with ferromagnetic coupling in a transverse magnetic field (it is a special case of the so-called *XY*-model, to which similar conclusions apply). The local Hamiltonians *h*<sub>*N*</sub> define time evolution on the local algebras

$$A\_{\Lambda\_N} \equiv A\_N = B(H\_N) \tag{10.64}$$

by (9.40), i.e.,

$$\alpha\_{t}^{(N)}(a) = e^{ith\_N} a e^{-ith\_N} \ (a \in A\_N),\tag{10.65}$$

which by Theorem 9.15 defines a time evolution on the quasi-local C\*-algebra

$$A = \overline{\bigcup\_{N \in \mathbb{N}} A\_N}^{\parallel \cdot \parallel} = \bigotimes\_{\mathbf{x} \in \mathbb{Z}} B(H\_{\mathbf{x}}),\tag{10.66}$$

namely by regarding the unitaries exp(*ith*<sub>*N*</sub>) ∈ *A*<sub>*N*</sub> ⊂ *A* as elements of *A* and putting

$$\alpha\_t(a) = \lim\_{N \to \infty} e^{ith\_N} a e^{-ith\_N} \ (a \in A), \tag{10.67}$$

which exists (although the sequence (exp(*ith*<sub>*N*</sub>))<sub>*N*</sub> in *A* does not converge in *A*).

For any *B* ∈ R, the quantum Ising chain has a Z<sub>2</sub>-symmetry given by a 180 degree rotation around the *x*-axis, locally implemented by the unitary operator *u*(*x*) = σ<sub>1</sub>(*x*), which at each *x* ∈ Λ<sub>*N*</sub> yields (σ<sub>1</sub>, σ<sub>2</sub>, σ<sub>3</sub>) ↦ (σ<sub>1</sub>, −σ<sub>2</sub>, −σ<sub>3</sub>), since σ<sub>*i*</sub>σ<sub>*j*</sub>σ<sub>*i*</sub><sup>∗</sup> = −σ<sub>*j*</sub> for *i* ≠ *j*. Thus *u*(*x*) sends each σ<sub>3</sub>(*x*) to −σ<sub>3</sub>(*x*) but maps each σ<sub>1</sub>(*x*) to itself. As in (10.57), this symmetry is implemented by the unitary operator

$$u^{(N)} = \otimes\_{x \in \Lambda\_N} \sigma\_1(x) \tag{10.68}$$

on *H*<sub>*N*</sub>, which satisfies [*h*<sub>*N*</sub>, *u*<sup>(*N*)</sup>] = 0, or, equivalently,

$$u^{(N)} h\_N (u^{(N)})^\* = h\_N. \tag{10.69}$$
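The commutation of the global spin flip with the local Hamiltonian can be verified directly on a short chain. The sketch below is a pure-Python check under two assumptions: sites are labelled 0, ..., *n* − 1 rather than −*N*/2, ..., *N*/2 − 1, and "free boundary conditions" is read as omitting the bond that would leave the chain. All function names are illustrative.

```python
from functools import reduce

I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]    # sigma_1
S3 = [[1, 0], [0, -1]]   # sigma_3

def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def matmul(a, b):
    """Matrix product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def site_op(op, site, n):
    """`op` acting on site `site` (0-based) of an n-site chain, identity elsewhere."""
    return reduce(kron, [op if k == site else I2 for k in range(n)])

def ising_hamiltonian(n, field):
    """h_N of eq. (10.60) with free boundary conditions (no bond leaving the chain)."""
    dim = 2 ** n
    h = [[0.0] * dim for _ in range(dim)]
    for x in range(n):
        if x + 1 < n:
            bond = matmul(site_op(S3, x, n), site_op(S3, x + 1, n))
            h = add(h, scale(-1.0, bond))
        h = add(h, scale(-field, site_op(S1, x, n)))
    return h

def global_flip(n):
    """u^(N) of eq. (10.68): sigma_1 on every site."""
    return reduce(kron, [S1] * n)

def max_abs_commutator(n, field):
    """Largest matrix entry (in absolute value) of [h_N, u^(N)]."""
    h, u = ising_hamiltonian(n, field), global_flip(n)
    hu, uh = matmul(h, u), matmul(u, h)
    return max(abs(x - y) for rh, ru in zip(hu, uh) for x, y in zip(rh, ru))
```

Calling `max_abs_commutator(4, 0.7)` should return zero up to rounding, and `matmul(matmul(S1, S3), S1)` reproduces −σ<sub>3</sub>, the single-site statement used above.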

The ensuing Z<sub>2</sub>-symmetry is given by the automorphism γ<sup>(*N*)</sup> of *A*<sub>*N*</sub> defined by

$$\gamma^{(N)}(a) = u^{(N)} a (u^{(N)})^\* \ (a \in A\_N), \tag{10.70}$$

which induces a global automorphism γ ∈ Aut(*A*) as in (10.59), i.e.,

$$\gamma(a) = \lim\_{N \to \infty} u^{(N)} a(u^{(N)})^\* \ (a \in A), \tag{10.71}$$

which limit once again exists despite the fact that the sequence *u*<sup>(*N*)</sup> has no limit in *A*. Thus Z<sub>2</sub>-invariance of the model follows from the local property

$$
\alpha\_{t}^{(N)} \circ \gamma^{(N)} = \gamma^{(N)} \circ \alpha\_{t}^{(N)},\tag{10.72}
$$

which in the limit *N* → ∞ gives

$$
\alpha\_t \circ \gamma = \gamma \circ \alpha\_t \quad (t \in \mathbb{R}).\tag{10.73}
$$

Since γ<sup>2</sup> = id<sub>*A*</sub>, we have an action of the group Z<sub>2</sub> = {−1, 1} on *A*, where the nontrivial element (i.e., *g* = −1) is sent to γ. By (10.72) this group acts on the set *S*<sub>∞</sub>(*A*<sub>*N*</sub>) of ground states of *A*<sub>*N*</sub> relative to the dynamics α<sup>(*N*)</sup>, and by (10.73) the same is true for the set *S*<sub>∞</sub>(*A*) of ground states of the corresponding infinite system for α (and analogously for β-KMS states). These sets may be described as follows.


$$\omega\_0^{(0)} = \tfrac{1}{2}(\omega\_0^+ + \omega\_0^-) \tag{10.74}$$

*form a continuous field of states on the continuous bundle A*<sup>(*q*)</sup>*; in particular,*

$$\lim\_{N \to \infty} \omega\_{1/N}^{(0)} = \omega\_0^{(0)}.\tag{10.75}$$

The two ground states in no. 1 and no. 3 are tensor products of |↑⟩ and |↓⟩, respectively (where σ<sub>3</sub>|↑⟩ = |↑⟩ and σ<sub>3</sub>|↓⟩ = −|↓⟩), so that σ<sub>3</sub>(0) is an order parameter in the sense of Definition 10.6. In no. 4, on the other hand, each spin aligns with the magnetic field in the *x*-direction, so that the ground state is an infinite tensor product of states |→⟩, where σ<sub>1</sub>|→⟩ = |→⟩, and this time σ<sub>1</sub>(0) is an order parameter.

Case no. 2 becomes more transparent if we realize the Hilbert space *H*<sub>*N*</sub> as ℓ<sup>2</sup>(*S*<sub>*N*</sub>), where *S*<sub>*N*</sub> is the set of all spin configurations *s* on *N* sites, that is,

$$s: \{ -\frac{1}{2}N, -\frac{1}{2}N+1, \dots, \frac{1}{2}N-1 \} \to \{ -1, 1 \}.$$

In terms of the eigenvectors |1⟩ ≡ |↑⟩ and |−1⟩ ≡ |↓⟩ of σ<sub>3</sub>, and the orthonormal basis (δ<sub>*s*</sub>)<sub>*s*∈*S*<sub>*N*</sub></sub> of ℓ<sup>2</sup>(*S*<sub>*N*</sub>) (where δ<sub>*s*</sub>(*t*) = δ<sub>*st*</sub>), a suitable unitary equivalence

$$v\_N: \ell^2(S\_N) \to H\_N \tag{10.76}$$

is given by linear extension of

$$v\_{N}\delta\_{s} = |s(-\tfrac{1}{2}N)\cdots s(\tfrac{1}{2}N-1)\rangle,\ s \in S\_{N}.\tag{10.77}$$

For example, the state |1···1⟩ corresponds to δ<sub>*s*<sub>↑</sub></sub>, where *s*<sub>↑</sub>(*x*) = 1 for all *x*, and analogously *s*<sub>↓</sub>(*x*) = −1 for the state |−1···−1⟩. Using ℓ<sup>2</sup>(*S*<sub>*N*</sub>), we may talk of localization of states in spin configuration space (similar to localization of wavefunctions in *L*<sup>2</sup>(R<sup>*n*</sup>)), in the sense that some ψ ∈ ℓ<sup>2</sup>(*S*<sub>*N*</sub>) may be peaked on just a few spin configurations. Provided 0 < *B* < 1 this is indeed the case for the unique ground state in case no. 2, which is similar to the ground state of the double-well potential discussed in §§10.1–10.2, replacing R by *S*<sub>*N*</sub> (and ħ > 0 by 1/*N*).
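The identification of spin configurations with basis vectors can be sketched as follows. The ordering convention in `basis_index` (spin up as basis vector 0, leftmost site most significant) is an assumption made here for illustration, as are the function names; the text fixes only the map (10.77) itself.

```python
from itertools import product

def spin_configurations(n):
    """All maps s from the n sites to {+1, -1}, as tuples listed in site order."""
    return list(product((1, -1), repeat=n))

def basis_index(s):
    """Position of v_N(delta_s) in the tensor-product basis of n copies of C^2.

    Convention assumed here: |1> = |up> is basis vector 0 and |-1> = |down> is
    basis vector 1 at each site, with the leftmost site most significant.
    """
    index = 0
    for value in s:
        index = 2 * index + (0 if value == 1 else 1)
    return index

def weights(psi):
    """Probabilities |psi(s)|^2 of a vector in l^2(S_N), given as a dict s -> amplitude."""
    total = sum(abs(a) ** 2 for a in psi.values())
    return {s: abs(a) ** 2 / total for s, a in psi.items()}

n = 4
s_up, s_down = (1,) * n, (-1,) * n
peaked = weights({s_up: 1.0, s_down: 1.0})
```

The vector `peaked` is supported on only two of the 2<sup>4</sup> = 16 configurations, which is the kind of localization in spin configuration space referred to above.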

Theorem 10.11 and related results used below, such as eq. (10.82), follow from the exact solution of the model for both *N* < ∞ and *N* = ∞, to be discussed in §§10.6–10.7. This solution is rather involved, but a rough picture of the various ground states may already be obtained from a classical approximation in the spirit of §8.1. This approximation assumes that the spin-1/2 operators ½σ<sub>*i*</sub> are replaced by their counterparts for spin *n* · ½, upon which one takes the limit *n* → ∞. In this limit, the spin operators are turned into the corresponding coordinate functions on the coadjoint orbit 𝒪<sub>1/2</sub> ⊂ R<sup>3</sup> for *SU*(2), which is the two-sphere *S*<sup>2</sup><sub>1/2</sub> with radius *r* = 1/2. In principle, this should be done for each of the *N* spins separately, yielding a classical Hamiltonian *h*<sub>*c*</sub> that is a function on the *N*-fold cartesian product of *S*<sup>2</sup><sub>1/2</sub> with itself. However, if we *a priori* assume translation invariance of the classical ground state, only one such copy remains. Using spherical coordinates

$$\left( x = \frac{1}{2} \sin \theta \cos \phi, \ y = \frac{1}{2} \sin \theta \sin \phi, \ z = \frac{1}{2} \cos \theta \right), \tag{10.78}$$

the ensuing trial Hamiltonian becomes just a function on 𝒪<sub>1/2</sub>, given by

$$h(\theta, \phi) \approx -(\frac{1}{2}\cos^2\theta + B\sin\theta\cos\phi). \tag{10.79}$$

Minimizing gives cosφ = 1 and hence *y* = 0 for any *B*, upon which

$$h(\theta) \approx -\left(\frac{1}{2}\cos^2\theta + B\sin\theta\right) \tag{10.80}$$

yields the phase portrait of Theorem 10.11 for *N* = ∞, as follows. For 0 ≤ *B* < 1, the global minimum is reached at the two different solutions θ<sub>±</sub> of sin θ<sub>±</sub> = *B* (the stationarity condition of (10.80) reads cos θ (sin θ − *B*) = 0), with ensuing spin vectors

$$\mathbf{x}\_{\pm}(B) = (\tfrac{1}{2}B, 0, \pm\tfrac{1}{2}\sqrt{1 - B^2}),\tag{10.81}$$

starting at **x**<sub>±</sub>(0) = (0, 0, ±½) and merging at *B* = 1 to **x**<sub>+</sub>(1) = **x**<sub>−</sub>(1) = (½, 0, 0). This remains the unique ground state for *B* ≥ 1, where all spins align with the field.
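The minimization of the trial Hamiltonian (10.80) is easy to reproduce numerically. The following sketch uses a plain grid search on [0, π/2] together with the symmetry θ ↦ π − θ of (10.80); the grid size and the function names are arbitrary illustrative choices.

```python
import math

def trial_energy(theta: float, field: float) -> float:
    """Classical trial Hamiltonian (10.80), i.e. (10.79) at phi = 0."""
    return -(0.5 * math.cos(theta) ** 2 + field * math.sin(theta))

def classical_ground_states(field: float, steps: int = 20000):
    """Grid-minimize (10.80) on [0, pi/2]; the mirror angle pi - theta has equal energy.

    Returns the two spin vectors (x, y, z) on the sphere of radius 1/2;
    they coincide (up to rounding) when the minimizer is theta = pi/2.
    """
    thetas = [0.5 * math.pi * k / steps for k in range(steps + 1)]
    theta = min(thetas, key=lambda t: trial_energy(t, field))
    x, z = 0.5 * math.sin(theta), 0.5 * math.cos(theta)
    return (x, 0.0, z), (x, 0.0, -z)
```

For a field below 1, e.g. `classical_ground_states(0.6)`, the two vectors should approximate (10.81); for a field above 1 both collapse onto the *x*-axis.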

In the regime 0 < *B* < 1 with large but finite *N*, one finds a far-reaching analogy between the double-well potential and the quantum Ising chain, namely:


$$
\Delta\_N \approx (1 - B^2) B^N \ (N \to \infty), \tag{10.82}
$$

for the quantum Ising chain. Thus both models show exponential decay, i.e. of (10.82) in *N* as *N* → ∞, and of (10.13) in 1/ħ as ħ → 0.
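To get a feeling for how fast the gap closes, one can simply evaluate the asymptotic formula (10.82). The sketch below does only that (it does not diagonalize the Hamiltonian), and the function names are illustrative.

```python
import math

def ising_gap_estimate(field: float, n: int) -> float:
    """Asymptotic estimate (10.82) of the gap; meaningful for 0 < field < 1 and large n."""
    return (1.0 - field ** 2) * field ** n

def decay_rate(field: float) -> float:
    """Rate c in Delta_N ~ exp(-c N), read off from (10.82) as c = -log(B)."""
    return -math.log(field)

if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, ising_gap_estimate(0.5, n))
```

Adding two sites multiplies the estimate by *B*², so for *B* = 1/2 the gap shrinks by a factor of four every two sites.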

It should be mentioned that *exponential* decay of the energy gap seems a low-dimensional luxury, which is not really needed for SSB. All that counts is that lim<sub>*N*→∞</sub> Δ<sub>*N*</sub> = 0, which guarantees that the first excited state is asymptotically degenerate with the ground state, so that appropriate linear combinations like ω<sub>0</sub><sup>±</sup> can be formed that converge to the degenerate symmetry-breaking *pure* (and hence physical) ground states (or *extreme* and hence physical KMS states) of the limit system, which are localized and stable (as is clear from the double well). The fact that in the two models at hand only *one* excited state participates in this mechanism is due to the simple Z<sub>2</sub> symmetry that is being broken; SSB of continuous symmetries requires a large number of low-lying states that are asymptotically degenerate with the ground state and hence also with each other (one speaks of a *thin* energy spectrum).

The existence of low-lying excited states may be proved abstractly (i.e., in a model-independent way), as follows. For *N* < ∞, let Ψ<sub>*N*</sub><sup>(0)</sup> be the ground state (assumed unique) of some model defined on Λ<sub>*N*</sub> ⊂ ℤ<sup>*d*</sup>, and let φ be an order parameter (cf. Theorem 10.7) with accompanying vector field Φ<sub>*N*</sub> = ∑<sub>*x*∈Λ<sub>*N*</sub></sub> φ(*x*); in the quantum Ising chain, we take φ = σ<sub>1</sub>. Then the key assumptions are expressed by

$$
\langle \Psi\_N^{(0)}, \Phi\_N \Psi\_N^{(0)} \rangle = 0; \tag{10.83}
$$

$$\langle \Phi\_N \Psi\_N^{(0)}, \Phi\_N \Psi\_N^{(0)} \rangle \ge C\_1 \cdot N^2 \ (N \to \infty, \ C\_1 > 0);\tag{10.84}$$

$$\left\| [[\Phi\_N, h\_N], \Phi\_N] \right\| \le C\_2 \cdot N \ (N \to \infty, \ C\_2 > 0).\tag{10.85}$$

The first states that the ground state is symmetric, the second enforces long-range order, as in (10.41), and the third follows from having short-range forces. A simple computation then shows that the unit vector Ψ̃<sub>*N*</sub><sup>(1)</sup> = Φ<sub>*N*</sub>Ψ<sub>*N*</sub><sup>(0)</sup>/‖Φ<sub>*N*</sub>Ψ<sub>*N*</sub><sup>(0)</sup>‖ satisfies

$$
\langle \tilde{\Psi}\_N^{(1)}, h\_N \tilde{\Psi}\_N^{(1)} \rangle - \langle \Psi\_N^{(0)}, h\_N \Psi\_N^{(0)} \rangle \le C\_2 / (C\_1 N) \ (N \to \infty). \tag{10.86}
$$

Since Ψ̃<sub>*N*</sub><sup>(1)</sup> is orthogonal to Ψ<sub>*N*</sub><sup>(0)</sup> by (10.83), the variational principle for eigenvalues (note that *h*<sub>*N*</sub> has discrete spectrum, as dim(*H*<sub>Λ<sub>*N*</sub></sub>) < ∞) then gives Δ<sub>*N*</sub> ≤ *C*<sub>2</sub>/(*C*<sub>1</sub>*N*), so that Δ<sub>*N*</sub> vanishes as *N* → ∞, though perhaps not as quickly as (10.82) indicates.
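The "simple computation" behind (10.86) can be sketched as follows, under the assumption (satisfied for φ = σ<sub>1</sub>) that Φ<sub>*N*</sub> is self-adjoint, and writing *E*<sub>*N*</sub><sup>(0)</sup> for the ground state energy of *h*<sub>*N*</sub>. Expanding the double commutator and using that the ground state is an eigenvector of *h*<sub>*N*</sub> with eigenvalue *E*<sub>*N*</sub><sup>(0)</sup> gives

$$\langle \Psi\_N^{(0)}, [[\Phi\_N, h\_N], \Phi\_N] \Psi\_N^{(0)} \rangle = 2 \langle \Phi\_N \Psi\_N^{(0)}, (h\_N - E\_N^{(0)}) \Phi\_N \Psi\_N^{(0)} \rangle,$$

so that, bounding the left-hand side by the norm estimate (10.85) and dividing by the lower bound (10.84),

$$\langle \tilde{\Psi}\_N^{(1)}, h\_N \tilde{\Psi}\_N^{(1)} \rangle - E\_N^{(0)} = \frac{\langle \Psi\_N^{(0)}, [[\Phi\_N, h\_N], \Phi\_N] \Psi\_N^{(0)} \rangle}{2 \, \| \Phi\_N \Psi\_N^{(0)} \|^2} \le \frac{C\_2 N}{2 C\_1 N^2} \le \frac{C\_2}{C\_1 N}.$$

In this sketch the bound even comes with an extra factor ½, which is harmless for the conclusion.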

#### 10.6 Exact solution of the quantum Ising chain: *N* < ∞

The solution of the quantum Ising chain is based on a transformation to fermionic variables. Let *H* be a Hilbert space and let *F*−(*H*) be its *fermionic Fock space*, i.e.,

$$F\_{-}(H) = \oplus\_{k=0}^{\infty} H\_{-}^{k},\tag{10.87}$$

where *H*<sup>0</sup> = ℂ, and for *k* > 0 the Hilbert space *H*<sub>−</sub><sup>*k*</sup> = *e*<sub>−</sub><sup>(*k*)</sup>*H*<sup>*k*</sup> is the totally antisymmetrized *k*-fold tensor product of *H* with itself, see also §7.7. Here the projection *e*<sub>−</sub><sup>(*k*)</sup> : *H*<sup>*k*</sup> → *H*<sup>*k*</sup> is defined by linear extension of

$$e\_{-}^{(k)}f\_{1}\otimes\cdots\otimes f\_{k} = \frac{1}{k!} \sum\_{p\in\mathfrak{S}\_{k}} \text{sgn}(p) f\_{p(1)}\otimes\cdots\otimes f\_{p(k)},\tag{10.88}$$

where 𝔖<sub>*k*</sub> is the permutation group on *k* objects, and sgn(*p*) is +1/−1 if *p* is an even/odd permutation. With the (total) Fock space *F*(*H*) = ⊕<sub>*k*=0</sub><sup>∞</sup> *H*<sup>*k*</sup> we have *F*<sub>−</sub>(*H*) = *e*<sub>−</sub>*F*(*H*), where *e*<sub>−</sub> = ∑<sub>*k*</sub> *e*<sub>−</sub><sup>(*k*)</sup> (strongly) is a projection. For *f* ∈ *H* we define the (unbounded) *annihilation operator a*(*f*) on *F*(*H*) by (finite) linear extension of

$$a(f)f\_1 \otimes \cdots \otimes f\_k = \sqrt{k} \langle f, f\_1 \rangle\_H \, f\_2 \otimes \cdots \otimes f\_k,\tag{10.89}$$

for *k* > 0, with *a*(*f*)*z* = 0 on *H*<sup>0</sup> = ℂ. This gives the adjoint *a*(*f*)<sup>∗</sup> ≡ *a*<sup>∗</sup>(*f*) as

$$a^\*(f)f\_1 \otimes \cdots \otimes f\_k = \sqrt{k+1}f \otimes f\_1 \otimes \cdots \otimes f\_k. \tag{10.90}$$

For each *f* ∈ *H*, we then define the following operators on *F*−(*H*):

$$c(f) = e\_-a(f)e\_-;\tag{10.91}$$

$$c^\*(f) = e\_-a^\*(f)e\_-. \tag{10.92}$$

Note that the map *f* → *c*(*f*) is *antilinear* in *f*, whereas *f* → *a*<sup>∗</sup>(*f*) is *linear* in *f*. It follows that *c*<sup>∗</sup>(*f*) = *c*(*f*)<sup>∗</sup>, that each operator *c*(*f*) and *c*<sup>∗</sup>(*f*) on *F*<sub>−</sub>(*H*) is bounded with ‖*c*(*f*)‖ = ‖*c*<sup>∗</sup>(*f*)‖ = ‖*f*‖, and the *canonical anticommutation relations* hold:

$$[c(f), c^\*(g)]\_+ = \langle f, g \rangle\_{H} \cdot 1\_{F\_{-}(H)};\tag{10.93}$$

$$[c(f), c(g)]\_{+} = [c^\*(f), c^\*(g)]\_{+} = 0. \tag{10.94}$$

Thus we may define CAR(*H*) as the C\*-algebra within *B*(*F*<sub>−</sub>(*H*)) generated by all *c*(*f*), where *f* ∈ *H*. This is called the C\*-algebra of *canonical anticommutation relations* over *H*, which we have constructed in its defining representation on *F*<sub>−</sub>(*H*). Choosing an orthonormal basis (*e*<sub>*i*</sub>) of *H* and writing *c*(*e*<sub>*i*</sub>) = *c*<sub>*i*</sub> etc. clearly yields

$$[c\_i, c\_j^\*]\_+ = \delta\_{ij} \cdot 1\_{F\_-(H)};\tag{10.95}$$

$$[c\_i, c\_j]\_+ = [c\_i^\*, c\_j^\*]\_+ = 0.\tag{10.96}$$

If dim(*H*) = *N* < ∞, then CAR(*H*) = *B*(*F*−(*H*)). First, a dimension count yields

$$F\_{-}(\mathbb{C}^{N}) = \oplus\_{k=0}^{N} H\_{-}^{k} \cong \mathbb{C}^{2^{N}} \cong \otimes^{N} \mathbb{C}^{2}.\tag{10.97}$$

By Theorem C.90, the C\*-algebra CAR(*H*) acts irreducibly on *F*−(*H*), so that

$$\mathbf{CAR}(\mathbb{C}^{N}) \cong \mathbf{M}\_{2^{N}}(\mathbb{C}).\tag{10.98}$$

This is already nontrivial for *N* = 1. In that case, *F*<sub>−</sub>(ℂ) = ℂ ⊕ ℂ = ℂ<sup>2</sup>, and

$$c = \sigma\_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix};\tag{10.99}$$

$$c^\* = \sigma\_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},\tag{10.100}$$

where σ<sub>±</sub> = ½(σ<sub>1</sub> ± *i*σ<sub>2</sub>). This realization explicitly shows that

$$\mathbf{CAR}(\mathbb{C}) = M\_2(\mathbb{C}).\tag{10.101}$$

To generalize this to *N* > 1, we introduce a lattice (or chain) <u>*N*</u> = {1,...,*N*}, and for each *x* ∈ <u>*N*</u> we define operators *c*<sub>*x*</sub>, *c*<sub>*x*</sub><sup>∗</sup> by the *Jordan–Wigner transformation*

$$c\_{x} = e^{\pi i \sum\_{y=1}^{x-1} \sigma\_{+}(y) \sigma\_{-}(y)} \sigma\_{-}(x) = \left( \prod\_{y=1}^{x-1} (-\sigma\_{3})(y) \right) \cdot \sigma\_{-}(x);\tag{10.102}$$

$$c\_{x}^{\*} = e^{-\pi i \sum\_{y=1}^{x-1} \sigma\_{+}(y) \sigma\_{-}(y)} \sigma\_{+}(x) = \left(\prod\_{y=1}^{x-1} (-\sigma\_{3})(y)\right) \cdot \sigma\_{+}(x), \tag{10.103}$$

where *x* > 1, and *c*<sub>1</sub> = σ<sub>−</sub>(1) and *c*<sub>1</sub><sup>∗</sup> = σ<sub>+</sub>(1) (here σ<sub>±</sub>(*x*) = ½(σ<sub>1</sub>(*x*) ± *i*σ<sub>2</sub>(*x*)) etc.). These operators satisfy (10.95) - (10.96); the second expression on each line follows because the operators σ<sub>+</sub>(*y*)σ<sub>−</sub>(*y*) commute for different sites *y*, and

$$e^{\pi i \sigma\_+ \sigma\_-} = -\sigma\_3. \tag{10.104}$$

Furthermore, since

$$c\_x^\* c\_x = \sigma\_+(x) \sigma\_-(x) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} (x);\tag{10.105}$$

$$c\_{x}c\_{x}^{\*} = \sigma\_{-}(x)\sigma\_{+}(x) = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}(x),\tag{10.106}$$

the inverse of the Jordan–Wigner transformation is given by

$$
\sigma\_{-}(x) = e^{-\pi i \sum\_{y=1}^{x-1} c\_{y}^\* c\_{y}} c\_{x};\tag{10.107}
$$

$$\sigma\_{+}(x) = c\_{x}^{\*} e^{\pi i \sum\_{y=1}^{x-1} c\_{y}^{\*} c\_{y}}.\tag{10.108}$$
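The Jordan–Wigner construction and its inverse can be verified on a short chain. The sketch below assumes NumPy and uses the matrix conventions of (10.99) - (10.100); the helper names `site_op`, `jordan_wigner`, `car_holds` and `inverse_holds` are hypothetical. It builds the *c*<sub>*x*</sub> from the product form in (10.102) and checks the CAR (10.95) - (10.96) as well as the inverse formula (10.107).

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
S3 = np.diag([1.0, -1.0])
SM = np.array([[0.0, 0.0], [1.0, 0.0]])  # sigma_-, as in (10.99)

def site_op(op, x, n_sites):
    """Embed the 2x2 matrix `op` at site x (1-based) of the chain."""
    factors = [op if y == x else I2 for y in range(1, n_sites + 1)]
    return reduce(np.kron, factors)

def jordan_wigner(n_sites):
    """Return [c_1, ..., c_N] built from the product form in (10.102)."""
    ops = []
    for x in range(1, n_sites + 1):
        string = np.eye(2 ** n_sites)
        for y in range(1, x):
            string = string @ site_op(-S3, y, n_sites)
        ops.append(string @ site_op(SM, x, n_sites))
    return ops

def car_holds(ops):
    """Check (10.95)-(10.96); the matrices are real, so c* is the transpose."""
    one = np.eye(ops[0].shape[0])
    for i, ci in enumerate(ops):
        for j, cj in enumerate(ops):
            expected = one if i == j else 0 * one
            if not np.allclose(ci @ cj.T + cj.T @ ci, expected):
                return False
            if not np.allclose(ci @ cj + cj @ ci, 0):
                return False
    return True

def inverse_holds(ops):
    """Check (10.107): sigma_-(x) = exp(-pi i sum_{y<x} c_y* c_y) c_x."""
    n_sites = len(ops)
    dim = 2 ** n_sites
    for x in range(1, n_sites + 1):
        number = np.zeros((dim, dim))
        for y in range(x - 1):
            number += ops[y].T @ ops[y]
        # `number` is a diagonal matrix here, so its exponential is entrywise.
        phase = np.diag(np.exp(-1j * np.pi * np.diag(number)))
        if not np.allclose(phase @ ops[x - 1], site_op(SM, x, n_sites)):
            return False
    return True

ops = jordan_wigner(3)
print(car_holds(ops), inverse_holds(ops))
```

The check is exhaustive only for the chain length chosen; it illustrates the algebra rather than replacing the general argument.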

We return to the quantum Ising model (10.60) with free boundary conditions, where we relabel the sites as {1,...,*N*}, as above, and change to the Hamiltonian

$$h\_N^{\rm QI} = -\frac{1}{2} \left( \sum\_{x=1}^{N-1} \sigma\_1(x) \sigma\_1(x+1) + \lambda \sum\_{x=1}^{N} \sigma\_3(x) \right), \tag{10.109}$$

where, in order to avoid notational confusion with the operator *B* in (10.111) below, we henceforth replace *B* ⇝ λ. In terms of the unitary operator *u* = 2<sup>−1/2</sup>(1<sub>2</sub> + *i*σ<sub>2</sub>) on ℂ<sup>2</sup> and hence *u*<sup>(*N*)</sup> = ⊗<sub>*x*=1</sub><sup>*N*</sup> *u*(*x*) on ⊗<sup>*N*</sup>ℂ<sup>2</sup>, we have *u*<sup>(*N*)</sup>*h*<sub>*N*</sub>(*u*<sup>(*N*)</sup>)<sup>∗</sup> = *h*<sub>*N*</sub><sup>QI</sup>.

Using (10.102) - (10.103), up to an additive constant ½λ*N* · 1<sub>*N*</sub> that we omit, we find

$$h\_N^{\rm QI} = -\sum\_{\mathbf{x}=1}^N (\lambda c\_{\mathbf{x}}^\* c\_{\mathbf{x}} + \frac{1}{2} (c\_{\mathbf{x}}^\* - c\_{\mathbf{x}})(c\_{\mathbf{x}+1}^\* + c\_{\mathbf{x}+1})),\tag{10.110}$$

so we now show how to diagonalize quadratic fermionic Hamiltonians of the type

$$h\_N = -\sum\_{\mathbf{x}, \mathbf{y} = 1}^N \left( A\_{\mathbf{x}\mathbf{y}} c\_{\mathbf{x}}^\* c\_{\mathbf{y}} + \frac{1}{2} B\_{\mathbf{x}\mathbf{y}} (c\_{\mathbf{x}}^\* c\_{\mathbf{y}}^\* - c\_{\mathbf{x}} c\_{\mathbf{y}}) \right), \tag{10.111}$$

where *A* and *B* are real *N* ×*N* matrices, with *A*<sup>∗</sup> = *A* and *B*<sup>∗</sup> = −*B*. Indeed, taking

$$A = \frac{1}{2}(S + S^\*) + \lambda \cdot 1\_N;\tag{10.112}$$

$$B = \frac{1}{2}(S - S^\*),\tag{10.113}$$

recovers (10.110), where *S* : ℂ<sup>*N*</sup> → ℂ<sup>*N*</sup> is the *shift operator*, defined by

$$Sf(\mathbf{x}) = f(\mathbf{x} + 1);\tag{10.114}$$

$$S^\*f(\mathbf{x}) = f(\mathbf{x} - \mathbf{1}).\tag{10.115}$$

By convention, *f*(*N* + 1) = *f*(0) = 0 (i.e., *Sf*(*N*) = *S*<sup>∗</sup>*f*(1) = 0 for any *f* ∈ ℂ<sup>*N*</sup>); in terms of the standard basis (υ<sub>*x*</sub>) of ℂ<sup>*N*</sup> we have *S*υ<sub>1</sub> = 0 and *S*υ<sub>*x*</sub> = υ<sub>*x*−1</sub> for *x* ∈ {2,...,*N*}, and likewise *S*<sup>∗</sup>υ<sub>*N*</sub> = 0 and *S*<sup>∗</sup>υ<sub>*x*</sub> = υ<sub>*x*+1</sub> for *x* ∈ {1,...,*N* −1}.
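The passage from the spin Hamiltonian (10.109) to the quadratic fermionic form (10.111) can be checked numerically on a short chain. The following sketch assumes NumPy; all function names are hypothetical. It realizes the *c*<sub>*x*</sub> by the Jordan–Wigner product formula (10.102), takes *A* and *B* from (10.112) - (10.113) with the truncated shift (10.114), and compares both Hamiltonians. In the conventions of this sketch the omitted additive constant works out to λ*N*/2 times the unit matrix.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2)
S1 = np.array([[0.0, 1.0], [1.0, 0.0]])
S3 = np.diag([1.0, -1.0])
SM = np.array([[0.0, 0.0], [1.0, 0.0]])  # sigma_-, as in (10.99)

def site_op(op, x, n):
    """Embed the 2x2 matrix `op` at site x (1-based) of an n-site chain."""
    return reduce(np.kron, [op if y == x else I2 for y in range(1, n + 1)])

def spin_hamiltonian(n, lam):
    """h^QI_N of (10.109) with free boundary conditions."""
    h = np.zeros((2 ** n, 2 ** n))
    for x in range(1, n):
        h -= 0.5 * site_op(S1, x, n) @ site_op(S1, x + 1, n)
    for x in range(1, n + 1):
        h -= 0.5 * lam * site_op(S3, x, n)
    return h

def jordan_wigner(n):
    """Fermionic operators c_1, ..., c_N via the product form in (10.102)."""
    ops = []
    for x in range(1, n + 1):
        string = np.eye(2 ** n)
        for y in range(1, x):
            string = string @ site_op(-S3, y, n)
        ops.append(string @ site_op(SM, x, n))
    return ops

def fermion_hamiltonian(n, lam):
    """Quadratic form (10.111) with A, B from (10.112)-(10.113)."""
    S = np.eye(n, k=1)  # (S f)(x) = f(x + 1), with f(N + 1) = 0
    A = 0.5 * (S + S.T) + lam * np.eye(n)
    B = 0.5 * (S - S.T)
    c = jordan_wigner(n)
    h = np.zeros((2 ** n, 2 ** n))
    for x in range(n):
        for y in range(n):
            h -= A[x, y] * (c[x].T @ c[y])
            h -= 0.5 * B[x, y] * (c[x].T @ c[y].T - c[x] @ c[y])
    return h

n, lam = 4, 0.7
difference = spin_hamiltonian(n, lam) - fermion_hamiltonian(n, lam)
print(np.allclose(difference, 0.5 * lam * n * np.eye(2 ** n)))
```

Since the matrices are real, adjoints are implemented as transposes; this is a property of the chosen representation, not of the CAR as such.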

The smart thing to do now turns out to be diagonalizing the 2*N* ×2*N*-matrix

$$M = \begin{pmatrix} A & B \\ -B & -A \end{pmatrix},\tag{10.116}$$

which by a unitary transformation may be brought into the simpler form

$$M' = \begin{pmatrix} \sqrt{1/2} & -\sqrt{1/2} \\ \sqrt{1/2} & \sqrt{1/2} \end{pmatrix} \begin{pmatrix} A & B \\ -B & -A \end{pmatrix} \begin{pmatrix} \sqrt{1/2} & \sqrt{1/2} \\ -\sqrt{1/2} & \sqrt{1/2} \end{pmatrix} = \begin{pmatrix} 0 & C \\ C^\* & 0 \end{pmatrix},\tag{10.117}$$

where *C* = *A*+*B*. For example, for the quantum Ising chain (10.110) we simply have

$$C = S + \lambda \cdot 1\_N. \tag{10.118}$$

The equations for the eigenvalues ε<sub>*k*</sub> and eigenvectors of *M*′, i.e.,

$$M'\begin{pmatrix}\varphi\_{k} \\ \psi\_{k}\end{pmatrix} = \varepsilon\_{k} \begin{pmatrix} \varphi\_{k} \\ \psi\_{k} \end{pmatrix} \tag{10.119}$$

where ϕ<sub>*k*</sub>, ψ<sub>*k*</sub> ∈ ℂ<sup>*N*</sup>, are equivalent to both the coupled system of equations

$$C\psi\_k = \varepsilon\_k \varphi\_k;\tag{10.120}$$

$$C^\*\varphi\_k = \varepsilon\_k \psi\_k;\tag{10.121}$$

$$C = A + B,\tag{10.122}$$

where the eigenvalues ε<sub>*k*</sub> are real (since *M*<sup>∗</sup> = *M*), and to the uncoupled version

$$CC^\*\varphi\_{k} = \varepsilon\_{k}^2 \varphi\_{k};\tag{10.123}$$

$$C^\*C\psi\_k = \varepsilon\_k^2 \psi\_k;\tag{10.124}$$

$$CC^\* = A^2 - B^2 - [A, B];\tag{10.125}$$

$$C^\*C = A^2 - B^2 + [A, B].\tag{10.126}$$

Without loss of generality we may (and will) assume that the ϕ<sub>*k*</sub>, ψ<sub>*k*</sub> are unit vectors in ℂ<sup>*N*</sup>, so that the corresponding unit vector in ℂ<sup>2*N*</sup> is (ϕ<sub>*k*</sub>, ψ<sub>*k*</sub>)/√2. Furthermore, since *C* (or *M*) is a matrix with real entries and the ε<sub>*k*</sub> are real, by a suitable choice of phase we may (and will) also arrange that ϕ<sub>*k*</sub>, ψ<sub>*k*</sub> have real components. Finally, it follows from (10.120) - (10.121) that (−ϕ<sub>*k*</sub>, ψ<sub>*k*</sub>) is an eigenvector of *M*′ with eigenvalue −ε<sub>*k*</sub>, so that the unitary transformation *U*′ that diagonalizes *M*′, i.e.,

$$(U')^{-1}M'U' = \begin{pmatrix} -E & 0\\ 0 & E \end{pmatrix},\tag{10.127}$$

where *E* = diag(ε1,..., ε*N*), takes the form

$$U' = \frac{1}{\sqrt{2}} \begin{pmatrix} \varphi & -\varphi \\ \psi & \psi \end{pmatrix},\tag{10.128}$$

where ϕ is the *N* × *N* matrix (ϕ1,...,ϕ*N*), seeing each vector ϕ*<sup>i</sup>* as a column, etc. Combined with (10.117), we obtain

$$U^{-1}MU = \begin{pmatrix} -E & 0\\ 0 & E \end{pmatrix};\tag{10.129}$$

$$U = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \cdot \begin{pmatrix} \varphi & -\varphi \\ \psi & \psi \end{pmatrix} = \frac{1}{2} \begin{pmatrix} \psi + \varphi & \psi - \varphi \\ \psi - \varphi & \psi + \varphi \end{pmatrix} \equiv \begin{pmatrix} u & v \\ v & u \end{pmatrix},\tag{10.130}$$

where we introduced *N* ×*N* matrices

$$
u = \frac{1}{2}(\psi + \varphi);\tag{10.131}
$$

$$
v = \frac{1}{2}(\psi - \varphi). \tag{10.132}
$$

Using orthonormality and completeness of both the (ϕ<sub>*k*</sub>) and the (ψ<sub>*k*</sub>), one obtains

$$
u^\* u + v^\* v = 1\_H;\tag{10.133}
$$

$$
u^\* v + v^\* u = 0; \tag{10.134}
$$

$$
u u^\* + v v^\* = 1\_H;\tag{10.135}
$$

$$
u v^\* + v u^\* = 0.\tag{10.136}
$$

Of course, *u* and *v* are far from unique, as they depend on both the ordering and the phases of the vectors ϕ<sub>*k*</sub> and ψ<sub>*k*</sub>. In partial remedy of the former ambiguity we assume that 0 ≤ ε<sub>1</sub> ≤ ε<sub>2</sub> ≤ ··· ≤ ε<sub>*N*</sub> (which can be arranged by a suitable ordering as well as choice of sign of the eigenvectors ϕ<sub>*k*</sub>). Towards the latter, we already agreed that both the ϕ<sub>*k*</sub> and ψ<sub>*k*</sub> are real, so that also our matrices *u* and *v* have real entries.
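The diagonalization just described can be reproduced numerically from a singular value decomposition of *C* = *A* + *B*, since the coupled eigenvalue equations for ϕ<sub>*k*</sub> and ψ<sub>*k*</sub> say that the ε<sub>*k*</sub> are the singular values of *C*. The sketch below assumes NumPy; the function names are hypothetical. One sign choice is an assumption of the sketch: the columns of ϕ are taken as *minus* the left singular vectors, so that the block of *U*<sup>−1</sup>*MU* belonging to the first *N* columns of *U* is −*E*, as in (10.129); with the opposite sign the two diagonal blocks are interchanged.

```python
import numpy as np

def ising_blocks(n, lam):
    """A and B of (10.112)-(10.113) for the truncated shift (10.114)."""
    S = np.eye(n, k=1)
    A = 0.5 * (S + S.T) + lam * np.eye(n)
    B = 0.5 * (S - S.T)
    return A, B

def bogoliubov_matrices(n, lam):
    """Return (eps, u, v) with eps ascending, from the SVD of C = A + B."""
    A, B = ising_blocks(n, lam)
    W, sing, Vt = np.linalg.svd(A + B)   # C = W diag(sing) Vt
    order = np.argsort(sing)             # ascending singular values
    eps = sing[order]
    psi = Vt.T[:, order]                 # right singular vectors
    phi = -W[:, order]                   # sign choice, see the lead-in
    return eps, 0.5 * (psi + phi), 0.5 * (psi - phi)

def relations_hold(n, lam):
    """Check (10.133)-(10.136) and the block-diagonal form (10.129)."""
    A, B = ising_blocks(n, lam)
    eps, u, v = bogoliubov_matrices(n, lam)
    one = np.eye(n)
    M = np.block([[A, B], [-B, -A]])
    U = np.block([[u, v], [v, u]])
    D = np.diag(np.concatenate([-eps, eps]))
    checks = [
        np.allclose(u.T @ u + v.T @ v, one),
        np.allclose(u.T @ v + v.T @ u, 0),
        np.allclose(u @ u.T + v @ v.T, one),
        np.allclose(u @ v.T + v @ u.T, 0),
        np.allclose(U.T @ M @ U, D),     # U is real orthogonal here
    ]
    return all(checks)

print(relations_hold(6, 0.5))
```

Using the SVD rather than a direct eigenvalue routine for *M* keeps the pairing of each ϕ<sub>*k*</sub> with its ψ<sub>*k*</sub> explicit, which is what the definitions of *u* and *v* rely on.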

We now explain the purpose of diagonalizing *M* in (10.116) using *u* and *v*.

Proposition 10.12. *Let* u *and* v *be operators on a Hilbert space H, where* u *is* linear *and* v *is* anti-linear*. Let c*(*f*) *and c*∗(*f*) *be the operators* (10.91) *-* (10.92)*, satisfying the CAR* (10.93) *-* (10.94)*. Define the* Bogoliubov transformation

$$
\eta(f) = c(\mathfrak{u}f) + c^\*(\mathfrak{v}f);\tag{10.137}
$$

$$
\eta^\*(f) = c^\*(\mathfrak{u}f) + c(\mathfrak{v}f),
\tag{10.138}
$$

*which extends to a linear map* α : CAR(*H*) → CAR(*H*)*, where* η(*f*) = α(*c*(*f*)) *etc. Then* α *is a homomorphism of C\*-algebras, or, equivalently, one has the CAR*

$$[\eta(f), \eta^\*(g)]\_+ = \langle f, g \rangle\_H \cdot 1\_H;\tag{10.139}$$

$$[\eta(f), \eta(g)]\_{+} = [\eta^\*(f), \eta^\*(g)]\_{+} = 0,\tag{10.140}$$

*iff* u *and* v *satisfy* (10.133) *-* (10.134)*, with u* ⇝ u, *v* ⇝ v*. Moreover,* α *is invertible (and hence defines an automorphism of* CAR(*H*)*) iff in addition* (10.135) *-* (10.136) *are valid (again with u* ⇝ u, *v* ⇝ v*), in which case the inverse is*

$$c(f) = \eta(\mathfrak{u}^\* f) + \eta^\*(\mathfrak{v}^\* f);\tag{10.141}$$

$$c^\*(f) = \eta^\*(\mathfrak{u}^\*f) + \eta(\mathfrak{v}^\*f). \tag{10.142}$$

Note that anti-linearity of v is needed to make *f* → η(*f*) anti-linear, like *f* → *c*(*f*). With respect to a basis (*e*<sub>*i*</sub>) of *H*, the transformations (10.137) - (10.142) read

$$
\eta\_i = \sum\_j (\overline{u}\_{ji}c\_j + v\_{ji}c\_j^\*);\tag{10.143}
$$

$$
\eta\_i^\* = \sum\_j (u\_{ji} c\_j^\* + \overline{v}\_{ji} c\_j);\tag{10.144}
$$

$$c\_i = \sum\_j (u\_{ij}\eta\_j + \overline{v}\_{ij}\eta\_j^\*);\tag{10.145}$$

$$c\_i^\* = \sum\_j (\overline{u}\_{ij}\eta\_j^\* + v\_{ij}\eta\_j). \tag{10.146}$$

*Proof.* The proof is a straightforward computation. □

In comparison with the preceding diagonalization process, where *H* = ℂ<sup>*N*</sup>, we notice that in this process *u* and *v* were both linear, while in Proposition 10.12 u is linear and v is antilinear. This difference is easily overcome by taking u = *u* and v = *Jv*, where *J* : ℂ<sup>*N*</sup> → ℂ<sup>*N*</sup> is the anti-linear map given by complex conjugation, *Jf*(*x*) = <span style="text-decoration: overline">*f*(*x*)</span>, so that *J* is a *conjugation* in being an anti-linear map that satisfies *J*<sup>∗</sup> = *J*<sup>−1</sup> = *J*.

Returning to our generic Hamiltonian (10.111), a straightforward computation using (10.145) - (10.146), (10.116), (10.129), and (10.133) - (10.136) yields

$$h\_N = \sum\_{k=1}^N \varepsilon\_k \eta\_k^\* \eta\_k,\tag{10.147}$$

up to a (computable) constant, where we recall that ε<sub>*k*</sub> ≥ 0 (*k* = 1,...,*N*). Note that *h*<sub>*N*</sub> is still defined on the fermionic Fock space *F*<sub>−</sub>(ℂ<sup>*N*</sup>), as *h*<sub>*N*</sub> is a (complicated) *quadratic* expression in the operators *c*<sub>*i*</sub> and *c*<sub>*i*</sub><sup>∗</sup> on *F*<sub>−</sub>(ℂ<sup>*N*</sup>). The point is that (as a consequence of Proposition 10.12) the η<sub>*k*</sub> and η<sub>*k*</sub><sup>∗</sup> also satisfy the CAR, i.e.,

$$[\eta\_i, \eta\_j^\*]\_+ = \delta\_{ij} \cdot 1\_{F\_-(H)};\tag{10.148}$$

$$[\eta\_i, \eta\_j]\_+ = [\eta\_i^\*, \eta\_j^\*]\_+ = 0. \tag{10.149}$$
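As a numerical illustration of the diagonal form (10.147) and of the CAR for the transformed operators, the following sketch (assuming NumPy, with hypothetical function names) realizes the *c*<sub>*x*</sub> as matrices via the Jordan–Wigner product formula, builds η<sub>*k*</sub> = ∑<sub>*j*</sub>(*u*<sub>*jk*</sub>*c*<sub>*j*</sub> + *v*<sub>*jk*</sub>*c*<sub>*j*</sub><sup>∗</sup>) for real *u* and *v*, and compares the quadratic Hamiltonian (10.111) with ∑<sub>*k*</sub> ε<sub>*k*</sub>η<sub>*k*</sub><sup>∗</sup>η<sub>*k*</sub>. Two points are assumptions of the sketch rather than statements from the text: the sign of ϕ is fixed as minus the left singular vectors of *C* (which yields nonnegative ε<sub>*k*</sub> in the diagonal form), and the additive constant comes out as −½(∑<sub>*k*</sub> ε<sub>*k*</sub> + tr *A*).

```python
import numpy as np
from functools import reduce

def jordan_wigner(n):
    """Matrices c_1, ..., c_N on the 2^N-dimensional Fock space."""
    I2, s3 = np.eye(2), np.diag([1.0, -1.0])
    sm = np.array([[0.0, 0.0], [1.0, 0.0]])  # sigma_-

    def site(op, x):
        return reduce(np.kron, [op if y == x else I2 for y in range(n)])

    ops = []
    for x in range(n):
        string = np.eye(2 ** n)
        for y in range(x):
            string = string @ site(-s3, y)
        ops.append(string @ site(sm, x))
    return ops

def quasi_particles(n, lam):
    """Return (h, eps, eta, tr_A) for the quantum Ising chain."""
    S = np.eye(n, k=1)
    A = 0.5 * (S + S.T) + lam * np.eye(n)
    B = 0.5 * (S - S.T)
    W, sing, Vt = np.linalg.svd(A + B)
    order = np.argsort(sing)
    eps, psi, phi = sing[order], Vt.T[:, order], -W[:, order]  # sign assumption
    u, v = 0.5 * (psi + phi), 0.5 * (psi - phi)
    c = jordan_wigner(n)
    h = -sum(A[x, y] * (c[x].T @ c[y])
             + 0.5 * B[x, y] * (c[x].T @ c[y].T - c[x] @ c[y])
             for x in range(n) for y in range(n))
    eta = [sum(u[j, k] * c[j] + v[j, k] * c[j].T for j in range(n))
           for k in range(n)]
    return h, eps, eta, np.trace(A)

def car_holds(ops):
    """Anticommutation relations for real matrices (adjoint = transpose)."""
    one = np.eye(ops[0].shape[0])
    return all(
        np.allclose(a @ b.T + b.T @ a, one if i == j else 0 * one)
        and np.allclose(a @ b + b @ a, 0)
        for i, a in enumerate(ops) for j, b in enumerate(ops)
    )

h, eps, eta, tr_A = quasi_particles(3, 0.5)
diagonal = sum(e * (m.T @ m) for e, m in zip(eps, eta))
constant = -0.5 * (eps.sum() + tr_A)
print(car_holds(eta), np.allclose(h, diagonal + constant * np.eye(8)))
```

The comparison also confirms that the spectrum of *h*<sub>*N*</sub> consists of the ground state energy plus all sums of subsets of the ε<sub>*k*</sub>, as expected for free fermions.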

Theorem 10.13. *Let A* = CAR(ℂ<sup>*N*</sup>) *be the CAR-algebra over H* = ℂ<sup>*N*</sup> *with dynamics* α<sub>*t*</sub>(*a*) = *e*<sup>*ith*<sub>*N*</sub></sup>*ae*<sup>−*ith*<sub>*N*</sub></sup> *given by* (10.111) *and hence by* (10.147)*. Then* α *has a unique (and hence pure and symmetric) ground state* ω<sub>0</sub>*, specified by the property*

$$
\pi\_{\omega\_0}(\eta(f))\Omega\_{\omega\_0} = 0 \ (f \in \mathbb{C}^N). \tag{10.150}
$$

*Proof.* Recall that α defines a derivation δ : CAR(ℂ<sup>*N*</sup>) → CAR(ℂ<sup>*N*</sup>) defined by (9.54), which in the case at hand is simply given by δ(*a*) = *i*[*h*<sub>*N*</sub>, *a*] (since *A* is finite-dimensional, δ is bounded and hence defined everywhere). Using the identity

$$[ab,c] = a[b,c]\_+ - [c,a]\_+b,\tag{10.151}$$

as well as the relations (10.148) - (10.149), we obtain δ(η*k*) = −*i*ε*k*η*k*, and hence

$$-i\omega\_0(\eta\_k^\*\delta(\eta\_k)) = -\varepsilon\_k \, \omega\_0(\eta\_k^\*\eta\_k).\tag{10.152}$$

The condition −*i*ω<sub>0</sub>(*a*<sup>∗</sup>δ(*a*)) ≥ 0, i.e., eq. (9.56) from Proposition 9.20, therefore implies (for ε<sub>*k*</sub> > 0) that ω<sub>0</sub>(η<sub>*k*</sub><sup>∗</sup>η<sub>*k*</sub>) ≤ 0, and hence ω<sub>0</sub>(η<sub>*k*</sub><sup>∗</sup>η<sub>*k*</sub>) = 0 by positivity of ω<sub>0</sub>. Since *F*<sub>−</sub>(*H*) is finite-dimensional and *A* ≅ *B*(*F*<sub>−</sub>(*H*)), cf. (10.98), we may assume ground state(s) to be pure and normal, i.e., there is some unit vector ψ<sub>0</sub> ∈ *F*<sub>−</sub>(*H*) with ω<sub>0</sub>(*a*) = ⟨ψ<sub>0</sub>, *a*ψ<sub>0</sub>⟩ for each *a* ∈ *A*. Hence ⟨ψ<sub>0</sub>, η<sub>*k*</sub><sup>∗</sup>η<sub>*k*</sub>ψ<sub>0</sub>⟩ = 0, which enforces

$$
\eta\_k \psi\_0 = 0 \ (k = 1, \ldots, N). \tag{10.153}
$$

This property makes ψ<sup>0</sup> unique up to a phase. Indeed, together with (10.148) - (10.149), eq. (10.153) implies the values of all one- and two-point functions, i.e.,

$$\omega\_0(\eta(f)) = \omega\_0(\eta^\*(f)) = 0;\tag{10.154}$$

$$\omega\_0(\eta^\*(f)\eta(g)) = \omega\_0(\eta^\*(f)\eta^\*(g)) = \omega\_0(\eta(f)\eta(g)) = 0; \tag{10.155}$$

$$
\omega\_0(\eta(f)\eta^\*(g)) = \langle f, g \rangle\_{H}.\tag{10.156}
$$

Furthermore, the value of ω<sub>0</sub> on any product of an odd number of η(*f*) and η<sup>∗</sup>(*g*) vanishes; for an even number, the value ω<sub>0</sub>(∏<sub>*i*=1</sub><sup>*n*</sup> η(*f*<sub>*i*</sub>) ∏<sub>*j*=1</sub><sup>*n*</sup> η<sup>∗</sup>(*g*<sub>*j*</sub>)) is given by

$$\sum\_{p=1}^{n}(-1)^{n-p} \, \omega\_{0}(\eta(f\_{1})\eta^{\*}(g\_{p})) \, \omega\_{0}\left(\prod\_{i=2}^{n}\eta(f\_{i})\prod\_{j=1,j\neq p}^{n}\eta^{\*}(g\_{j})\right).$$

Hence (10.153) gives ω<sub>0</sub> on all of CAR(ℂ<sup>*N*</sup>). Since CAR(*H*) = *B*(*F*<sub>−</sub>(*H*)), this fixes ψ<sub>0</sub> up to a phase. Eq. (10.150) is just a fancy way of rewriting (10.153). □

By construction, the ground state energy of (10.147) is zero. In connection with our approach to SSB via Butterfield's Principle it is of interest to compute the energy ε<sub>1</sub> of the first excited state. This may be done from (10.120) - (10.121) with (10.122) and the specific expression (10.118) for the quantum Ising chain. Thus we solve

$$\lambda \, \psi\_k(x) + \psi\_k(x+1) = \varepsilon\_k \varphi\_k(x) \; (x = 1, \dots, N, \ \psi\_k(N+1) = 0); \tag{10.157}$$

$$
\lambda \, \varphi\_k(x) + \varphi\_k(x - 1) = \varepsilon\_k \, \psi\_k(x) \; (x = 1, \dots, N, \ \varphi\_k(0) = 0).\tag{10.158}
$$

A solution of this system (with real wave-functions and positive energy) is given by

$$\varphi\_k(x) = C(-1)^k \sin(q\_k(x - N - 1));\tag{10.159}$$

$$
\psi\_k(x) = -C \sin(q\_k x);\tag{10.160}
$$

$$\varepsilon\_{k} = \sqrt{1 + \lambda^{2} + 2\lambda\cos(q\_k)},\tag{10.161}$$

where *C* > 0 is a normalization constant, and *qk* should be solved from

$$(N+1)q\_k = (k-1)\pi + \arctan\left(\frac{\sin q\_k}{\lambda + \cos q\_k}\right). \tag{10.162}$$

For example, for λ = 0 (i.e. no transverse magnetic field) we obtain *qk* = *k*π/*N*, where *k* = 1,...,*N*. For λ > 1 there is a unique real solution *qk* for each *k*, too, and even as *N* → ∞ there is an energy gap ε*<sup>k</sup>* > 0 for each *k*. For 0 < λ < 1, however, there is a *complex* solution *q*<sup>1</sup> = π +*i*ρ, where ρ ∈ R is a solution to

$$\tanh((N+1)\rho) = \frac{\sinh \rho}{\cosh \rho - \lambda}.\tag{10.163}$$

As *N* → ∞, we find ρ ≈ −ln(λ) − (1−λ<sup>2</sup>)λ<sup>2*N*</sup>. Eq. (10.161) then gives

$$
\varepsilon(q\_1) \approx (1 - \lambda^2) \lambda^N \ (N \to \infty), \tag{10.164}
$$

which, recalling that *E*<sub>*N*</sub><sup>(1)</sup> = ε<sub>1</sub> and *E*<sub>*N*</sub><sup>(0)</sup> = 0 and hence Δ<sub>*N*</sub> = ε<sub>1</sub>, confirms (10.82).
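The exponentially small gap can be cross-checked numerically in two independent ways. The sketch below, assuming NumPy and using hypothetical function names, computes ε<sub>1</sub> once as the smallest singular value of *C* = *S* + λ · 1<sub>*N*</sub> from (10.118), and once by solving (10.163) for ρ by bisection and inserting *q*<sub>1</sub> = π + *i*ρ into (10.161), which gives ε<sub>1</sub> = (1 + λ<sup>2</sup> − 2λ cosh ρ)<sup>1/2</sup>. The bisection bracket is an assumption that holds for the parameters used here, not a general-purpose root finder.

```python
import math
import numpy as np

def gap_from_matrix(n, lam):
    """Smallest singular value of C = S + lam * 1 for the truncated shift S."""
    C = np.eye(n, k=1) + lam * np.eye(n)
    return np.linalg.svd(C, compute_uv=False).min()

def gap_from_rho(n, lam, iterations=200):
    """Solve (10.163) for rho by bisection and evaluate (10.161) at q = pi + i rho."""
    def g(rho):
        return math.tanh((n + 1) * rho) - math.sinh(rho) / (math.cosh(rho) - lam)

    hi = -math.log(lam)  # right-hand side of (10.163) equals 1 here, so g(hi) < 0
    lo = 0.5 * hi        # assumed to satisfy g(lo) > 0 (true for the values below)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    rho = 0.5 * (lo + hi)
    return math.sqrt(1 + lam ** 2 - 2 * lam * math.cosh(rho))

def gap_asymptotic(n, lam):
    """Leading-order expression (10.164)."""
    return (1 - lam ** 2) * lam ** n

for n in (4, 8, 12):
    print(n, gap_from_matrix(n, 0.5), gap_from_rho(n, 0.5), gap_asymptotic(n, 0.5))
```

The first two columns should agree to numerical precision, while the third approaches them as *N* grows.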

#### 10.7 Exact solution of the quantum Ising chain: *N* = ∞

The (two-sided) infinite quantum Ising chain is described by the C\*-algebra

$$F = \mathbf{CAR}(\ell^2(\mathbb{Z}));\tag{10.165}$$

one may also consider a one-sided chain, but it lacks translation symmetry. By the construction at the beginning of the previous section, *F* is isomorphic to the infinite tensor product *A* = *M*<sub>2</sub>(ℂ)<sup>∞</sup>. We consider *F* to be generated by the operators *c*<sub>*x*</sub><sup>±</sup> (*x* ∈ ℤ), where *c*<sub>*x*</sub><sup>−</sup> ≡ *c*<sub>*x*</sub> and *c*<sub>*x*</sub><sup>+</sup> ≡ *c*<sub>*x*</sub><sup>∗</sup>. In this notation, the CAR (10.95) - (10.96) read

$$[c\_{x}^{\pm}, c\_{y}^{\mp}]\_{+} = \delta\_{xy};\tag{10.166}$$

$$[c\_x^{\pm}, c\_y^{\pm}]\_+ = 0.\tag{10.167}$$

Although the local Hamiltonians (10.111) do not have a limit as *N* → ∞, as explained in §10.5 they do generate a time-evolution on *F* in the sense of a continuous homomorphism α : R → Aut(*F*) via (10.65) and (10.67); see also Theorem 9.15.

Let us first extend the approach in the previous section to *N* = ∞, in which case ℂ<sup>*N*</sup> is replaced by *H* = ℓ<sup>2</sup>(ℤ), assuming the theory has already been brought into fermionic form with local Hamiltonians (10.111) (as we will see, it is this step, i.e., the Jordan–Wigner transformation, that marks the difference between *N* < ∞ and *N* = ∞). Thus we define operators *A* : ℓ<sup>2</sup>(ℤ) → ℓ<sup>2</sup>(ℤ) and *B* : ℓ<sup>2</sup>(ℤ) → ℓ<sup>2</sup>(ℤ) as the obvious extensions of the *N* × *N* matrices *A* and *B* to operators on ℓ<sup>2</sup>(ℤ), and similarly *S* : ℓ<sup>2</sup>(ℤ) → ℓ<sup>2</sup>(ℤ) is the "full" shift operator, defined by (*Sf*)(*x*) = *f*(*x*+1). Instead of the somewhat clumsy explicit solution procedure sketched in the previous section for *N* < ∞, we may now simply rely on the Fourier transformation

$$\mathcal{F}: \ell^2(\mathbb{Z}) \to L^2([-\pi, \pi]);\tag{10.168}$$

$$(\mathcal{F}f)(k) \equiv \hat{f}(k) = \sum\_{x \in \mathbb{Z}} e^{-ikx} f(x);\tag{10.169}$$

$$(\mathcal{F}^{-1}\hat{f})(x) \equiv f(x) = \int\_{-\pi}^{\pi} \frac{dk}{2\pi} e^{ikx} \hat{f}(k),\tag{10.170}$$

which diagonalizes *A* and *B* to operators *Â*, *B̂* : *L*<sup>2</sup>([−π,π]) → *L*<sup>2</sup>([−π,π]). For the quantum Ising Hamiltonian (10.110) these are given by the multiplication operators

$$
\hat{A}\psi(k) = -(\cos k + \lambda)\psi(k);\tag{10.171}
$$

$$
\hat{B}\psi(k) = -i\sin k \,\psi(k). \tag{10.172}
$$

For fixed *k*, the eigenvalues of the 2×2 matrix

$$M\_k = \begin{pmatrix} -(\cos k + \lambda) & -i \sin k \\ i \sin k & \cos k + \lambda \end{pmatrix},\tag{10.173}$$

are ±ε<sub>*k*</sub>, given by (10.161) with *q*<sub>*k*</sub> ⇝ *k*. It is then routine to find a unitary 2×2 matrix *U*<sub>*k*</sub> that diagonalizes *M*<sub>*k*</sub> in the sense that

$$U\_k = \begin{pmatrix} u\_k & v\_k \\ v\_k & u\_k \end{pmatrix}, \qquad U\_k^{-1} M\_k U\_k = \begin{pmatrix} -\varepsilon\_k & 0 \\ 0 & \varepsilon\_k \end{pmatrix}.$$

Fourier transforming these multiplication operators back to ℓ<sup>2</sup>(ℤ) then yields an operator *U* on ℓ<sup>2</sup>(ℤ) ⊕ ℓ<sup>2</sup>(ℤ) that satisfies (10.129). This yields a unique ground state ω<sub>0</sub> characterized by a property like (10.150) or (10.153), where

$$\eta(\hat{f}) = \int\_{-\pi}^{\pi} \frac{dk}{2\pi} \hat{f}(k)(u\_k \hat{c}\_k + v\_k \hat{c}\_{-k}^\*);\tag{10.174}$$

$$
\hat{c}\_k = \sum\_{j \in \mathbb{Z}} e^{-ijk} c\_j;\tag{10.175}
$$

$$
\hat{c}\_k^\* = \sum\_{j \in \mathbb{Z}} e^{ijk} c\_j^\*. \tag{10.176}
$$
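The spectrum of the 2×2 matrices *M*<sub>*k*</sub> is easy to confirm numerically. The sketch below, assuming NumPy and using the hypothetical names `dispersion` and `bloch_matrix`, compares the eigenvalues of (10.173) on a grid of momenta with ±(1 + λ<sup>2</sup> + 2λ cos *k*)<sup>1/2</sup> and evaluates the minimum of the dispersion over the grid, which is attained at *k* = ±π and equals |1 − λ|.

```python
import numpy as np

def dispersion(k, lam):
    """epsilon_k from (10.161) with q_k replaced by the momentum k."""
    return np.sqrt(1 + lam ** 2 + 2 * lam * np.cos(k))

def bloch_matrix(k, lam):
    """The Hermitian 2x2 matrix M_k of (10.173)."""
    a = -(np.cos(k) + lam)
    b = -1j * np.sin(k)
    return np.array([[a, b], [-b, -a]])

def spectrum_matches(lam, n_points=201):
    """Check that M_k has eigenvalues -epsilon_k and +epsilon_k on a k-grid."""
    for k in np.linspace(-np.pi, np.pi, n_points):
        eigenvalues = np.linalg.eigvalsh(bloch_matrix(k, lam))  # ascending order
        eps = dispersion(k, lam)
        if not np.allclose(eigenvalues, [-eps, eps]):
            return False
    return True

grid = np.linspace(-np.pi, np.pi, 201)
print(spectrum_matches(0.5), dispersion(grid, 0.5).min())
```

On the level of these one-particle energies nothing distinguishes 0 < λ < 1 from λ > 1 except at λ = 1, where the minimum vanishes.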

In summary, one-dimensional fermionic models with quadratic Hamiltonians like (10.111) have a unique ground state even at *N* = ∞. Thus one wonders where SSB in the quantum Ising chain could possibly come from. We will answer this question.

Almost every argument to follow relies on ℤ<sub>2</sub>-symmetry. In general, a ℤ<sub>2</sub>-action on a C\*-algebra *A* corresponds to an automorphism θ : *A* → *A* such that θ<sup>2</sup> = id<sub>*A*</sub>, i.e. θ represents the nontrivial element of ℤ<sub>2</sub>. For example, define θ : *F* → *F* by

$$\theta(c\_{j}^{\pm}) = -c\_{j}^{\pm} \ (j \in \mathbb{Z}),\tag{10.177}$$

which is an example of a Bogoliubov transformation (cf. Proposition 10.12) and hence extends to an automorphism of *F* (which implies that θ(1*F*) = 1*F*). Clearly, θ<sup>2</sup> = id*F*, and in addition each local Hamiltonian (10.111) is invariant under θ; by implication, so is the dynamics α, i.e., α*<sup>t</sup>* ◦ θ = θ ◦α*<sup>t</sup>* for all *t* ∈ R.

A C\*-algebra *A* carrying a Z2-action decomposes as

$$A = A\_{+} \oplus A\_{-} ; \tag{10.178}$$

$$A\_{\pm} = \{ a \in A \mid \theta(a) = \pm a \},\tag{10.179}$$

where the *even* part *A*<sub>+</sub> is a subalgebra of *A*, whereas the *odd* part *A*<sub>−</sub> is not: one has *ab* ∈ *A*<sub>+</sub> for *a*, *b* both in either *A*<sub>+</sub> or *A*<sub>−</sub>, and *ab* ∈ *A*<sub>−</sub> if one is in *A*<sub>+</sub> and the other in *A*<sub>−</sub>. For example, if *A* = *B*(*H*) for some Hilbert space *H* and *w* : *H* → *H* is a unitary operator satisfying *w*<sup>2</sup> = 1 (and hence *w*<sup>∗</sup> = *w*), then

$$\theta(a) = w a w^\* \ (= w a w) \tag{10.180}$$

defines a Z2-action on *A*. In that case, *A*<sup>+</sup> and *A*<sup>−</sup> consist of all *a* ∈ *A* that commute and anticommute with *w*, respectively, that is,

$$A\_{\pm} = \{ a \in A \mid aw \mp wa = 0 \}. \tag{10.181}$$

In case of (10.165) with (10.177), the subspace *F*<sub>+</sub> (*F*<sub>−</sub>) is just the linear span of all products of an even (odd) number of *c*<sub>*j*</sub><sup>±</sup>'s.

Let us move to Theorem C.90 and reconsider the proof of the claim that if π<sub>ω</sub>(*A*)′ ≠ ℂ · 1, then ω is mixed. If the commutant π<sub>ω</sub>(*A*)′ is nontrivial, then it contains a nontrivial projection *e*<sub>+</sub> ∈ π<sub>ω</sub>(*A*)′. It then follows that *e*<sub>+</sub>Ω<sub>ω</sub> ≠ 0: for if *e*<sub>+</sub>Ω<sub>ω</sub> = 0, then *ae*<sub>+</sub>Ω<sub>ω</sub> = *e*<sub>+</sub>*a*Ω<sub>ω</sub> = 0 for all *a* ∈ *A*, so that *e*<sub>+</sub> = 0, since π<sub>ω</sub> is cyclic. Similarly, *e*<sub>−</sub>Ω<sub>ω</sub> ≠ 0 with *e*<sub>−</sub> = 1<sub>*H*</sub> − *e*<sub>+</sub>, so we may define the unit vectors

$$\Omega\_{\pm} = e\_{\pm} \Omega\_{\omega} / \| e\_{\pm} \Omega\_{\omega} \|, \tag{10.182}$$

and the associated states ω<sub>±</sub>(*a*) = ⟨Ω<sub>±</sub>, π<sub>ω</sub>(*a*)Ω<sub>±</sub>⟩ on *A*. This yields a convex decomposition ω = λω<sub>+</sub> + (1−λ)ω<sub>−</sub>, with λ = ‖*e*<sub>+</sub>Ω<sub>ω</sub>‖<sup>2</sup>. Since λ ≠ 0, 1 and ω<sub>+</sub> ≠ ω<sub>−</sub>, it follows that ω is mixed. The associated reduction is effected by writing

$$H = H\_+ \oplus H\_-;\tag{10.183}$$

$$H\_{\pm} = e\_{\pm}H,\tag{10.184}$$

in that *A* (more precisely, πω(*A*)) maps each subspace *H*<sup>±</sup> into itself. Now pass from the projections *e*<sup>±</sup> to the operator *w* = *e*<sup>+</sup> −*e*−, which by construction satisfies

$$
w^\* = w^{-1} = w.\tag{10.185}
$$

In particular, *w* is unitary. Conversely, if some unitary *w* satisfies *w*<sup>2</sup> = 1*H*, then

$$e\_{\pm} = \frac{1}{2} (1\_H \pm w) \tag{10.186}$$

are projections satisfying *e*<sub>+</sub> + *e*<sub>−</sub> = 1<sub>*H*</sub>, giving rise to the decomposition (10.184). Group-theoretically, this means that one has a unitary ℤ<sub>2</sub>-action on *H* ≡ *H*<sub>ω</sub>, in which the nontrivial element of ℤ<sub>2</sub> = {−1, 1} is represented by *w*. The decomposition (10.184) then simply means that ℤ<sub>2</sub> acts trivially on *H*<sub>+</sub> (in that both group elements are represented by the unit operator) and acts nontrivially on *H*<sub>−</sub> (in that the nontrivial element is represented by *minus* the unit operator). In conclusion, one has a ℤ<sub>2</sub> perspective on the reduction of *H*<sub>ω</sub>, and instead of a projection *e* ∈ π<sub>ω</sub>(*A*)′ one may equivalently look for an operator *w* ∈ π<sub>ω</sub>(*A*)′ that satisfies (10.185).
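The interplay between the grading (10.178) - (10.181) and the projections (10.186) can be made concrete for matrices. The following sketch assumes NumPy; the function names `grading` and `even_odd_parts` are hypothetical. Given a self-adjoint unitary *w*, it splits a matrix into its even and odd parts under θ(*a*) = *waw*<sup>∗</sup> and checks that the even part preserves the eigenspaces of *w*, whereas the odd part interchanges them.

```python
import numpy as np

def grading(w):
    """Return (theta, e_plus, e_minus) for a self-adjoint unitary w."""
    n = w.shape[0]
    if not (np.allclose(w @ w, np.eye(n)) and np.allclose(w, w.conj().T)):
        raise ValueError("w must satisfy w* = w^{-1} = w")
    def theta(a):
        return w @ a @ w.conj().T
    e_plus = 0.5 * (np.eye(n) + w)    # projection onto H_+, cf. (10.186)
    e_minus = 0.5 * (np.eye(n) - w)   # projection onto H_-
    return theta, e_plus, e_minus

def even_odd_parts(a, theta):
    """Decompose a = a_plus + a_minus with theta(a_pm) = pm a_pm."""
    return 0.5 * (a + theta(a)), 0.5 * (a - theta(a))

w = np.diag([1.0, -1.0])              # sigma_3
a = np.array([[1.0, 2.0], [3.0, 4.0]])
theta, e_plus, e_minus = grading(w)
even, odd = even_odd_parts(a, theta)

print(even)                                       # commutes with w
print(odd)                                        # anticommutes with w
print(np.allclose(even @ w - w @ even, 0),
      np.allclose(odd @ w + w @ odd, 0),
      np.allclose(e_minus @ even @ e_plus, 0),    # even part maps H_+ into H_+
      np.allclose(e_plus @ odd @ e_plus, 0))      # odd part maps H_+ into H_-
```

With *w* = σ<sub>3</sub> the even and odd parts are the diagonal and off-diagonal parts of the matrix, respectively.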

Proposition 10.14. *Suppose A carries a* ℤ<sub>2</sub>*-action* θ *and consider a state* ω : *A* → ℂ *that is* ℤ<sub>2</sub>*-invariant in the sense that* ω(θ(*a*)) = ω(*a*) *for all a* ∈ *A. We write this as* θ<sup>∗</sup>ω = ω*, with* θ<sup>∗</sup>ω = ω ◦ θ*. Then there is a unitary operator w* : *H*<sub>ω</sub> → *H*<sub>ω</sub> *satisfying w*<sup>2</sup> = 1<sub>*H*</sub>*, w*Ω = Ω*, and w*π<sub>ω</sub>(*a*)*w*<sup>∗</sup> = π<sub>ω</sub>(θ(*a*)) *for each a* ∈ *A.*

Cf. Corollary 9.12. In this situation, we obtain a decomposition of *H* ≡ *H*<sub>ω</sub> according to (10.183), where the projections *e*<sub>±</sub> are given by (10.186), so that, equivalently,

$$H\_{\pm} = \{ \psi \in H \mid w \psi = \pm \psi \} = (A\_{\pm} \Omega)^{-}. \tag{10.187}$$

In terms of the decomposition (10.178), it is easily seen that each subspace *H*<sub>±</sub> is stable under *A*<sub>+</sub>, whereas *A*<sub>−</sub> maps *H*<sub>±</sub> into *H*<sub>∓</sub>. We denote the restriction of π<sub>ω</sub>(*A*<sub>+</sub>) to *H*<sub>±</sub> by π<sub>±</sub>, so that a ℤ<sub>2</sub>-invariant state ω on *A* not just gives rise to the GNS-representation π<sub>ω</sub> of *A* on *H*<sub>ω</sub>, but also induces two representations π<sub>±</sub> of the even part *A*<sub>+</sub> on *H*<sub>±</sub>. This leads to a refinement of Theorem C.90:

Theorem 10.15. *Suppose A carries a* ℤ<sub>2</sub>*-action* θ*, and let* ω : *A* → ℂ *be a* ℤ<sub>2</sub>*-invariant state. With the above notation, suppose the representation* π<sub>+</sub>(*A*<sub>+</sub>) *on H*<sub>+</sub> *is irreducible. Then also the representation* π<sub>−</sub>(*A*<sub>+</sub>) *on H*<sub>−</sub> *is irreducible, and there are the following two possibilities for the representation* π<sub>ω</sub>(*A*) *on H* = *H*<sub>+</sub> ⊕ *H*<sub>−</sub>*:*

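*1.* π<sub>ω</sub>(*A*) *is irreducible (and* ω *is pure) iff* π<sub>+</sub>(*A*<sub>+</sub>) *and* π<sub>−</sub>(*A*<sub>+</sub>) *are inequivalent.*

*2.* π<sub>ω</sub>(*A*) *is reducible (and* ω *is mixed) iff* π<sub>+</sub>(*A*<sub>+</sub>) *and* π<sub>−</sub>(*A*<sub>+</sub>) *are equivalent.*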

*Proof.* The proof of this theorem is much more difficult than one would expect (given its simple statement), so we restrict ourselves to the easy steps, as well as to two examples illustrating each of the two possibilities. To start with the latter:

1. *A* = *M*2(C), with θ(*a*) = σ3*a*σ3; note that σ<sup>2</sup> <sup>3</sup> = 1 and σ<sup>∗</sup> <sup>3</sup> = σ3. Then

$$A\_+ = \left\{ \begin{pmatrix} z\_+ & 0 \\ 0 & z\_- \end{pmatrix}, z\_\pm \in \mathbb{C} \right\} \equiv D\_2(\mathbb{C});\tag{10.188}$$

$$A\_{-}=\left\{ \begin{pmatrix} 0 & z\_{1} \\ z\_{2} & 0 \end{pmatrix}, z\_{1}, z\_{2} \in \mathbb{C} \right\},\tag{10.189}$$

where *Dn*(C) denotes the C\*-algebra of diagonal *n*×*n* matrices. Take Ω = (1,0), with associated state

$$\omega(a) = \langle \Omega, a\Omega \rangle,\tag{10.190}$$

where *a* ∈ *M*2(C). It follows from §2.4 that the associated GNS-representation πω(*A*) is just (equivalent to) the defining representation of *M*2(C) on *H*<sup>ω</sup> = C<sup>2</sup>, in which the cyclic vector Ωω of the GNS-construction is Ω itself. Since σ3Ω = Ω, the state defined by (10.190) is Z2-invariant, and the unitary operator *w* in Proposition 10.14 is simply *w* = σ3. Hence the decomposition (10.183) of *H* = C<sup>2</sup> is simply C<sup>2</sup> = C ⊕ C, i.e.,

$$H\_+ = \{(z, 0), z \in \mathbb{C}\};\tag{10.191}$$

$$H\_- = \{(0, z), z \in \mathbb{C}\}.\tag{10.192}$$

Of course, we then have *H*<sup>±</sup> = *A*±Ω. Identifying *H*<sup>±</sup> ≅ C, this gives the one-dimensional representations π±(*D*2(C)) as

$$
\pi\_{\pm} \begin{pmatrix} z\_{+} & 0 \\ 0 & z\_{-} \end{pmatrix} = z\_{\pm}, \tag{10.193}
$$

which are trivially inequivalent. Hence by Theorem 10.15 the defining representation of *M*2(C) on C<sup>2</sup> is irreducible, as it should be.

2. *A* = *D*2(C), with

$$\theta(\text{diag}(z\_+, z\_-)) = \text{diag}(z\_-, z\_+), \tag{10.194}$$

where we have denoted the matrix in (10.188) by diag(*z*+,*z*−). This time,

$$A\_{\pm} = \{ \text{diag}(z, \pm z), z \in \mathbb{C} \}. \tag{10.195}$$

We once again define a Z2-invariant state ω by (10.190), but this time we take

$$
\Omega = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{10.196}
$$

Hence

$$H\_{\pm} = \{ (z, \pm z), z \in \mathbb{C} \}. \tag{10.197}$$

We may now identify each *A*<sup>±</sup> with C under the map diag(*z*,±*z*) ↦ *z* from *A*<sup>±</sup> to C. Similarly, we identify each subspace *H*<sup>±</sup> with C under the map *H*<sup>±</sup> → C defined by (*z*,±*z*) ↦ *z*. Under these identifications, we have two one-dimensional representations π<sup>±</sup> of the C\*-algebra C on the Hilbert space C, given by π±(*z*) = *z*. Clearly, these are equivalent: they are even identical. Hence by Theorem 10.15 the defining representation of *D*2(C) on C<sup>2</sup> is reducible, as it should be: the explicit decomposition of C<sup>2</sup> in *D*2(C)-invariant subspaces is just the one (10.191) - (10.192) of the previous example.
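The two examples can be checked numerically. The following is a minimal sketch in plain Python, not part of the original argument; the helper names (`theta_conj`, `even_odd`, `theta_swap`, `omega_swap`, `act`) are hypothetical and chosen only for this illustration. It splits a matrix into its even and odd parts under θ(*a*) = σ3*a*σ3, and shows that diag(*z*,*z*) acts as multiplication by *z* on both *H*+ and *H*− in the second example.

```python
import math

SIGMA3 = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def theta_conj(a):
    """Example 1: theta(a) = sigma_3 a sigma_3 on M_2(C)."""
    return matmul(matmul(SIGMA3, a), SIGMA3)

def even_odd(a):
    """Split a = a_plus + a_minus with theta(a_pm) = +/- a_pm."""
    t = theta_conj(a)
    plus = [[(a[i][j] + t[i][j]) / 2 for j in range(2)] for i in range(2)]
    minus = [[(a[i][j] - t[i][j]) / 2 for j in range(2)] for i in range(2)]
    return plus, minus

def theta_swap(d):
    """Example 2: the swap (10.194) on diag(z_plus, z_minus), stored as a pair."""
    return (d[1], d[0])

def omega_swap(d):
    """State (10.190) on D_2(C) for Omega = (1, 1)/sqrt(2), cf. (10.196)."""
    amp = 1 / math.sqrt(2)
    return amp * d[0] * amp + amp * d[1] * amp

def act(d, psi):
    """Defining action of diag(d[0], d[1]) on a vector psi in C^2."""
    return (d[0] * psi[0], d[1] * psi[1])

a = [[1 + 2j, 3], [4j, 5]]
a_plus, a_minus = even_odd(a)      # diagonal and off-diagonal parts of a

z = 2 + 1j
even = (z, z)                      # the element diag(z, z) of A_+ in example 2
on_h_plus = act(even, (1, 1))      # (1, 1) spans H_+
on_h_minus = act(even, (1, -1))    # (1, -1) spans H_-
print(a_plus, a_minus, on_h_plus, on_h_minus)
```

In the first example the even part is diagonal and the odd part off-diagonal, as in (10.188) - (10.189); in the second, both restrictions of *A*+ multiply by the same number *z*, which is the equivalence used above.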

The first-numbered claim of Theorem 10.15 is relatively easy to prove from Theorem C.90. Suppose π±(*A*+) are inequivalent and take *b* ∈ πω(*A*)′: we want to show that *b* = λ · 1 for some λ ∈ C. Relative to *H* = *H*<sup>+</sup> ⊕*H*−, we write

$$b = \begin{pmatrix} b\_{++} & b\_{+-} \\ b\_{-+} & b\_{--} \end{pmatrix},\tag{10.198}$$

where the four operators in this matrix act as follows:

$$b\_{++}: H\_+ \to H\_+, b\_{+-}: H\_- \to H\_+, b\_{-+}: H\_+ \to H\_-, b\_{--}: H\_- \to H\_-. \tag{10.199}$$

Since *A*<sup>+</sup> ⊂ *A*, we also have *b* ∈ πω(*A*+)′. The condition [*b*,*a*] = 0 for each *a* ∈ *A*<sup>+</sup> is equivalent to the four conditions

$$[b\_{++}, \pi\_+(a)] = 0;\tag{10.200}$$

$$[b\_{--}, \pi\_{-}(a)] = 0;\tag{10.201}$$

$$
\pi\_+(a)b\_{+-} = b\_{+-}\pi\_-(a);\tag{10.202}
$$

$$
\pi\_-(a)b\_{-+} = b\_{-+}\pi\_+(a). \tag{10.203}
$$

We now use the fact (which we state without proof) that, as in group theory, the irreducibility and inequivalence of π±(*A*+) implies that there can be no nonzero operator *c* : *H*<sup>+</sup> → *H*<sup>−</sup> such that *c*π+(*a*) = π−(*a*)*c* for all *a* ∈ *A*+, and vice versa. Hence *b*<sub>+−</sub> = 0 as well as *b*<sub>−+</sub> = 0. In addition, the irreducibility of π±(*A*+) implies that *b*<sub>++</sub> = λ<sub>+</sub> · 1<sub>*H*+</sub> and *b*<sub>−−</sub> = λ<sub>−</sub> · 1<sub>*H*−</sub>. Finally, the property [*b*,*a*] = 0 for each *a* ∈ *A*<sup>−</sup> implies λ<sub>+</sub> = λ<sub>−</sub>. Hence *b* = λ · 1, and πω(*A*) is irreducible.

To prove the second-numbered claim of Theorem 10.15, let π+(*A*+) ∼= π−(*A*+), so by definition (of equivalence) there is a unitary operator *v* : *H*<sup>−</sup> → *H*<sup>+</sup> such that

$$v\pi\_{-}(a) = \pi\_{+}(a)v, \quad \forall a \in A\_{+}. \tag{10.204}$$

Extend *v* to an operator *w* : *H* → *H* by

$$
w = \begin{pmatrix} 0 & v \\ v^\* & 0 \end{pmatrix}. \tag{10.205}
$$

It is easy to verify from (10.204) that [*w*,π(*a*)] = 0 for each *a* ∈ *A*+. To check that the same is true for each *a* ∈ *A*−, one needs the difficult analytical fact that *w* is a (weak) limit of operators of the kind π(*a<sub>n</sub>*), where *a<sub>n</sub>* ∈ *A*−, which also implies that *w*<sup>∗</sup>π(*a*) ∈ π(*A*+)″. Since π(*A*+)‴ = π(*A*+)′ and *w* ∈ π(*A*+)′, we obtain [*w*<sup>∗</sup>π(*a*),*w*] = 0 for each *a* ∈ *A*−. But for unitary operators *w* this is the same as [*w*,π(*a*)] = 0. So *w* ∈ π(*A*)′, and hence π(*A*) is reducible by Theorem C.90. □

In determining the ground state(s) of the quantum Ising chain, we will apply Theorem 10.15 to the C\*-algebra (10.87). This application relies on the representation theory of *F*. For the moment we leave the Hilbert space *H* general, equipped though with a conjugation *J* : *H* → *H*. It turns out to be convenient to use the *self-dual formulation of the CAR*, which treats *c* and *c*∗ on an equal footing. Define

$$K = H \oplus H,\tag{10.206}$$

whose elements are written as *h* = (*f*,*g*) or *h* = *f* ∔ *g*, with inner product

$$
\langle h\_1, h\_2 \rangle\_K = \langle f\_1, f\_2 \rangle\_H + \langle g\_1, g\_2 \rangle\_H. \tag{10.207}
$$

We then introduce a new operator in CAR(*H*), namely the *field*

$$\Phi(h) = c^\*(f) + c(Jg),\tag{10.208}$$

which is *linear* in *h* = *f* ∔ *g*, because the antilinearity of *c*(*f*) in *f* is canceled by the antilinearity of *J*. This yields the anti-commutation relations

$$[\Phi^\*(h\_1), \Phi(h\_2)]\_+ = \langle h\_1, h\_2 \rangle\_K,\tag{10.209}$$

but be aware that generally [Φ∗(*h*1),Φ∗(*h*2)]+ and [Φ(*h*1),Φ(*h*2)]+ do not vanish. Indeed, in terms of the antilinear operator Γ : *K* → *K*, defined by

$$
\Gamma = \begin{pmatrix} 0 & J \\ J & 0 \end{pmatrix} \tag{10.210}
$$

we have the following expression for the adjoint Φ(*h*)<sup>∗</sup> ≡ Φ∗(*h*):

$$
\Phi^\*(h) = \Phi(\Gamma h). \tag{10.211}
$$

If we identify *f* ∈ *H* with *f* ∔ 0 ∈ *K*, we may reconstruct *c* and *c*<sup>∗</sup> from Φ through

$$c^\*(f) = \Phi(f);\tag{10.212}$$

$$c(f) = \Phi(\Gamma f). \tag{10.213}$$
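The relations (10.209) and (10.211) can be tested on the smallest possible case. The sketch below assumes a single mode, *H* = C with *J* taken to be complex conjugation and *c* realised as a 2×2 matrix; these choices, and all function names, are assumptions made for illustration only. The inner product is taken antilinear in its first entry.

```python
C = [[0, 1], [0, 0]]        # annihilation operator c on span{|0>, |1>}
C_STAR = [[0, 0], [1, 0]]   # creation operator c*

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(z, a):
    return [[z * a[i][j] for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def phi(h):
    """Field (10.208) for one mode: Phi(f, g) = c*(f) + c(Jg) = f c* + g c."""
    f, g = h
    return add(scale(f, C_STAR), scale(g, C))

def gamma(h):
    """Antilinear map (10.210): (f, g) -> (Jg, Jf), with J = conjugation."""
    f, g = h
    return (complex(g).conjugate(), complex(f).conjugate())

def inner(h1, h2):
    """Inner product (10.207), antilinear in the first entry."""
    return sum(complex(x).conjugate() * y for x, y in zip(h1, h2))

def anticommutator(a, b):
    return add(matmul(a, b), matmul(b, a))

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

h1, h2 = (1 + 2j, 3 - 1j), (0.5j, 2)
identity = [[1, 0], [0, 1]]
# (10.211): Phi(h)* = Phi(Gamma h)
adjoint_defect = max_abs_diff(dagger(phi(h1)), phi(gamma(h1)))
# (10.209): [Phi*(h1), Phi(h2)]_+ = <h1, h2> times the unit
car_defect = max_abs_diff(anticommutator(dagger(phi(h1)), phi(h2)),
                          scale(inner(h1, h2), identity))
print(adjoint_defect, car_defect)
```

Both defects vanish up to rounding, which is all this toy model is meant to confirm; it says nothing about the infinite-dimensional case.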

Bogoliubov transformations now take an extremely elegant form. For any unitary operator *S* on *K* that satisfies [*S*,Γ ] = 0, we define the transform Φ*<sup>S</sup>* of Φ by

$$\Phi\_{S}(h) = \Phi(Sh),\tag{10.214}$$

with associated creation- and annihilation operators (where *H* ∋ *f* ≡ *f* ∔ 0, as above)

$$c^\*\_{S}(f) = \Phi\_{S}(f);\tag{10.215}$$

$$c\_{S}(f) = \Phi\_{S}^\*(f). \tag{10.216}$$

To see the equivalence with the original formulation of the Bogoliubov transformation, note that for unitary *S*, the condition [*S*,Γ ] = 0 is equivalent to the structure

$$S = \begin{pmatrix} u & vJ \\ Jv & JuJ \end{pmatrix},\tag{10.217}$$

where *u* : *H* → *H* is linear, *v* : *H* → *H* is antilinear, and *u* and *v* satisfy (10.133) - (10.134). Moreover, from (10.137) - (10.138) we obtain

$$c\_S(f) = \eta(f);\tag{10.218}$$

$$c\_S^\*(f) = \eta^\*(f). \tag{10.219}$$

An interesting class of pure states on CAR(*H*) arises as follows.

Theorem 10.16. *There is a bijective correspondence between:*

• *Projections e* : *K* → *K that (apart from the properties e*<sup>2</sup> = *e*<sup>∗</sup> = *e) satisfy*

$$
\Gamma e\Gamma = \mathbf{1}\_K - e;\tag{10.220}
$$

• *States* ω*<sup>e</sup> on F that satisfy*

$$\omega\_e(\Phi(h)^\*\Phi(h)) = \langle h, eh \rangle \quad \forall h \in K. \tag{10.221}$$

*Such a state* ω*<sup>e</sup> is automatically pure (so that the corresponding* GNS*-representation* π*<sup>e</sup> is irreducible), and is explicitly given by*

$$
\omega\_{e}(\Phi(h\_1)\cdots\Phi(h\_{2n+1})) = 0;\tag{10.222}
$$

$$\omega\_{e}(\Phi(h\_1)\cdots\Phi(h\_{2n})) = \sum\_{p \in \mathfrak{S}\_{2n}}' \text{sgn}(p) \prod\_{j=1}^{n} \langle eh\_{p(2j)}, \Gamma h\_{p(2j-1)} \rangle, \tag{10.223}$$

*where the sum* Σ′ *is over all permutations p of* 1,...,2*n such that*

$$p(2j-1) < p(2j);\tag{10.224}$$

$$p(1) < p(3) < \dots < p(2n-1). \tag{10.225}$$

We omit the proof. Note that (10.221) is a special case of (10.223), because of (10.211). States like ω<sub>*e*</sub>, which are determined by their two-point functions, are called *quasi-free*; the ground state ω<sub>0</sub> on CAR(C<sup>*N*</sup>) constructed in the previous section is an example (one also has mixed quasi-free states, e.g. certain KMS states).
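The restricted sum in (10.223) runs over the ways of pairing up the 2*n* fields. As an illustration only, the following sketch enumerates the permutations allowed by (10.224) - (10.225) together with their signs; the function names are hypothetical, and a permutation *p* is stored as the tuple (*p*(1),...,*p*(2*n*)).

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation, computed from its number of inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
                     if p[i] > p[j])
    return -1 if inversions % 2 else 1

def restricted_permutations(n):
    """Permutations of 1..2n obeying (10.224) and (10.225), with their signs."""
    result = []
    for p in permutations(range(1, 2 * n + 1)):
        # (10.224): p(2j-1) < p(2j); the tuple p is indexed from 0
        pairs_ordered = all(p[2 * j] < p[2 * j + 1] for j in range(n))
        # (10.225): p(1) < p(3) < ... < p(2n-1)
        firsts_ordered = all(p[2 * j] < p[2 * j + 2] for j in range(n - 1))
        if pairs_ordered and firsts_ordered:
            result.append((p, sign(p)))
    return result

for p, s in restricted_permutations(2):
    print(p, s)
```

For *n* = 1 only the identity survives, which is how (10.223) reduces to a two-point function; for *n* = 2 there are three terms, and in general (2*n*)!/(2<sup>*n*</sup>*n*!) of them.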

As a warm-up, we reconstruct the ground state of the free fermionic Hamiltonian on *F* using the above formalism. That is, we assume that *hN* in (10.111) reads

$$h\_N = \sum\_{x=-N/2}^{N/2-1} \varepsilon\_{x} c\_{x}^\* c\_{x},\tag{10.226}$$

initially defining dynamics on *F<sub>N</sub>* = CAR(C<sup>*N*</sup>). In that case, the projection *e*<sub>0</sub> onto the second copy of *H* = C<sup>*N*</sup> in *K*, i.e.

$$e\_0 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix},\tag{10.227}$$

reproduces the ground state ω<sub>0</sub>(*a*) = ⟨0|*a*|0⟩, where |0⟩ is the vector 1 ∈ C in *F*<sub>−</sub>(*H*), such that *c*(*f*)|0⟩ = 0 for all *f* ∈ *H*. This also works for *N* = ∞, i.e., we construct dynamics on CAR(ℓ<sup>2</sup>(Z)) from the local Hamiltonians (10.226) as indicated at the beginning of this section, and use the same formula for *e*<sub>0</sub>, this time with *H* = ℓ<sup>2</sup>(Z).

In the more general case (10.111), we replace *e*<sup>0</sup> in (10.227) by

$$e\_0^{(S)} = S e\_0 S^{-1},\tag{10.228}$$

where *S* is given by (10.217), in which for *N* < ∞ the operators *u* and *v* were constructed in (10.131) - (10.132). This time, the associated state $\omega\_{e\_0^{(S)}} \equiv \omega\_S$ is the state called ω<sub>0</sub> in Theorem 10.13. As explained at the beginning of this section, this procedure even works for *N* = ∞ and hence *H* = ℓ<sup>2</sup>(Z).
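That both projections meet the condition (10.220) of Theorem 10.16 can be verified directly. The computation below is a sketch using only (10.210), (10.227), (10.228), the property *J*<sup>2</sup> = 1 of a conjugation, and [*S*,Γ] = 0 (which also gives [*S*<sup>−1</sup>,Γ] = 0):

$$
\begin{aligned}
\Gamma e\_0 \Gamma (f \dotplus g) &= \Gamma e\_0 (Jg \dotplus Jf) = \Gamma (0 \dotplus Jf) = J^2 f \dotplus 0 = (1\_K - e\_0)(f \dotplus g); \\
\Gamma e\_0^{(S)} \Gamma &= \Gamma S e\_0 S^{-1} \Gamma = S\, \Gamma e\_0 \Gamma\, S^{-1} = S (1\_K - e\_0) S^{-1} = 1\_K - e\_0^{(S)}.
\end{aligned}
$$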

Having understood fermionic models with quadratic Hamiltonians, what remains to be done now is to reformulate the original quantum Ising chain, defined in terms of the local spin matrices σ<sub>*i*</sub>(*x*), in terms of the fermionic variables *c<sub>x</sub>* and *c<sub>x</sub>*<sup>∗</sup>. For finite *N* this was done through the Jordan–Wigner transformation (10.102) - (10.103). This time we need a similar isomorphism between *A* and *F*, where

$$A = \otimes\_{j \in \mathbb{Z}} M\_2(\mathbb{C});\tag{10.229}$$

$$F = \mathbf{CAR}(\ell^2(\mathbb{Z})),\tag{10.230}$$

and hence we would need to start the sums in the right-hand side of (10.102) - (10.103) at *j* = −∞. At first sight this appears to be impossible, though, because operators like exp(π*i* ∑<sub>*y*=−∞</sub><sup>*x*−1</sup> σ+(*y*)σ−(*y*)) do not lie in *A* (whose elements have infinite tails of 2×2 unit matrices). Fortunately, this problem can be solved by adding a formal operator *T* to *A*, which plays the role of the "tail"

$$T = e^{\pi i \sum\_{y=-\infty}^{0} \sigma\_{+}(y)\sigma\_{-}(y)}. \tag{10.231}$$

This formal expression (to be used only heuristically) suggests the relations:

$$T^2 = 1;\tag{10.232}$$

$$T^\* = T;\tag{10.233}$$

$$T a T = \theta\_{-}(a),\tag{10.234}$$

where θ<sup>−</sup> : *A* → *A* is a Z2-action defined by (algebraic) extension of

$$\theta\_{-}(\sigma\_{\pm}(y)) = -\sigma\_{\pm}(y)\ (y \le 0);\tag{10.235}$$

$$\theta\_{-}(\sigma\_{\pm}(y)) = \sigma\_{\pm}(y)\ (y > 0);\tag{10.236}$$

$$\theta\_{-}(\sigma\_{3}(y)) = \sigma\_{3}(y)\ (y \in \mathbb{Z});\tag{10.237}$$

$$\theta\_{-}(\sigma\_{0}(y)) = \sigma\_{0}(y)\ (y \in \mathbb{Z}),\tag{10.238}$$

where σ<sub>0</sub> = 1<sub>2</sub>. Formally, define an algebra extension

$$
\hat{A} = A \oplus A \cdot T,\tag{10.239}
$$

with elements of the type *a*+*bT*, *a*,*b* ∈ *A*, and algebraic relations given by (10.232) - (10.234). That is, we have

$$(a+bT)^{\*} = a^{\*} + \theta\_{-}(b^{\*})T;\tag{10.240}$$

$$(a+bT)\cdot(a'+b'T) = aa'+b\theta\_-(b')+(ab'+b\theta\_-(a'))T.\qquad(10.241)$$
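The rules (10.240) - (10.241) follow from (10.232) - (10.234) alone, and this can be tested in a finite toy model. The sketch below is an assumption-laden illustration: it takes *A* = *M*2(C), the action *a* ↦ σ3*a*σ3 in the role of θ<sub>−</sub>, and realises *T* as σ3 itself (so that, unlike in the infinite chain, *T* already lies in *A*). Elements *a* + *bT* are stored as pairs, and all function names are hypothetical.

```python
SIGMA3 = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def theta(a):
    """Toy stand-in for theta_-: conjugation by sigma_3, so that T a T = theta(a)."""
    return matmul(matmul(SIGMA3, a), SIGMA3)

def star(x):
    """Adjoint of a + bT according to (10.240)."""
    a, b = x
    return (dagger(a), theta(dagger(b)))

def product(x, y):
    """Product of a + bT and a' + b'T according to (10.241)."""
    a, b = x
    a2, b2 = y
    return (add(matmul(a, a2), matmul(b, theta(b2))),
            add(matmul(a, b2), matmul(b, theta(a2))))

def realise(x):
    """Map the pair (a, b) to the matrix a + b sigma_3 of the toy model."""
    a, b = x
    return add(a, matmul(b, SIGMA3))

x = ([[1, 2j], [0, 3]], [[1j, 1], [2, -1]])
y = ([[0, 1], [1, 1j]], [[2, 0], [1j, 1]])
product_ok = realise(product(x, y)) == matmul(realise(x), realise(y))
star_ok = realise(star(x)) == dagger(realise(x))
print(product_ok, star_ok)
```

Both comparisons use exact arithmetic on Gaussian integers. The agreement shows that the formal rules are the ones an honest operator *T* with *T*<sup>2</sup> = 1, *T*<sup>∗</sup> = *T* and *TaT* = θ<sub>−</sub>(*a*) would produce.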

Within *Â*, the correct version of (10.102) - (10.103) may now be written down as

$$c\_{x}^{\pm} = T e^{\mp \pi i \sum\_{y=x}^{0} \sigma\_{+}(y) \sigma\_{-}(y)} \sigma\_{x}^{\pm} \ (x < 1);\tag{10.242}$$

$$c\_1^\pm = T\sigma\_1^\pm;\tag{10.243}$$

$$c\_{x}^{\pm} = T e^{\mp \pi i \sum\_{y=1}^{x-1} \sigma\_{+}(y) \sigma\_{-}(y)} \sigma\_{x}^{\pm} \ (x > 1),\tag{10.244}$$

with formal inverse transformation given by

$$\sigma\_{\pm}(x) = T e^{\pm \pi i \sum\_{y=x}^{0} c\_{y}^{+} c\_{y}^{-}} c\_{x}^{\pm} \ (x < 1);\tag{10.245}$$

$$
\sigma\_{\pm}(1) = T c\_1^{\pm};\tag{10.246}
$$

$$\sigma\_{\pm}(x) = T e^{\pm \pi i \sum\_{y=1}^{x-1} c\_{y}^{+} c\_{y}^{-}} c\_{x}^{\pm} \ (x > 1),\tag{10.247}$$

where this time we regard *T* as an element of the extended fermionic algebra

$$
\hat{F} = F \oplus F \cdot T,\tag{10.248}
$$

satisfying the same rules (10.232) - (10.234), but now in terms of a "fermionic" Z2-action θ<sub>−</sub> : *F* → *F* given by extending the following action on elementary operators:

$$\theta\_{-}(c\_{y}^{\pm}) = -c\_{y}^{\pm} \ (y \le 0);\tag{10.249}$$

$$
\theta\_{-}(c\_{y}^{\pm}) = c\_{y}^{\pm} \ (y > 0).\tag{10.250}
$$

(10.251)
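For a finite chain no tail operator is needed, and the Jordan–Wigner construction can be tested by brute force. The following sketch is a finite-*N* analogue only: the exact form of (10.102) - (10.103) is not reproduced here, so the conventions (sites labelled 0,...,*N*−1, σ+σ− as number operator, a parity string on all sites to the left) are assumptions, as are the function names.

```python
from functools import reduce

I2 = [[1, 0], [0, 1]]
SP = [[0, 1], [0, 0]]     # sigma_+
SM = [[0, 0], [1, 0]]     # sigma_-
PAR = [[-1, 0], [0, 1]]   # exp(pi i sigma_+ sigma_-) on a single site

def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def fermion(x, op, n):
    """Parity string on sites 0..x-1, then op at site x, identities afterwards."""
    return reduce(kron, [PAR] * x + [op] + [I2] * (n - x - 1))

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] + ba[i][j] for j in range(n)] for i in range(n)]

def check_car(n):
    """True iff the operators built by fermion() satisfy the CAR on n sites."""
    dim = 2 ** n
    one = [[int(i == j) for j in range(dim)] for i in range(dim)]
    zero = [[0] * dim for _ in range(dim)]
    for x in range(n):
        for y in range(n):
            c_x = fermion(x, SM, n)
            c_y = fermion(y, SM, n)
            c_y_star = fermion(y, SP, n)
            if anticommutator(c_x, c_y_star) != (one if x == y else zero):
                return False
            if anticommutator(c_x, c_y) != zero:
                return False
    return True

print(check_car(3))
```

Without the parity strings the operators on different sites would commute instead of anticommute; the strings are what the formal operator *T* extends to the left-infinite chain.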

Because of *T*, the Jordan–Wigner transformation does not give an isomorphism *A* ≅ *F*, but it does give an isomorphism *Â* ≅ *F̂*. More importantly, if, having already defined the Z2-action θ on *F* by (10.177), we define a similar Z2-action on *A* by

$$\theta(\sigma\_{\pm}(y)) = -\sigma\_{\pm}(y) \ (y \in \mathbb{Z});\tag{10.252}$$

$$\theta(\sigma\_3(y)) = \sigma\_3(y) \ (y \in \mathbb{Z});\tag{10.253}$$

$$\theta(\sigma\_0(y)) = \sigma\_0(y) \ (y \in \mathbb{Z}),\tag{10.254}$$

and decompose *A* = *A*<sup>+</sup> ⊕*A*<sup>−</sup> and *F* = *F*<sup>+</sup> ⊕*F*−, according to this action, cf. (10.178), we have isomorphisms

$$A\_{+} \cong F\_{+};\tag{10.255}$$

$$A\_{-} \cong F\_{-}T;\tag{10.256}$$

$$A \cong F\_+ \oplus F\_- T. \tag{10.257}$$

For given dynamics (10.111), suppose $\omega\_0^A$ is a Z2*-invariant* ground state on *A*. Then $\omega\_0^A$ also defines a Z2-invariant ground state $\omega\_0^F$ on *F* by (10.255) and $\omega\_0^F(f) = 0$ for all *f* ∈ *F*−. Conversely, a Z2-invariant ground state $\omega\_0^F$ on *F* defines a state $\omega\_0^A$ on *A* by (10.255) and $\omega\_0^A(a) = 0$ for all *a* ∈ *A*−. But *F* has a unique ground state, so:


Theorem 10.15 gives a representation-theoretical criterion deciding between these possibilities, but to apply it we need some information on the restriction of Z2-invariant quasi-free pure states on *F* to its even part *F*+. The abstract setting involves a Z2-action *W* on *K* that commutes with Γ (so that *W* is unitary, *W*<sup>2</sup> = 1, and [Γ,*W*] = 0), which induces a Z2-action θ on *F* by linear and algebraic extension of θ(Φ(*h*)) = Φ(*Wh*). A quasi-free state ω<sub>*e*</sub>, defined according to Theorem 10.16 by a projection *e* : *K* → *K* that satisfies (10.220), is then Z2-invariant iff [*W*,*e*] = 0.

In our case, this simplifies to θ(Φ(*h*)) = −Φ(*h*), so that *W* = −1, and every projection commutes with *W*. In any case, with considerable effort one can prove:

Lemma 10.17. *Given some* Z2*-action W on K, as well as a projection e* : *K* → *K satisfying* (10.220)*, such that* [*W*,Γ ]=[*W*, *e*] = 0*:*


Theorem 10.15 then leads to a lemma, which also summarizes the discussion so far.

Lemma 10.18. *1. For given* Z2*-invariant dynamics, let* $\omega\_0^F$ *be the (unique,* Z2*-invariant) ground state on F* = *F*<sub>+</sub> ⊕ *F*<sub>−</sub>*. Under F*<sub>+</sub> ⊂ *F the associated* GNS*-representation space H*<sub>0</sub> *decomposes as* $H\_0 = H\_0^+ \oplus H\_0^-$*, with* $H\_0^\pm = F\_\pm\Omega\_0$*, and we denote the restriction of* π<sub>0</sub>(*F*<sub>+</sub>) *to* $H\_0^\pm$ *by* $\pi\_0^\pm$*. Then* $\pi\_0^\pm(F\_+)$ *are irreducible.*

	- *a. Then* $\omega\_0^A$ *is a ground state on A. Any* Z2*-invariant ground state on A arises in this way (via F), so that there is a unique* Z2*-invariant ground state on A.*
	- *b. The state* $\omega\_0^A$ *is pure on A iff the irreducible representations* $\pi^T\_+(F\_+)$ *(or* $\pi\_0^+(F\_+)$*) and* $\pi^T\_-(F\_+)$ *are inequivalent.*

It turns out to be difficult to directly check the (in)equivalence of $\pi^T\_\pm(F\_+)$. Fortunately, we can circumvent this problem by passing to yet another (irreducible) representation of *F*+. We first enlarge *F* to a new algebra

$$
\hat{F} = F \oplus FT = F\_+ \oplus F\_- \oplus F\_+ T \oplus F\_- T,\tag{10.258}
$$

and extend the state $\omega\_0^F$ on *F* to a state $\hat{\omega}\_0$ on $\hat{F}$ by putting $\hat{\omega}\_0(FT) = 0$, so that $\hat{\omega}\_0$ is nonzero only on $F\_+ \subset \hat{F}$. Let $\hat{\pi}\_0$ be the associated GNS-representation of $\hat{F}$ on the Hilbert space $\hat{H}\_0 = \overline{\hat{F}\hat{\Omega}\_0}$. Under $\hat{\pi}(F\_+)$ this space decomposes as

$$
\hat{H}\_0 = \overline{F\_+\hat{\Omega}\_0} \oplus \overline{F\_-\hat{\Omega}\_0} \oplus \overline{F\_+T\hat{\Omega}\_0} \oplus \overline{F\_-T\hat{\Omega}\_0},\tag{10.259}
$$

with corresponding restrictions $\hat{\pi}\_\pm(F\_+)$ and $\hat{\pi}^T\_\pm(F\_+)$; more precisely, $\hat{\pi}\_\pm$ is the restriction of $\hat{\pi}(F\_+)$ to $\overline{F\_\pm\hat{\Omega}\_0}$, whilst $\hat{\pi}^T\_\pm$ is the restriction of $\hat{\pi}(F\_+)$ to $\overline{F\_\pm T\hat{\Omega}\_0}$. Clearly, $\hat{\pi}\_\pm(F\_+)$ is the same as $\pi\_0^\pm(F\_+)$, and $\hat{\pi}^T\_-(F\_+)$ is just our earlier $\pi^T\_-(F\_+)$, but $\hat{\pi}^T\_+(F\_+)$ is new. To understand the latter, we rewrite (10.259) as

$$
\hat{H}\_0 = H\_0 \oplus \hat{H}\_0^T;\tag{10.260}
$$

$$H\_0 = \overline{F\_+\hat{\Omega}\_0} \oplus \overline{F\_-\hat{\Omega}\_0} \cong \overline{F\_+\Omega\_0} \oplus \overline{F\_-\Omega\_0};\tag{10.261}$$

$$
\hat{H}\_0^T = \overline{F\_+ T \hat{\Omega}\_0} \oplus \overline{F\_- T \hat{\Omega}\_0},\tag{10.262}
$$

the point being that $\hat{\pi}(F)$ evidently restricts to both $H\_0$ and $\hat{H}\_0^T$. We know the action of $\hat{\pi}(F)$ on $H\_0$ quite well: it is the representation induced by the ground state ω<sub>0</sub>. As to $\hat{H}\_0^T$, we define a state $\hat{\omega}\_0^T$ on *F* by

$$
\hat{\omega}\_0^T(a) = \langle \hat{\pi}(T)\hat{\Omega}\_0, \hat{\pi}(a)\hat{\pi}(T)\hat{\Omega}\_0 \rangle\_{\hat{H}\_0} = \langle \hat{\Omega}\_0, \hat{\pi}(\theta\_-(a))\hat{\Omega}\_0 \rangle\_{\hat{H}\_0},\tag{10.263}
$$

where the second equality follows from (10.234). Comparing $H\_0$ and $\hat{H}\_0$, for all *b* ∈ *F* (and hence especially for *b* = θ<sub>−</sub>(*a*)) we simply have

$$
\langle \hat{\Omega}\_0, \hat{\pi}(b)\hat{\Omega}\_0 \rangle\_{\hat{H}\_0} = \hat{\omega}\_0(b) = \omega\_0^F(b), \tag{10.264}
$$

so that $\hat{\omega}\_0^T = \omega\_0^F \circ \theta\_- \equiv \theta\_-^\*\omega\_0^F$. Therefore, the representation $\hat{\pi}(F)$ restricted to $\hat{H}\_0^T$ is the GNS-representation $\pi\_{\theta\_-^\*\omega\_0^F}(F)$. Decomposing its representation space as $H\_{\theta\_-^\*\omega\_0^F} = H^+\_{\theta\_-^\*\omega\_0^F} \oplus H^-\_{\theta\_-^\*\omega\_0^F}$, it follows that $\hat{\pi}^T\_+(F\_+)$ is the restriction of $\pi\_{\theta\_-^\*\omega\_0^F}(F\_+)$ to $H^+\_{\theta\_-^\*\omega\_0^F}$. Hence, further to (10.260) - (10.262), we obtain the decomposition

$$
\hat{\pi}(F) \cong \pi\_{\omega\_0^F}(F) \oplus \pi\_{\theta\_-^\* \omega\_0^F}(F). \tag{10.265}
$$

The point is that for the quantum Ising chain Hamiltonian (10.110), we have:


The first claim follows from Theorem 10.20 below. The third follows from Lemma 10.18 and the previous claims. The second claim is proved by repeatedly applying Theorem 10.15 to $\hat{\pi}(\hat{F})$. Given this lemma, the real issue now lies in comparing $\pi\_{\omega\_0^F}$ and $\pi\_{\theta\_-^\*\omega\_0^F}$, both as representations of *F* (as they are defined) and as representations of *F*<sub>+</sub> ⊂ *F*. This can be settled in great generality by first looking at Theorem 10.16, and thence, recalling the positive-energy projection (10.228), realizing that

$$
\pi\_{\omega\_0^F} = \pi\_{e\_0^{(S)}}; \tag{10.266}
$$

$$
\pi\_{\theta\_-^\* \omega\_0^F} = \pi\_{W\_- e\_0^{(S)} W\_-}.\tag{10.267}
$$

Here *W*<sub>−</sub> : *K* → *K* is the Z2-action on *K* defining the Z2-action θ<sub>−</sub> on *F* as explained above Lemma 10.17; specifically, *W*<sub>−</sub> is the direct sum of two copies of *w*<sub>−</sub> : ℓ<sup>2</sup>(Z) → ℓ<sup>2</sup>(Z), defined by *w*<sub>−</sub>(*f<sub>j</sub>*) = *f<sub>j</sub>* (*j* > 0) and *w*<sub>−</sub>(*f<sub>j</sub>*) = −*f<sub>j</sub>* (*j* ≤ 0).

Subsequently, without proof we invoke a basic result on the CAR-algebra:

Theorem 10.20. *Let e and e*′ *be projections on K that satisfy* (10.220)*. Then:*

*1.* π<sub>*e*</sub>(*F*) ≅ π<sub>*e*′</sub>(*F*) *iff e* − *e*′ ∈ *B*<sub>2</sub>(*K*)*;*

*2.* π<sup>+</sup><sub>*e*</sub>(*F*<sub>+</sub>) ≅ π<sup>+</sup><sub>*e*′</sub>(*F*<sub>+</sub>) *iff e* − *e*′ ∈ *B*<sub>2</sub>(*K*) *and* dim(*eK* ∩ (1 − *e*′)*K*) *is even.*

If the first condition is satisfied, the dimension in the second part is finite, so that one may indeed say it is even or odd. From Lemmas 10.18 and 10.19 and Theorem 10.20, we finally obtain the phase structure of the infinite quantum Ising chain:

Theorem 10.21. *The unique* Z2*-invariant ground state* ω<sup>0</sup> *of the Hamiltonian* (10.110) *is pure (and hence forms the unique ground state) iff both of the following hold:*

$$e\_0^{(S)} - W\_- e\_0^{(S)} W\_- \in \mathcal{B}\_2(K);\tag{10.268}$$

$$\dim(e\_0^{(S)}K \cap (1 - W\_- e\_0^{(S)}W\_-)K) \text{ is even.}\tag{10.269}$$

*This is true for all* λ *with* |λ| ≥ 1*. If* |λ| < 1*, then* $\omega\_0 = \tfrac{1}{2}(\omega\_0^+ + \omega\_0^-)$*, where* $\omega\_0^\pm$ *are pure and transform under the* Z2*-action* θ *as* $\omega\_0^\pm \circ \theta = \omega\_0^\mp$*.*

#### 10.8 Spontaneous symmetry breaking in mean-field theories

We are now going to study SSB in so-called *mean-field theories*: these are quantum spin systems with Hamiltonians like the *Curie–Weiss-model* for ferromagnetism:

$$h\_{\Lambda}^{\rm CW} = -\frac{J}{2|\Lambda|} \sum\_{\mathbf{x}, \mathbf{y} \in \Lambda} \sigma\_3(\mathbf{x}) \sigma\_3(\mathbf{y}) - B \sum\_{\mathbf{x} \in \Lambda} \sigma\_1(\mathbf{x}),\tag{10.270}$$

where *J* > 0 scales the spin-spin coupling, and *B* is an external magnetic field. Similar to the quantum Ising model, (10.270) has a Z2-symmetry (σ1,σ2,σ3) ↦ (σ1,−σ2,−σ3), which at each site *x* is implemented by *u*(*x*) = σ1(*x*). This model differs from its short-range counterpart (9.42), i.e., the quantum Ising model, or the Heisenberg model (9.44), in that every spin now interacts with every other spin. It falls into the class of *homogeneous mean-field theories*, which are defined by a single-site Hilbert space *H<sub>x</sub>* = *H* = C<sup>*n*</sup> and local Hamiltonians of the type

$$h\_{\Lambda} = |\Lambda| \tilde{h}(T\_0^{(\Lambda)}, T\_1^{(\Lambda)}, \dots, T\_{n^2 - 1}^{(\Lambda)}). \tag{10.271}$$

Here *T*<sub>0</sub> = 1<sub>*n*</sub>, and the matrices $(T\_i)\_{i=1}^{n^2-1}$ in *M<sub>n</sub>*(C) form a basis of the real vector space of traceless self-adjoint *n*×*n* matrices; the latter may be identified with *i* times the Lie algebra su(n) of *SU*(*n*), so that $(T\_0, T\_1, \ldots, T\_{n^2-1})$ is a basis of *i* times the Lie algebra u(n) of the unitary group *U*(*n*) on C<sup>*n*</sup>. In those terms, we define

$$T\_i^{(\Lambda)} = \frac{1}{|\Lambda|} \sum\_{x \in \Lambda} T\_i(x). \tag{10.272}$$

Finally, *h̃* is a polynomial (which is sensitive to operator ordering). For example, to cast (10.270) (with *J* = 1) in the form (10.271), take *n* = 2, *T<sub>i</sub>* = ½σ<sub>*i*</sub> (*i* = 1,2,3), and

$$
\tilde{h}^{\rm CW}(T\_1, T\_2, T\_3) = -2(T\_3^2 + BT\_1). \tag{10.273}
$$
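The prefactor −2 in (10.273) can be checked by a direct substitution. The lines below are a sketch of that computation, using only σ<sub>*i*</sub>(*x*) = 2*T<sub>i</sub>*(*x*), the definition (10.272), and the fact that the double sum in (10.270) is the square of a single sum:

$$
\begin{aligned}
h\_{\Lambda}^{\rm CW} &= -\frac{1}{2|\Lambda|} \Big( \sum\_{x \in \Lambda} 2T\_3(x) \Big)^2 - B \sum\_{x \in \Lambda} 2T\_1(x) \\
&= -\frac{1}{2|\Lambda|} \big( 2|\Lambda|\, T\_3^{(\Lambda)} \big)^2 - 2B|\Lambda|\, T\_1^{(\Lambda)} \\
&= |\Lambda| \cdot \Big( -2 \big( (T\_3^{(\Lambda)})^2 + B\, T\_1^{(\Lambda)} \big) \Big).
\end{aligned}
$$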

The assumptions of Theorem 9.15 do not hold now, and indeed the local dynamics (9.40) fails to converge to global dynamics on the quasi-local C\*-algebra *A* defined by (8.130). Fortunately, it does converge to a global dynamics on the C\*-algebra *C*(*S*(*B*)), where *B* = *M<sub>n</sub>*(C) is the single-site algebra. In order to describe the limiting dynamics of (homogeneous) mean-field models as Λ ↗ Z<sup>*d*</sup>, we equip the state space *S*(*B*) with the Poisson structure (8.52), which we now elucidate.

For unital C\*-algebras *B*, we may regard *S*(*B*) as a *w*<sup>∗</sup>-compact subspace of either the complex vector space *B*<sup>∗</sup> or the real vector space *B*<sup>∗</sup><sub>sa</sub>; in the latter case we regard states as linear maps ω : *B*<sub>sa</sub> → R that satisfy ω(1<sub>*B*</sub>) = 1 and ω(*a*<sup>2</sup>) ≥ 0 for each *a* ∈ *B*<sub>sa</sub>. If *B* = *M<sub>n</sub>*(C), which is all we need, we may furthermore identify *B*<sup>∗</sup><sub>sa</sub> with *i*u(n)<sup>∗</sup>, and since the value of each state ω ∈ *S*(*M<sub>n</sub>*(C)) is fixed on *T*<sub>0</sub> = 1<sub>*B*</sub> ∈ *i*u(n), it follows that *S*(*M<sub>n</sub>*(C)) is a compact convex subset of *i*su(n)<sup>∗</sup>. In that case, the Poisson bracket (8.52) on *S*(*M<sub>n</sub>*(C)) is none other than the restriction of (minus) the canonical Lie-Poisson bracket on su(n)<sup>∗</sup> ≅ *i*su(n)<sup>∗</sup> to *S*(*M<sub>n</sub>*(C)), cf. (3.98) - (3.99). For example, for *n* = 2 we have *S*(*M*<sub>2</sub>(C)) ≅ *B*<sup>3</sup> ⊂ R<sup>3</sup> by Proposition 2.9, i.e.,

$$\omega\_{(x,y,z)}(a) = \operatorname{Tr}\left(\rho(x,y,z)a\right) \quad ((x,y,z) \in B^3, a \in M\_2(\mathbb{C}));\tag{10.274}$$

$$\rho(x, y, z) = \frac{1}{2} \begin{pmatrix} 1+z & x - iy \\ x+iy & 1-z \end{pmatrix}. \tag{10.275}$$

We also have su(2)<sup>∗</sup> ≅ R<sup>3</sup> upon the choice of the basis (*T<sub>i</sub>* = ½σ<sub>*i*</sub>), *i* = 1,2,3, of *i*su(2), which means that θ<sub>(*x*,*y*,*z*)</sub> ∈ *i*su(2)<sup>∗</sup> maps (*T*<sub>1</sub>,*T*<sub>2</sub>,*T*<sub>3</sub>) to (*x*,*y*,*z*) (where this time (*x*,*y*,*z*) ∈ R<sup>3</sup>; cf. §5.8). If we now regard the matrices *T<sub>i</sub>* as functions *T̂<sub>i</sub>* on *B*<sup>3</sup> by *T̂<sub>i</sub>*(ω) = ω(*T<sub>i</sub>*), we find that the corresponding functions on *B*<sup>3</sup> are given by

$$
\hat{T}\_1(\mathbf{x}, \mathbf{y}, \mathbf{z}) = \frac{1}{2}\mathbf{x}, \ \hat{T}\_2(\mathbf{x}, \mathbf{y}, \mathbf{z}) = \frac{1}{2}\mathbf{y}, \ \hat{T}\_3(\mathbf{x}, \mathbf{y}, \mathbf{z}) = \frac{1}{2}\mathbf{z}.\tag{10.276}
$$

The corresponding Poisson brackets (8.52) are {*T̂*<sub>1</sub>,*T̂*<sub>2</sub>} = −*T̂*<sub>3</sub> etc., i.e., {*x*,*y*} = −2*z* etc.; this is −2 times the bracket defined in (3.43) or (3.97) - (3.98). This factor 2 could have been avoided by moving to the three-ball with radius *r* = 1/2 instead of *r* = 1, whose boundary is the coadjoint orbit O<sub>1/2</sub> naturally associated to spin-½.
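The coordinate brackets can be confirmed numerically at a sample point of the ball. The sketch below assumes that (8.52) takes the form {*â*,*b̂*}(ω) = *i*ω([*a*,*b*]), which is how it is used in the proof of Theorem 10.22; the function names are hypothetical. The coordinate functions are *x* = ω(σ1), *y* = ω(σ2), *z* = ω(σ3).

```python
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def rho(x, y, z):
    """Density matrix (10.275) for a point (x, y, z) of the unit ball."""
    return [[(1 + z) / 2, (x - 1j * y) / 2],
            [(x + 1j * y) / 2, (1 - z) / 2]]

def state(point, a):
    """omega_(x,y,z)(a) = Tr(rho(x,y,z) a), cf. (10.274)."""
    m = matmul(rho(*point), a)
    return m[0][0] + m[1][1]

def bracket(point, a, b):
    """i * omega([a, b]): assumed form of the bracket of the functions a-hat, b-hat."""
    ab, ba = matmul(a, b), matmul(b, a)
    commutator = [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]
    return 1j * state(point, commutator)

point = (0.2, -0.4, 0.5)
x, y, z = point
print(bracket(point, S1, S2), -2 * z)
print(bracket(point, S2, S3), -2 * x)
print(bracket(point, S3, S1), -2 * y)
```

Each line prints the computed bracket next to the value predicted by {*x*,*y*} = −2*z* and its cyclic permutations; the two agree up to rounding.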

We now return to our continuous bundle of C\*-algebras *A*<sup>(*c*)</sup> of Theorem 8.4, of course in the slightly adapted form appropriate to quantum spin systems, see §8.6. In particular, we recall that $A^{(c)}\_0 = C(S(B))$ and $A^{(c)}\_{1/N} = B(H\_{\Lambda\_N})$, cf. (8.157) - (8.158), and hence we see the limit *N* → ∞ as a specific way of taking the limit Λ ↗ Z<sup>*d*</sup> along the hypercubes Λ<sub>*N*</sub>. Symmetric and quasi-symmetric sequences $(a\_{1/N})\_{N\in\mathbb{N}}$ are defined as explained after (8.161). The following observation is fundamental.

Theorem 10.22. *Let B* = *M<sub>n</sub>*(C)*. If* $(a\_{1/N})\_{N\in\mathbb{N}}$ *and* $(b\_{1/N})\_{N\in\mathbb{N}}$ *are symmetric sequences with limits a*<sub>0</sub> *and b*<sub>0</sub> *as defined by* (8.46)*, respectively (so that* $(a\_{1/N})\_{N\in\dot{\mathbb{N}}}$ *and* $(b\_{1/N})\_{N\in\dot{\mathbb{N}}}$ *are continuous sections of the continuous bundle A*<sup>(*c*)</sup>*), then the sequence*

$$\left( \{a\_0, b\_0\}, i[a\_1, b\_1], \ldots, i|\Lambda\_N| [a\_{1/N}, b\_{1/N}], \ldots \right) \tag{10.277}$$

*defines a continuous section of A*<sup>(*c*)</sup>*. In particular, for each* ω ∈ *S*(*B*) *we have*

$$i\lim\_{N\to\infty}\omega^{|\Lambda\_N|}(|\Lambda\_N|[a\_{1/N},b\_{1/N}])=\{a\_0,b\_0\}(\omega).\tag{10.278}$$

*Proof.* The proof is a straightforward combinatorial exercise, and we just mention the simplest case where *d* = 1 and $a\_{1/N} = S\_{1,N}(a\_1)$ and $b\_{1/N} = S\_{1,N}(b\_1)$, where *a*<sub>1</sub> ∈ *B* and *b*<sub>1</sub> ∈ *B*, cf. (8.39). Then $a\_0 = \hat{a}\_1$, $b\_0 = \hat{b}\_1$, and similarly to (8.45) we find

$$\left[S\_{1,N}(a\_1), S\_{1,N}(b\_1)\right] = \frac{1}{N} S\_{1,N}([a\_1, b\_1]).\tag{10.279}$$

Using (8.52), we find that (10.277) is equal to $(i\widehat{[a\_1,b\_1]}, \ldots, iS\_{1,N}([a\_1,b\_1]), \ldots)$. Since $\omega^N(S\_{1,N}([a\_1,b\_1])) = \omega([a\_1,b\_1])$, the left-hand side of (10.278) is therefore equal to $i\omega([a\_1,b\_1])$, which by (8.52) equals the right-hand side. □

In other words, although the sequence of commutators $[a\_{1/N}, b\_{1/N}]$ converges to zero (which is why $A^{(c)}\_0$ has to be commutative!), the rescaled commutators $iN[a\_{1/N}, b\_{1/N}]$ converge to the macroscopic observable {*a*<sub>0</sub>,*b*<sub>0</sub>} ∈ *C*(*S*(*B*)). This reconfirms the analogy between the limit *N* → ∞ and the limit ħ → 0 of Chapter 7, see especially Definitions 7.1 and 8.2. With *B* = *M<sub>n</sub>*(C), Theorem 10.22 implies the central result about the macroscopic (and hence classical!) dynamics of mean-field theories:

Corollary 10.23. *Let* (*h*<sub>1/*N*</sub>)<sub>*N*∈ℕ̇</sub> *be a continuous section of A*<sup>(*c*)</sup> *defined by a symmetric sequence, and let* (*a*<sub>1/*N*</sub>)<sub>*N*∈ℕ̇</sub> *be an arbitrary continuous section of A*<sup>(*c*)</sup> *(i.e. a quasi-symmetric sequence). Then, writing h*<sub>1/*N*</sub> = *h*<sub>Λ<sub>*N*</sub></sub> *for clarity, the sequence*

$$\left(a\_0(t), e^{ih\_{\Lambda\_1}t}a\_1e^{-ih\_{\Lambda\_1}t}, \cdots, e^{ih\_{\Lambda\_N}t}a\_{1/N}e^{-ih\_{\Lambda\_N}t}, \cdots \right),\tag{10.280}$$

*where a*<sub>0</sub>(*t*) *is the solution of the equations of motion on S*(*M<sub>n</sub>*(ℂ)) *with classical Hamiltonian h*<sub>0</sub> *and Poisson bracket* (8.52)*, defines a continuous section of A*<sup>(*c*)</sup>*.*

In other words, the Heisenberg dynamics on *A*Λ*<sup>N</sup>* = *B*(*H*Λ*<sup>N</sup>* ) defined by the quantum Hamiltonians *h*Λ*<sup>N</sup>* converges to the classical dynamics on the Poisson manifold *S*(*Mn*(C)) that is generated by their classical limit, viz. the Hamiltonian *h*0.
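The commutator identity (10.279) behind Theorem 10.22 can be checked numerically on a short chain. The following is a minimal sketch for *B* = *M*<sub>2</sub>(ℂ), assuming that *S*<sub>1,*N*</sub>(*a*) is the site average (1/*N*) ∑<sub>*x*</sub> *a*(*x*) of the one-site operator *a* (this reading of (8.39) is an assumption, as are all function names below).

```python
# Minimal check of [S_{1,N}(a), S_{1,N}(b)] = (1/N) S_{1,N}([a, b]) for 2x2 matrices.
# Assumption: S_{1,N}(a) is the average over sites x of a acting at site x.

def kron(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[p * q for p in row_a for q in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def add(A, B, sign=1):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def commutator(A, B):
    return add(matmul(A, B), matmul(B, A), sign=-1)

IDENTITY = [[1, 0], [0, 1]]
SIGMA_1 = [[0, 1], [1, 0]]
SIGMA_3 = [[1, 0], [0, -1]]

def at_site(a, x, n_sites):
    """Operator a at site x, tensored with identities on the other sites."""
    out = [[1]]
    for y in range(n_sites):
        out = kron(out, a if y == x else IDENTITY)
    return out

def s1n(a, n_sites):
    """Site average of a over a chain of n_sites spins (hypothetical helper)."""
    dim = 2 ** n_sites
    total = [[0] * dim for _ in range(dim)]
    for x in range(n_sites):
        total = add(total, at_site(a, x, n_sites))
    return scale(1.0 / n_sites, total)

def identity_defect(a, b, n_sites):
    """Largest entrywise deviation between the two sides of (10.279)."""
    lhs = commutator(s1n(a, n_sites), s1n(b, n_sites))
    rhs = scale(1.0 / n_sites, s1n(commutator(a, b), n_sites))
    return max(abs(x - y) for rl, rr in zip(lhs, rhs) for x, y in zip(rl, rr))

if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, identity_defect(SIGMA_3, SIGMA_1, n))
```

The defect should vanish up to rounding for every chain length, which is the finite-*N* statement that the commutator of two averaged observables is of order 1/*N*.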

For example, since the operators *T*<sub>*i*</sub><sup>(Λ)</sup> form symmetric sequences, so do Hamiltonians of the type (10.271). The limit *h*<sub>0</sub> ∈ *C*(*S*(*M<sub>n</sub>*(ℂ))) of the family (*h*<sub>Λ</sub>) in (10.271) is simply obtained by replacing the operators *T*<sub>*i*</sub><sup>(Λ)</sup> in the function *h̃* by the functions *T̂*<sub>*i*</sub> on *S*(*M<sub>n</sub>*(ℂ)). Equivalently, one may replace the *T*<sub>*i*</sub><sup>(Λ)</sup> by the canonical coordinates (θ<sub>*i*</sub>) of *i*su(*n*)<sup>∗</sup> dual to the basis (*T*<sub>1</sub>,...,*T*<sub>*n*<sup>2</sup>−1</sub>) of *i*su(*n*), i.e., θ<sub>*i*</sub>(*T<sub>j</sub>*) = δ<sub>*ij*</sub>, and restrict the ensuing function on *i*su(*n*)<sup>∗</sup> to *S*(*M<sub>n</sub>*(ℂ)) ⊂ *i*su(*n*)<sup>∗</sup>.

Using (10.276), for the Curie–Weiss model (10.270) with *J* = 1 this gives

$$h\_0^{\rm CW}(x, y, z) = -\frac{1}{2}z^2 - Bx.\tag{10.281}$$

The ground states of this Hamiltonian are simply its minima, viz.

$$\mathbf{x}\_{\pm} = (B, 0, \pm\sqrt{1 - B^2}) \quad (0 \le B < 1); \tag{10.282}$$

$$\mathbf{x} = (1,0,0) \quad (B \ge 1),\tag{10.283}$$

all of which lie on the boundary *S*<sup>2</sup> of *B*<sup>3</sup>. Note that the points **x**<sub>±</sub> coalesce as *B* → 1, where they form a saddle point. Modulo our use of radius *r* = 1 instead of *r* = 1/2, this result coincides with (10.81) for the classical limit of the quantum Ising model.
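The minima (10.282) and (10.283) can be confirmed by a brute-force search. The sketch below scans the slice *y* = 0 of the unit ball in polar coordinates; this restriction loses nothing, since (10.281) does not depend on *y* and any minimizer on the sphere *x*<sup>2</sup> + *z*<sup>2</sup> = 1 has *y* = 0. The grid sizes and function names are illustrative choices, not part of the text.

```python
import math

def h0_cw(x, y, z, field):
    """Classical Curie-Weiss Hamiltonian (10.281) with J = 1."""
    return -0.5 * z * z - field * x

def grid_minimum(field, n_r=100, n_phi=3600):
    """Return (value, x, z) of the smallest h0_cw found on a polar grid of the disc y = 0."""
    best = None
    for i in range(n_r + 1):
        r = i / n_r
        for k in range(n_phi):
            phi = 2 * math.pi * k / n_phi
            x, z = r * math.cos(phi), r * math.sin(phi)
            value = h0_cw(x, 0.0, z, field)
            if best is None or value < best[0]:
                best = (value, x, z)
    return best

if __name__ == "__main__":
    for field in (0.0, 0.5, 0.9, 1.5):
        value, x, z = grid_minimum(field)
        print(f"B={field}: x={x:.3f}, |z|={abs(z):.3f}, h0={value:.4f}")
```

Only |*z*| is reported because the two minima **x**<sub>±</sub> are degenerate and the scan returns whichever it meets first.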

We now turn to symmetry and its possible breakdown. Suppose there is some subgroup of *U*(*n*), typically the image of a unitary representation *g* ↦ *u<sub>g</sub>* of a compact group *G* on ℂ<sup>*n*</sup>, under which *h̃*(*T*<sub>0</sub>,*T*<sub>1</sub>,...,*T*<sub>*n*<sup>2</sup>−1</sub>) in (10.271) satisfies

$$
\tilde{h}(T\_0, u\_g T\_1 u\_g^\*, \dots, u\_g T\_{n^2 - 1} u\_g^\*) = \tilde{h}(T\_0, T\_1, \dots, T\_{n^2 - 1}). \tag{10.284}
$$

For example, in the Curie–Weiss model one has *G* = ℤ<sub>2</sub>, whose nontrivial element is represented by σ<sub>1</sub>. For (10.271) itself this implies *u*<sup>(*N*)</sup>*h<sub>N</sub>*(*u*<sup>(*N*)</sup>)<sup>∗</sup> = *h<sub>N</sub>*, cf. (10.69).

Hence also in homogeneous mean-field models we obtain the structure (10.57), (10.58), and (10.59) familiar from the case of short-range forces. For the limit theory this implies that the classical Hamiltonian *h*<sub>0</sub> on *S*(*M<sub>n</sub>*(ℂ)) is invariant under the coadjoint action of *G* ⊂ *U*(*n*) on *i*su(*n*)<sup>∗</sup>, restricted to *S*(*M<sub>n</sub>*(ℂ)) ⊂ *i*su(*n*)<sup>∗</sup>: in the Curie–Weiss model this "classical shadow" of the ℤ<sub>2</sub> symmetry of the quantum theory is simply the map (*x*,*y*,*z*) ↦ (*x*,−*y*,−*z*) on *B*<sup>3</sup>.
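The classical shadow of the ℤ<sub>2</sub> symmetry can be made explicit for a single spin. The sketch below assumes the usual Bloch-ball identification of *S*(*M*<sub>2</sub>(ℂ)) with *B*<sup>3</sup>, i.e. ρ = ½(1 + *x*σ<sub>1</sub> + *y*σ<sub>2</sub> + *z*σ<sub>3</sub>), and conjugates ρ by σ<sub>1</sub>; the helper names are hypothetical.

```python
# Conjugation by sigma_1 acts on the Bloch vector as (x, y, z) -> (x, -y, -z).

SIGMA = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def density_matrix(x, y, z):
    """rho = (1 + x*sigma_1 + y*sigma_2 + z*sigma_3) / 2."""
    return [[0.5 * (1 + z), 0.5 * (x - 1j * y)],
            [0.5 * (x + 1j * y), 0.5 * (1 - z)]]

def bloch_vector(rho):
    """Recover (x, y, z) as the expectation values Tr(rho * sigma_i)."""
    components = []
    for i in (1, 2, 3):
        product = matmul(rho, SIGMA[i])
        components.append((product[0][0] + product[1][1]).real)
    return tuple(components)

def z2_action(rho):
    """Conjugate a state by sigma_1, the nontrivial element of Z_2."""
    return matmul(matmul(SIGMA[1], rho), SIGMA[1])

def h0_cw(x, y, z, field):
    """Classical Hamiltonian (10.281)."""
    return -0.5 * z * z - field * x

if __name__ == "__main__":
    point = (0.3, 0.4, -0.5)
    image = bloch_vector(z2_action(density_matrix(*point)))
    print(image, h0_cw(*point, 0.7), h0_cw(*image, 0.7))
```

The two printed energies agree, which is the invariance of *h*<sub>0</sub><sup>CW</sup> under the shadow map.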

In the regime 0 < *B* < 1, the degenerate ground states of this model break this symmetry. In contrast, it can be shown from the Perron–Frobenius Theorem (which applies since both σ<sub>3</sub> and σ<sub>1</sub> are real matrices) that for *B* > 0 each quantum-mechanical Hamiltonian (10.270) has a unique ground state ψ<sub>*N*</sub><sup>(0)</sup>. Being unique, this vector must share the invariance of *h<sub>N</sub>* under the permutation group 𝔖<sub>*N*</sub>, so that

$$\psi\_N^{(0)} = \sum\_{n\_+=0}^N c(n\_+/N)|n\_+, n\_-\rangle,\tag{10.285}$$

where |*n*<sub>+</sub>,*n*<sub>−</sub>⟩ is the totally symmetrized unit vector in ⊗<sup>*N*</sup>ℂ<sup>2</sup> with *n*<sub>+</sub> spins up and *n*<sub>−</sub> = *N* − *n*<sub>+</sub> spins down, and *c* : {0,1/*N*,2/*N*,...,(*N* − 1)/*N*,1} → [0,1] is some function such that ∑<sub>*n*<sub>+</sub></sub> *c*(*n*<sub>+</sub>/*N*)<sup>2</sup> = 1 (we may assume *c* ≥ 0 by the Perron–Frobenius Theorem). The asymptotic behaviour of *c* as *N* → ∞ has been studied, and as expected, *c* converges pointwise to *c*(0) = *c*(1) = 1/√2 and to zero elsewhere (at *B* = 0 one of course has either *c*(0) = 1 or *c*(1) = 1 for all *N*).
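For small *N* the coefficients *c*(*n*<sub>+</sub>/*N*) in (10.285) can be computed directly. The sketch below is only illustrative and rests on two assumptions not spelled out in this section: that (10.270) with *J* = 1 reads *h<sub>N</sub>* = −(1/2*N*) ∑<sub>*x*,*y*</sub> σ<sub>3</sub>(*x*)σ<sub>3</sub>(*y*) − *B* ∑<sub>*x*</sub> σ<sub>1</sub>(*x*), and that on the symmetric subspace this becomes the tridiagonal matrix −(2/*N*)*J<sub>z</sub>*<sup>2</sup> − 2*B J<sub>x</sub>* of a spin *N*/2 in the basis |*n*<sub>+</sub>,*n*<sub>−</sub>⟩. The ground state is found by power iteration on a shifted operator, a technique chosen here for its simplicity.

```python
import math

def apply_cw(c, n_spins, field):
    """Apply the (assumed) Curie-Weiss Hamiltonian on the symmetric subspace to c."""
    dim = n_spins + 1
    # Diagonal part: -(2/N) * J_z^2 with J_z = n_plus - N/2.
    out = [-(2.0 / n_spins) * (n - n_spins / 2) ** 2 * c[n] for n in range(dim)]
    # Off-diagonal part: -2B * J_x, coupling n_plus and n_plus + 1.
    for n in range(n_spins):
        hop = -field * math.sqrt((n_spins - n) * (n + 1))
        out[n] += hop * c[n + 1]
        out[n + 1] += hop * c[n]
    return out

def ground_state(n_spins, field, iterations=5000):
    """Power iteration on (shift - h); returns (energy, coefficients)."""
    shift = n_spins * (0.5 + field) + 1.0  # exceeds the operator norm of h
    c = [1.0] * (n_spins + 1)              # positive, permutation-symmetric start
    for _ in range(iterations):
        hc = apply_cw(c, n_spins, field)
        c = [shift * v - w for v, w in zip(c, hc)]
        norm = math.sqrt(sum(v * v for v in c))
        c = [v / norm for v in c]
    hc = apply_cw(c, n_spins, field)
    energy = sum(v * w for v, w in zip(c, hc))
    return energy, c

if __name__ == "__main__":
    energy, c = ground_state(12, 0.5)
    print("energy per spin:", energy / 12)
    print("coefficients:", [round(v, 4) for v in c])
```

The resulting coefficients are strictly positive and invariant under *n*<sub>+</sub> ↔ *n*<sub>−</sub>, in line with the Perron–Frobenius argument and the ℤ<sub>2</sub> invariance of the unique ground state.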

Thus we encounter a familiar headache: the "higher-level" theory *C*(*S*(*M<sub>n</sub>*(ℂ))) at *N* = ∞ breaks the ℤ<sub>2</sub> symmetry, whereas the "lower-level" quantum theories *B*(*H*<sub>Λ<sub>*N*</sub></sub>) (*N* < ∞) do not, although the former should be a limiting case of the latter. Indeed, the situation for the Curie–Weiss model in the regime 0 < *B* < 1 is exactly analogous to the double-well potential as well as to the quantum Ising model in the same regime: if the two degenerate ground states **x**<sub>±</sub> ∈ *B*<sup>3</sup> of *h*<sub>0</sub><sup>CW</sup> are reinterpreted as Dirac measures δ<sub>±</sub> on *B*<sup>3</sup>, which in turn are seen as (pure) states ω<sub>±</sub> on the classical algebra of observables *C*(*S*(*M*<sub>2</sub>(ℂ))), then (10.74) holds, *mutatis mutandis*.

The resolution of this problem through the restoration of Butterfield's Principle should also be the same as for the previous two cases: there is a first excited state ψ<sub>*N*</sub><sup>(1)</sup> such that as *N* → ∞, the energy difference with the ground state approaches zero and one has approximate symmetry breaking as in (10.75). Alas, for the Curie–Weiss model so far only numerical evidence is available supporting this scenario.

Equilibrium states of homogeneous mean-field models at any inverse temperature 0 < β < ∞ exist, despite the fact that in such models time-evolution α<sub>*t*</sub> on the infinite system *A* (and hence the KMS condition characterizing equilibrium states) is ill-defined (unless one passes to certain representations of *A*, which would be question-begging). Instead, one invokes the quasi-local C\*-algebra *A*, cf. (8.130), and *in lieu* of KMS states looks for limit points ω̂<sup>β</sup> ∈ *S*(*A*) of the local Gibbs states ω<sup>β</sup><sub>Λ<sub>*N*</sub></sub> defined by (9.96) as *N* → ∞; see (10.44) and surrounding discussion. Proposition 10.8 does not apply now, but Theorem 8.9 does: since each local Hamiltonian *h*<sub>Λ<sub>*N*</sub></sub> is permutation-invariant (because each *T*<sub>*i*</sub><sup>(Λ<sub>*N*</sub>)</sup> is), so is each local Gibbs state ω<sup>β</sup><sub>Λ<sub>*N*</sub></sub>, and accordingly, each *w*<sup>∗</sup>-limit point of this sequence must share this property. As in (8.174), from the quantum De Finetti Theorem 8.9 we therefore have:

$$
\hat{\omega}^{\beta} = \int\_{S(M\_n(\mathbb{C}))} d\mu\_{\beta}(\theta) \left( \omega\_{\theta}^{\beta} \right)^{\infty}, \tag{10.286}
$$

for some probability measure μ<sub>β</sub> on the single-spin state space *S*(*M<sub>n</sub>*(ℂ)). By Proposition 8.28, this measure may also be regarded as a limit of the local Gibbs states, but now regarded as a state on the limit algebra *A*<sub>0</sub><sup>(*c*)</sup> = *C*(*S*(*M<sub>n</sub>*(ℂ))) rather than as a state on *A*<sub>0</sub><sup>(*q*)</sup> = *A*. By the same token, each state ω<sub>θ</sub><sup>β</sup> in the decomposition (10.286) is a pure state on *A*<sub>0</sub><sup>(*c*)</sup> (though seen as a state on *M<sub>n</sub>*(ℂ) it will be mixed!). The states ω<sub>θ</sub><sup>β</sup> are computed as follows. Given a classical Hamiltonian *h*<sub>0</sub> computed from (10.271) as explained after Corollary 10.23, for each point θ = (θ<sub>0</sub>,...,θ<sub>*n*<sup>2</sup>−1</sub>) ∈ *i*u(*n*)<sup>∗</sup> we define a new self-adjoint operator *ĥ*<sub>θ</sub> ∈ *M<sub>n</sub>*(ℂ) by

$$\hat{h}\_{\theta} = h\_0(\theta) \cdot 1\_n + \sum\_{i=0}^{n^2 - 1} \frac{\partial h\_0}{\partial \theta\_i}(\theta) \cdot T\_i. \tag{10.287}$$

For example, in the Curie–Weiss model, from (10.273) we have

$$h\_0^{\rm CW}(\theta) = -2(\theta\_3^2 + B\theta\_1);\tag{10.288}$$

$$
\hat{h}\_{\theta}^{\rm CW} = h\_0^{\rm CW}(\theta) - 2\theta\_3 \sigma\_3 - B\sigma\_1. \tag{10.289}
$$
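The step from (10.287) to (10.289) is a one-line computation. The following worked equation assumes the normalization *T<sub>i</sub>* = ½σ<sub>*i*</sub> (*i* = 1,2,3), which is what makes (10.281) and the θ-coordinates agree via (*x*,*y*,*z*) = 2(θ<sub>1</sub>,θ<sub>2</sub>,θ<sub>3</sub>), and uses that the Curie–Weiss function *h*<sub>0</sub><sup>CW</sup>(θ) = −2(θ<sub>3</sub><sup>2</sup> + *B*θ<sub>1</sub>) does not depend on θ<sub>0</sub> or θ<sub>2</sub>:

$$
\frac{\partial h\_0^{\rm CW}}{\partial \theta\_1} = -2B, \qquad \frac{\partial h\_0^{\rm CW}}{\partial \theta\_3} = -4\theta\_3, \qquad \hat{h}\_{\theta}^{\rm CW} = h\_0^{\rm CW}(\theta)\cdot 1\_2 - 2B \cdot \tfrac{1}{2}\sigma\_1 - 4\theta\_3 \cdot \tfrac{1}{2}\sigma\_3 = h\_0^{\rm CW}(\theta) - 2\theta\_3\sigma\_3 - B\sigma\_1.
$$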

Eq. (10.287) has the following origin. Let ω be any state on *A* for which the strong limit *T*<sub>*i*</sub><sup>(ω)</sup> of each operator π<sub>ω</sub>(*T*<sub>*i*</sub><sup>(Λ<sub>*N*</sub>)</sup>) on *H*<sub>ω</sub> exists as *N* → ∞ (for example, as in the proof of Theorem 8.16 one may show that this is the case when ω is a permutation-invariant state of *A*). It easily follows that *T*<sub>*i*</sub><sup>(ω)</sup> lies in the algebra at infinity for π<sub>ω</sub>, and hence in the center of π<sub>ω</sub>(*A*)″, cf. §8.5. If, in addition, ω is primary, then

$$T\_i^{(\omega)} = \theta\_i \cdot 1\_{H\_{\omega}};\tag{10.290}$$

$$\theta\_i = \lim\_{N \to \infty} \omega(T\_i^{(\Lambda\_N)}). \tag{10.291}$$

Under these assumptions, we compute the commutator

$$
\left[\pi\_{\omega}(h\_{\Lambda\_{N}}), \pi\_{\omega}(a)\right] = \sum\_{i} \frac{\partial h\_{0}}{\partial \theta\_{i}} \left(T\_{0}^{(\Lambda)}, \dots, T\_{n^{2}-1}^{(\Lambda)}\right) \cdot \sum\_{x \in \Lambda\_N} \left[\pi\_{\omega}(T\_{i}(x)), \pi\_{\omega}(a)\right] + O\left(\frac{1}{|\Lambda\_{N}|}\right),
$$

where *a* ∈ ∪<sub>Λ</sub> *A*<sub>Λ</sub>, and *O*(1/|Λ<sub>*N*</sub>|) denotes a finite sum of (multiple) commutators between some power of *T*<sub>*i*</sub><sup>(Λ)</sup> and operators that are (norm-) bounded in *N*. For example, for the Curie–Weiss model the *O*(1/|Λ<sub>*N*</sub>|) term is a multiple of

$$\sum\_{x\in\Lambda\_{N}}[[\pi\_{\omega}(\sigma\_{3}(x)),\pi\_{\omega}(a)],\sigma\_{3}^{(\Lambda\_{N})}].\tag{10.292}$$

Since *a* is local, all commutators ∑<sub>*x*∈Λ<sub>*N*</sub></sub> [π<sub>ω</sub>(*T<sub>i</sub>*(*x*)),π<sub>ω</sub>(*a*)] are in π<sub>ω</sub>(*A*), so that further commutators à la (10.292) vanish as *N* → ∞. Also, in this limit the terms *T*<sub>*i*</sub><sup>(Λ)</sup> in the argument of ∑<sub>*i*</sub> ∂*h*<sub>0</sub>/∂θ<sub>*i*</sub> assume their *c*-number values θ<sub>*i*</sub>, so that

$$\lim\_{N \to \infty} [\pi\_{\omega}(h\_{\Lambda\_N}), \pi\_{\omega}(a)] = [h\_{\omega}, \pi\_{\omega}(a)],\tag{10.293}$$

where formally (i.e. on a suitable domain) we have an ω-dependent Hamiltonian

$$h\_{\omega} = \sum\_{x \in \mathbb{Z}^d} \pi\_{\omega}(\hat{h}\_{\theta}(x)),\tag{10.294}$$

where the θ<sub>*i*</sub> depend on ω via (10.291). Also, for each *a* ∈ *A* one has strong limits

$$\lim\_{N \to \infty} \pi\_{\omega} \left( e^{ih\_{\Lambda\_N}t} a e^{-ih\_{\Lambda\_N}t} \right) = e^{ih\_{\omega}t} \pi\_{\omega}(a) e^{-ih\_{\omega}t}.\tag{10.295}$$

Hence in the limit *N* = ∞ (provided it makes sense, which it does under the stated assumptions), the original mean-field Hamiltonian (10.271) with its homogeneous long-range forces converges to a sum of single-body Hamiltonians, in which the original forces between the spins have been incorporated into the parameters θ*i*.

Returning to (10.286), for any β = *T*<sup>−1</sup>, we now determine ω<sub>θ</sub><sup>β</sup> from the *Ansatz*

$$\omega\_{\theta}^{\beta}(a) = \frac{\operatorname{Tr}\left(e^{-\beta\hat{h}\_{\theta}}a\right)}{\operatorname{Tr}\left(e^{-\beta\hat{h}\_{\theta}}\right)},\tag{10.296}$$

where θ is found by solving the *self-consistency equation*

$$
\omega\_{\theta}^{\beta} = \theta.\tag{10.297}
$$

As explained after Corollary 10.23, here ω<sub>θ</sub><sup>β</sup> : *M<sub>n</sub>*(ℂ)<sub>sa</sub> → ℝ is defined by its values on *i*su(*n*) and hence should be seen as a map *i*su(*n*) → ℝ, like θ ∈ *i*su(*n*)<sup>∗</sup>, so that (10.297) consists of *n*<sup>2</sup> − 1 equations ω<sub>θ</sub><sup>β</sup>(*T<sub>i</sub>*) = θ<sub>*i*</sub> (*i* = 1,...,*n*<sup>2</sup> − 1). Alternatively, one may extend θ from *i*su(*n*) to *i*u(*n*) by prescribing θ(1<sub>*n*</sub>) = 1, and subsequently extend it further to *M<sub>n</sub>*(ℂ) by complex linearity. Clearly, the constant *h*<sub>0</sub>(θ) in (10.287) drops out of (5.152) and may be ignored in solving (10.297).

For example, if we take (10.289) with *B* = 0, then (10.297) forces θ<sub>1</sub> = θ<sub>2</sub> = 0, whereas the magnetization 2θ<sub>3</sub> ≡ *m* = ω<sub>θ</sub><sup>β</sup>(σ<sub>3</sub>) satisfies the famous *gap equation*

$$\tanh(\beta m) = m.\tag{10.298}$$

For any β this has a solution *m* = 0, i.e., θ = 0 in *B*<sup>3</sup>, which corresponds to the tracial state ω(*a*) = ½Tr(*a*) normally associated with infinite temperature (i.e., β = 0). This state is evidently ℤ<sub>2</sub>-invariant. For *T* ≥ *T<sub>c</sub>* = 1/4 (i.e. β ≤ 4) this is the only solution. For *T* < *T<sub>c</sub>* (or β > 4), two additional solutions ±*m*<sub>β</sub> (with *m*<sub>β</sub> > 0) appear, which break the ℤ<sub>2</sub> symmetry. For *B* > 0 computations become tedious, but for β → ∞, where ω<sub>θ</sub><sup>β</sup> converges to the ground state of *ĥ*<sub>θ</sub>, one recovers our earlier conclusions.
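Solutions of the gap equation (10.298) are easy to explore numerically. The sketch below iterates *m* ↦ tanh(β*m*) starting from *m* = 1; since this map is increasing and tanh(β) < 1, the iterates decrease monotonically to the largest non-negative solution. The two values of β are sample inputs chosen for illustration, and the function name is hypothetical.

```python
import math

def magnetization(beta, iterations=10_000):
    """Largest non-negative solution of tanh(beta * m) = m, by monotone iteration from m = 1."""
    m = 1.0
    for _ in range(iterations):
        m = math.tanh(beta * m)
    return m

if __name__ == "__main__":
    for beta in (0.5, 8.0):
        m = magnetization(beta)
        print(f"beta={beta}: m={m:.10f}, residual={math.tanh(beta * m) - m:.2e}")
```

A vanishing result means that *m* = 0 is the only solution at that β; a strictly positive result *m*<sub>β</sub> comes with its mirror image −*m*<sub>β</sub>, because tanh is odd.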

#### Proposition 10.24. *The self-consistency equation* (10.297) *has at least one solution.*

*Proof.* This follows from Brouwer's Fixed Point Theorem (stating that any continuous map *f* from a compact convex set *K* ⊂ ℝ<sup>*k*</sup> to itself has a fixed point), applied to *K* = *S*(*M<sub>n</sub>*(ℂ)) and *f*(θ) = ω<sub>θ</sub><sup>β</sup>, where θ ∈ *S*(*M<sub>n</sub>*(ℂ)), as just explained. □
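For the Curie–Weiss model with *B* > 0 a solution of (10.297) can be looked for by simply iterating the map θ ↦ ω<sub>θ</sub><sup>β</sup>. Brouwer's theorem only guarantees that a fixed point exists, not that this iteration converges, so the following is a sketch under assumptions: it takes *T<sub>i</sub>* = ½σ<sub>*i*</sub>, drops the constant in (10.289), and uses the standard closed form for a 2×2 Gibbs state, namely that for the Hamiltonian −(*a*σ<sub>3</sub> + *b*σ<sub>1</sub>) the expectation values of (σ<sub>1</sub>,σ<sub>2</sub>,σ<sub>3</sub>) are tanh(β*r*)(*b*,0,*a*)/*r* with *r* = (*a*<sup>2</sup> + *b*<sup>2</sup>)<sup>1/2</sup>.

```python
import math

def gibbs_map(theta, beta, field):
    """theta -> (omega(T_1), omega(T_2), omega(T_3)) for h = -2*theta_3*sigma_3 - B*sigma_1."""
    a, b = 2.0 * theta[2], field
    r = math.hypot(a, b)
    if r == 0.0:
        return (0.0, 0.0, 0.0)  # tracial state
    factor = math.tanh(beta * r) / r
    # T_i = sigma_i / 2 (assumed normalization), hence the factors 0.5.
    return (0.5 * factor * b, 0.0, 0.5 * factor * a)

def solve_self_consistency(beta, field, theta0=(0.0, 0.0, 0.5), iterations=5000):
    """Naive fixed-point iteration; convergence is not guaranteed in general."""
    theta = theta0
    for _ in range(iterations):
        theta = gibbs_map(theta, beta, field)
    return theta

if __name__ == "__main__":
    beta, field = 50.0, 0.5
    broken = solve_self_consistency(beta, field)
    symmetric = solve_self_consistency(beta, field, theta0=(0.0, 0.0, 0.0))
    print("asymmetric start:", tuple(2 * t for t in broken))
    print("symmetric start: ", tuple(2 * t for t in symmetric))
```

At large β the asymmetric start should approach 2θ = (*B*, 0, (1 − *B*<sup>2</sup>)<sup>1/2</sup>), i.e. the point **x**<sub>+</sub> of (10.282), whereas a ℤ<sub>2</sub>-invariant start stays on the invariant axis θ<sub>3</sub> = 0 and yields a different, invariant solution.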

The key result on equilibrium states of homogeneous mean-field theories, then, is:

Theorem 10.25. *Let h*<sub>Λ</sub> *in* (10.271) *define a homogeneous mean-field theory with compact symmetry group G. The sequence* (ω<sup>β</sup><sub>Λ<sub>*N*</sub></sub>) *of local Gibbs states defined by* (9.96) *and* (10.271) *has a unique G-invariant limit point* ω̂<sup>β</sup>*, whose decomposition into primary states is given by* (10.286)*. The G-invariant probability measure* μ<sub>β</sub> *is concentrated on some G-orbit in S*(*M<sub>n</sub>*(ℂ))*, and the states* ω<sub>θ</sub><sup>β</sup> *on M<sub>n</sub>*(ℂ) *are given by* (10.296)*, with Hamiltonians ĥ*<sub>θ</sub> *defined by* (10.287)*, where* θ *satisfies* (10.297)*.*

*Proof.* We just sketch the proof, which is based on the Quantum De Finetti Theorem 8.9. Each operator *T*<sub>*i*</sub><sup>(Λ<sub>*N*</sub>)</sup> is permutation-invariant, which property is transferred first to each local Hamiltonian *h*<sub>Λ<sub>*N*</sub></sub>, thence to each local Gibbs state ω<sup>β</sup><sub>Λ<sub>*N*</sub></sub> defined by *h*<sub>Λ<sub>*N*</sub></sub>, and finally to each limit point of this sequence. As already noted, Theorem 8.9 then gives the decomposition (10.286), which by Theorem 8.29 (whose assumption holds in mean-field models) also gives the primary decomposition of ω̂<sup>β</sup> (i.e., each state (ω<sub>θ</sub><sup>β</sup>)<sup>∞</sup> is primary on the quasi-local algebra *A*). By our earlier argument centered on (10.294) - (10.295), time-evolution is implemented in the GNS-representation induced by such a state. An important step in the proof (which we omit because it requires various reformulations of the KMS condition we have not discussed) is that (ω<sub>θ</sub><sup>β</sup>)<sup>∞</sup> satisfies the KMS condition with respect to the dynamics (10.295). This, in turn, implies (10.296), which, by definition of θ through (10.290) - (10.291), gives the self-consistency condition (10.297). The proof is completed by a tricky argument (which again uses alternatives to the KMS condition) to the effect that if some ω<sub>θ</sub><sup>β</sup> breaks the *G*-symmetry, the probability measure μ<sub>β</sub> on the *G*-orbit in *S*(*M<sub>n</sub>*(ℂ)) through ω<sub>θ</sub><sup>β</sup> induced by the normalized Haar measure on *G* defines the only possible limit point of the local Gibbs states, and hence must be unique. □

Thus SSB can be detected by solving (10.297) and checking if the ensuing state(s) ω<sub>θ</sub><sup>β</sup> on *M<sub>n</sub>*(ℂ) is (are) *G*-invariant. As we have seen, in the Curie–Weiss model this is the case for β ≤ 4, whereas for β > 4 the measure μ<sub>β</sub> in (10.286) is given by

$$
\mu\_{\beta} = \frac{1}{2} (\delta\_{(0,0,m\_{\beta}/2)} + \delta\_{(0,0,-m\_{\beta}/2)}),
\tag{10.299}
$$

where δ<sub>θ</sub>(*f*) = *f*(θ). In such cases, since each local Gibbs state is invariant, one faces the (by now) familiar threat to Earman's Principle. In response, we expect Butterfield's Principle to be restored through the introduction of asymmetric flea-type perturbations to *h*<sub>Λ</sub> that are localized in spin configuration space, although at nonzero temperature all excited states (rather than just the first) will start to play a role, and the precise details of the "flea" scenario remain to be settled.

#### 10.9 The Goldstone Theorem

So far, we have only discussed the simplest of all symmetry groups, namely *G* = Z2, which is both finite and abelian. Although it will not change our picture of SSB, for the sake of completeness (and interest to foundations) we also present a brief introduction to continuous symmetries, culminating in the Goldstone Theorem and the Higgs mechanism (which at first sight contradict each other and hence require a very careful treatment). The former results when the broken symmetry group *G* is a Lie group, whereas the latter arises when it is an infinite-dimensional gauge group.

Let us start with the simple case *G* = *SO*(2), acting on ℝ<sup>2</sup> by rotation. This induces the obvious action on the classical phase space *T*<sup>∗</sup>ℝ<sup>2</sup>, i.e.,

$$R(p,q) = (Rp, Rq),\tag{10.300}$$

cf. (3.94), as well as on the quantum Hilbert space *H* = *L*<sup>2</sup>(ℝ<sup>2</sup>), that is,

$$
u\_R \psi(x) = \psi(R^{-1}x).\tag{10.301}
$$

Let us see what changes with respect to the action of ℤ<sub>2</sub> on ℝ considered in §10.1. We now regard the double-well potential *V* in (10.11) as an *SO*(2)-invariant function on ℝ<sup>2</sup> through the reinterpretation of *x*<sup>2</sup> as *x*<sub>1</sub><sup>2</sup> + *x*<sub>2</sub><sup>2</sup>. This is the *Mexican hat potential*. Thus the classical Hamiltonian *h*(*p*,*q*) = *p*<sup>2</sup>/2*m* + *V*(*q*), similarly with *p*<sup>2</sup> = *p*<sub>1</sub><sup>2</sup> + *p*<sub>2</sub><sup>2</sup>, is *SO*(2)-invariant, and the set of classical ground states

$$\mathcal{E}\_0 = \{(p, q) \in T^\*\mathbb{R}^2 \mid p = 0, q^2 = a^2\} \tag{10.302}$$

is the *SO*(2)-orbit through e.g. the point (*p*<sub>1</sub> = *p*<sub>2</sub> = 0, *q*<sub>1</sub> = *a*, *q*<sub>2</sub> = 0). Unlike the one-dimensional case, the set of ground states is now connected and forms a circle in phase space, on which the symmetry group *SO*(2) acts. The intuition behind the Goldstone Theorem is that a particle can freely move in this circle at no cost of energy. If we look at mass as inertia, such motion is "massless", as there is no obstruction. However, this intuition is only realized in quantum field theory. In quantum mechanics, the ground state of the Hamiltonian (10.6) (now acting on *L*<sup>2</sup>(ℝ<sup>2</sup>)) remains unique, as in the one-dimensional case. In polar coordinates (*r*,φ) we have

$$h\_{\hbar} = -\frac{\hbar^2}{2m} \left( \frac{\partial^2}{\partial r^2} + \frac{1}{r} \frac{\partial}{\partial r} + \frac{1}{r^2} \frac{\partial^2}{\partial \phi^2} \right) + V(r), \tag{10.303}$$

with *V*(*r*) = ¼λ(*r*<sup>2</sup> − *a*<sup>2</sup>)<sup>2</sup>. With

$$L^2(\mathbb{R}^2) \cong L^2(\mathbb{R}^+) \otimes \ell^2(\mathbb{Z}) \tag{10.304}$$

under Fourier transformation in the angle variable, this becomes

$$h\_{\hbar}\Psi(r,n) = \left(-\frac{\hbar^2}{2m}\left(\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} - \frac{n^2}{r^2}\right) + V(r)\right)\Psi(r,n). \tag{10.305}$$

Since *ħ*<sup>2</sup>*n*<sup>2</sup>/2*mr*<sup>2</sup> is positive, the ground state ψ<sub>*ħ*</sub><sup>(0)</sup> has ψ<sub>*ħ*</sub><sup>(0)</sup>(*r*,*n*) = 0 for all *n* ≠ 0, and hence it is *SO*(2)-invariant, since the *SO*(2)-action on *L*<sup>2</sup>(ℝ<sup>2</sup>) becomes

$$
u\_{\theta} \psi(r, n) = \exp(in\theta) \psi(r, n), \tag{10.306}
$$

after a Fourier-transform. Indeed, from a group-theoretical point of view, the unitary isomorphism (10.304) is nothing but the decomposition

$$L^2(\mathbb{R}^2) \cong \bigoplus\_{n \in \mathbb{Z}} H\_n,\tag{10.307}$$

where *H<sub>n</sub>* = *L*<sup>2</sup>(ℝ<sup>+</sup>) for all *n*, but with φ<sub>*n*</sub> ∈ *H<sub>n</sub>* transforming under *SO*(2) as

$$
u\_{\theta} \phi\_n(r) = \exp(in\theta) \phi\_n(r) \ (\theta \in [0, 2\pi]). \tag{10.308}
$$

The *SO*(2)-invariant subspace of *L*<sup>2</sup>(ℝ<sup>2</sup>), then, is precisely the space *H*<sub>0</sub> in which ψ<sub>*ħ*</sub><sup>(0)</sup> lies. This is analogous to the situation occurring in one dimension higher (i.e. ℝ<sup>3</sup>) with e.g. the hydrogen atom: in that case, the symmetry group is *SO*(3), and *L*<sup>2</sup>(ℝ<sup>3</sup>) decomposes accordingly as

$$L^2(\mathbb{R}^3) \cong \bigoplus\_{j \in \mathbb{N}} H\_j;\tag{10.309}$$

$$H\_j = L^2(\mathbb{R}^+) \otimes \mathbb{C}^{2j+1}.\tag{10.310}$$

The ground state for a spherically symmetric potential, then, lies in *H*<sub>0</sub> and is *SO*(3)-invariant. For our purposes the relevant comparison is with the one-dimensional case: the decomposition of *L*<sup>2</sup>(ℝ) under the natural ℤ<sub>2</sub>-action *u*<sub>−1</sub>ψ(*x*) = ψ(−*x*) is

$$L^2(\mathbb{R}) = H\_0 \oplus H\_1; \tag{10.311}$$

$$H\_i = \{ \psi \in L^2(\mathbb{R}) \mid \psi(x) = (-1)^i \psi(-x) \}, \ i = 0, 1. \tag{10.312}$$

This time, *H*<sub>0</sub> is the ℤ<sub>2</sub>-invariant subspace containing the ground state ψ<sub>*ħ*</sub><sup>(0)</sup>. Being ℤ<sub>2</sub>-invariant, ψ<sub>*ħ*</sub><sup>(0)</sup> has peaks above both classical minima ±*a*; in fact, ψ<sub>*ħ*</sub><sup>(0)</sup> is real-valued and strictly positive. The ground state of the corresponding two-dimensional system, seen as an element of *L*<sup>2</sup>(ℝ<sup>2</sup>), is just this wave-function ψ<sub>*ħ*</sub><sup>(0)</sup> extended from ℝ to ℝ<sup>2</sup> by rotational invariance. Hence the ground state remains real-valued and strictly positive, with peaks about the circle of classical minima in ℝ<sup>2</sup>.

Let us recall the situation for *d* = 1 (cf. §10.1). The first excited state ψ<sub>*ħ*</sub><sup>(1)</sup> lies in *H*<sub>1</sub>; it is real-valued, like ψ<sub>*ħ*</sub><sup>(0)</sup>, but since it has to satisfy ψ<sub>*ħ*</sub><sup>(1)</sup>(−*x*) = −ψ<sub>*ħ*</sub><sup>(1)</sup>(*x*), it cannot be positive. Indeed, with a suitable choice of phase, ψ<sub>*ħ*</sub><sup>(1)</sup> has one positive peak above *a* and the same peak but now negative below −*a*. Then the wave-function

$$
\psi\_{\hbar}^{\pm} = (\psi\_{\hbar}^{(0)} \pm \psi\_{\hbar}^{(1)})/\sqrt{2}, \tag{10.313}
$$

is peaked above ±*a* alone (i.e., the negative peak of ±ψ<sub>*ħ*</sub><sup>(1)</sup> below ∓*a* exactly cancels the corresponding peak of ψ<sub>*ħ*</sub><sup>(0)</sup>). The classical limit of ψ<sub>*ħ*</sub><sup>(0)</sup> comes out as the mixed state ½(ω<sub>0</sub><sup>+</sup> + ω<sub>0</sub><sup>−</sup>), where ω<sub>0</sub><sup>±</sup> = (*p* = 0, *q* = ±*a*), but each state ψ<sub>*ħ*</sub><sup>±</sup> has the pure state ω<sub>0</sub><sup>±</sup> as its classical limit. The latter are ground states, and hence in particular they are time-independent, because the energy difference *E*<sup>(1)</sup> − *E*<sup>(0)</sup> between ψ<sub>*ħ*</sub><sup>(1)</sup> and ψ<sub>*ħ*</sub><sup>(0)</sup> vanishes (even exponentially fast) as *ħ* → 0.
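The cancellation behind (10.313) can be visualized with a toy model. In the sketch below the true eigenfunctions are replaced by hypothetical stand-ins, namely the normalized even and odd combinations of two narrow Gaussians centred at ±*a*; the widths, grid and names are illustrative assumptions and are not derived from the Hamiltonian.

```python
import math

def toy_doublet(a=1.0, width=0.2, half_length=3.0, n_points=2001):
    """Even/odd Gaussian stand-ins for psi^(0), psi^(1) and their combinations psi^(+/-)."""
    xs = [-half_length + 2 * half_length * k / (n_points - 1) for k in range(n_points)]
    dx = xs[1] - xs[0]

    def bump(x, centre):
        return math.exp(-((x - centre) ** 2) / (2 * width ** 2))

    def normalise(values):
        norm = math.sqrt(sum(v * v for v in values) * dx)
        return [v / norm for v in values]

    psi0 = normalise([bump(x, a) + bump(x, -a) for x in xs])  # even, positive
    psi1 = normalise([bump(x, a) - bump(x, -a) for x in xs])  # odd
    psi_plus = [(u + w) / math.sqrt(2) for u, w in zip(psi0, psi1)]
    psi_minus = [(u - w) / math.sqrt(2) for u, w in zip(psi0, psi1)]
    return xs, dx, psi0, psi_plus, psi_minus

def weight_right(xs, dx, psi):
    """Probability of finding the particle at x > 0."""
    return sum(v * v for x, v in zip(xs, psi) if x > 0) * dx

if __name__ == "__main__":
    xs, dx, psi0, psi_plus, psi_minus = toy_doublet()
    for name, psi in (("psi0", psi0), ("psi+", psi_plus), ("psi-", psi_minus)):
        print(name, round(weight_right(xs, dx, psi), 6))
```

The symmetric state carries weight ½ on each side, whereas ψ<sup>+</sup> and ψ<sup>−</sup> sit almost entirely above +*a* and −*a*, respectively.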

A similar but more complicated situation arises in *d* = 2. The role of the pair

$$\left(\psi\_{\hbar}^{(0)} \in H\_0, \psi\_{\hbar}^{(1)} \in H\_1\right)$$

is now played by an infinite tower of unit vectors

$$\left(\psi\_{\hbar}^{(n)} \in H\_n, n \in \mathbb{Z}\right),$$

where ψ<sub>*ħ*</sub><sup>(*n*)</sup> is the lowest energy eigenstate (for *h<sub>ħ</sub>* in (10.305)) in *H<sub>n</sub>* ⊂ *L*<sup>2</sup>(ℝ<sup>2</sup>). The analogue of the states ψ<sub>*ħ*</sub><sup>±</sup> for *d* = 1 involves a limit which heuristically is like

$$\lim\_{N \to \infty} \psi\_{\hbar}^{(N, \theta)} = \frac{1}{\sqrt{2N + 1}} \sum\_{n = -N}^{N} u\_{\theta} \psi\_{\hbar}^{(n)},\tag{10.314}$$

but this limit does not exist in *L*<sup>2</sup>(ℝ<sup>2</sup>). As in §10.1, we instead rely on the technique explained around (10.4), which makes the unit vectors ψ<sub>*ħ*</sub><sup>(*N*,θ)</sup> converge to some probability measure μ<sub>*ħ*</sub><sup>θ</sup> on ℝ<sup>2</sup> as *N* → ∞. In the subsequent limit *ħ* → 0, one obtains a probability measure μ<sub>0</sub><sup>θ</sup> concentrated on a suitable point in the orbit of classical ground states (10.302). Similarly, in the same sense the ground state ψ<sub>*ħ*</sub><sup>(0)</sup> converges to a probability measure supported by all of ℰ<sub>0</sub>.

To the extent that there is a Goldstone Theorem in classical mechanics, it would state that motion in the orbit ℰ<sub>0</sub> is free. That is, at fixed (*r* = *a*, *p<sub>r</sub>* = 0), where *p<sub>r</sub>* is the radial component of momentum, one has an effective Hamiltonian

$$h\_a(p\_\phi, \phi) = \frac{p\_\phi^2}{2ma^2},\tag{10.315}$$

whose time-independent states (*p*<sub>φ</sub> = 0, φ<sub>0</sub>) for arbitrary φ<sub>0</sub> ∈ [0,2π) yield the ground states of the system, and whose "excited states"

$$(p\_\phi(t), \phi(t)) = \left(p\_\phi(0), \phi(0) + \frac{p\_\phi(0)t}{ma^2}\right) \tag{10.316}$$

give motion along the orbit ℰ<sub>0</sub> with effective mass *ma*<sup>2</sup>, whose energy converges to zero as *p*<sub>φ</sub> → 0. However, since massless particles (whose existence is the main conclusion of the usual Goldstone Theorem) are not defined in classical mechanics, we now turn to relativistic field theory (with which we assume some familiarity).
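For completeness, here is how (10.315) and (10.316) arise. Assuming the standard polar form of the kinetic energy and treating *r* = *a*, *p<sub>r</sub>* = 0 as fixed, as in the text, the Hamiltonian reduces to the angular term because *V*(*a*) = ¼λ(*a*<sup>2</sup> − *a*<sup>2</sup>)<sup>2</sup> = 0, and Hamilton's equations integrate at once:

$$
h = \frac{p\_r^2}{2m} + \frac{p\_\phi^2}{2mr^2} + V(r) \;\Rightarrow\; h\_a(p\_\phi, \phi) = \frac{p\_\phi^2}{2ma^2}, \qquad \dot{\phi} = \frac{\partial h\_a}{\partial p\_\phi} = \frac{p\_\phi}{ma^2}, \qquad \dot{p}\_\phi = -\frac{\partial h\_a}{\partial \phi} = 0.
$$

Hence *p*<sub>φ</sub> is conserved and φ grows linearly in time with angular velocity *p*<sub>φ</sub>(0)/*ma*<sup>2</sup>, at an energy that is quadratic in *p*<sub>φ</sub>(0) and so can be made arbitrarily small.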

We now illustrate SSB in classical field theory through a simple example, where the symmetry group is *G* = *SO*(*N*), but we write things down in such a way that the generalization to arbitrary scalar field theories is obvious. Suppose we have *N* real scalar fields ϕ ≡ (ϕ<sub>1</sub>,...,ϕ<sub>*N*</sub>), on which *SO*(*N*) acts in the defining representation on ℝ<sup>*N*</sup>. Following the physics literature, from now on we sum over repeated indices like *i* and μ (*Einstein summation convention*). Let the Lagrangian

$$
\mathcal{L} = \frac{1}{2} \partial\_{\mu} \varphi\_{i} \partial^{\mu} \varphi\_{i} - V(\varphi), \tag{10.317}
$$

contain an *SO*(*N*)-invariant potential *V*, typically of the form (with ϕ<sup>2</sup> ≡ ∑<sub>*i*=1</sub><sup>*N*</sup> ϕ<sub>*i*</sub><sup>2</sup>)

$$V(\boldsymbol{\varphi}) = -\frac{m^2}{2}\boldsymbol{\varphi}^2 + \frac{\lambda}{4}\boldsymbol{\varphi}^4,\tag{10.318}$$

where λ > 0, but *m*<sup>2</sup> may have either sign. If *m*<sup>2</sup> < 0, the minimum of *V* lies at ϕ = 0, but if *m*<sup>2</sup> > 0 the minima form the *SO*(*N*)-orbit through

$$\varphi^c = (v, 0, \cdots, 0);\tag{10.319}$$

$$v \equiv m/\sqrt{\lambda} = \|\varphi^c\|.\tag{10.320}$$

The idea is that the physical fields are excitations of the "vacuum state" ϕ<sup>*c*</sup>, so that, instead of ϕ, as the appropriate "small oscillation" field one should use

$$\chi(x) = \varphi(x) - \varphi^{c}. \tag{10.321}$$

Consequently, the potential is expanded in a Taylor series for small χ as

$$V(\boldsymbol{\varphi}) = V(\boldsymbol{\varphi}^c) + \frac{1}{2}V\_{ij}''\chi\_i\chi\_j + O(\chi^3);\tag{10.322}$$

$$V\_{ij}^{\prime\prime} \equiv \frac{\partial^2 V}{\partial \varphi\_i \partial \varphi\_j} (\varphi^c). \tag{10.323}$$

Note that the linear term vanishes because *V*′(ϕ<sup>*c*</sup>) = 0. We now use the *SO*(*N*)-invariance of *V*, i.e., *V*(*g*ϕ) = *V*(ϕ) for all *g* ∈ *SO*(*N*). For *T<sub>a</sub>* ∈ 𝔤 (i.e. the Lie algebra of *G*, realized by anti-symmetric traceless *N* × *N* matrices) this yields

$$\frac{d}{dt}V(e^{tT\_a}\varphi)\Big|\_{t=0} = 0 \Leftrightarrow \frac{\partial V(\varphi)}{\partial \varphi\_i}(T\_a)\_{ij}\varphi\_j = 0. \tag{10.324}$$

Differentiation with respect to ϕ<sub>*k*</sub> and putting ϕ = ϕ<sup>*c*</sup> then gives

$$V\_{ik}^{\prime\prime}(T\_a)\_{ij} \varphi\_j^c = 0.\tag{10.325}$$

In general, let *H* ⊂ *G* be the stabilizer of ϕ<sup>*c*</sup>, i.e., *g* ∈ *H* iff *g*ϕ<sup>*c*</sup> = ϕ<sup>*c*</sup>. In our example (10.318) - (10.319), we evidently have *H* = *SO*(*N* − 1). Then *T<sub>a</sub>*ϕ<sup>*c*</sup> = 0 for all generators *T<sub>a</sub>* of the Lie algebra 𝔥 of *H*, so that there are

$$M \equiv \dim(G) - \dim(H) = \dim(G/H) = \dim(G \cdot \varphi^c) \tag{10.326}$$

linearly independent null eigenvectors of *V*″ (seen as an *N* × *N* matrix). This number equals the dimension of the submanifold of ℝ<sup>*N*</sup> where *V* assumes its minimum. In our example we have *M* = *N* − 1, since dim(*SO*(*N*)) = ½*N*(*N* − 1). We now perform an affine field redefinition, based on an affine coordinate transformation in ℝ<sup>*N*</sup> that diagonalizes the matrix *V*″. The original (real) fields were ϕ = (ϕ<sub>1</sub>,...,ϕ<sub>*N*</sub>), and the new (real) fields are (χ<sub>1</sub>,θ<sub>2</sub>,···,θ<sub>*N*</sub>), with

$$\chi\_{1} = \varphi\_{1} - v,\tag{10.327}$$

as in (10.321), and the *Goldstone fields* are defined, also in general, by

$$
\theta\_a = \frac{1}{v} \langle T\_a \varphi^c, \varphi \rangle = \frac{1}{v} (T\_a)\_{ij} \varphi^c\_j \varphi\_i. \tag{10.328}
$$

Here ⟨·,·⟩ denotes the inner product in ℝ<sup>*N*</sup>, and we have chosen a basis of 𝔤 in which the elements (*T*<sub>1</sub>,...,*T*<sub>dim(*H*)</sub>) form a basis of 𝔥, completed by *M* further elements (*T*<sub>dim(*H*)+1</sub>,...,*T*<sub>dim(*G*)</sub>), so as to have a basis of 𝔤. The index *a* in (10.328), then, runs from dim(*H*) + 1 to dim(*G*), so that there are *M* Goldstone fields, cf. (10.326). In our running example, this number was shown to be *M* = *N* − 1, and in view of (10.319), the field θ<sub>*a*</sub> = (*T<sub>a</sub>*)<sub>*i*1</sub>ϕ<sub>*i*</sub> is a linear combination of the fields ϕ<sub>2</sub> through ϕ<sub>*N*</sub>.
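The null-eigenvector statement (10.325) and the count *M* = *N* − 1 can be checked numerically for *N* = 3. The sketch below uses a finite-difference Hessian of the potential (10.318) at ϕ<sup>*c*</sup> and the three standard anti-symmetric generators of so(3); the parameter values, step size and function names are illustrative assumptions.

```python
import math

def potential(phi, m2, lam):
    """V(phi) = -(m^2/2) phi^2 + (lambda/4) phi^4, cf. (10.318)."""
    s = sum(c * c for c in phi)
    return -0.5 * m2 * s + 0.25 * lam * s * s

def hessian(f, point, eps=1e-4):
    """Central finite-difference approximation of the matrix of second derivatives."""
    n = len(point)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(di, dj):
                p = list(point)
                p[i] += di
                p[j] += dj
                return f(p)
            result[i][j] = (shifted(eps, eps) - shifted(eps, -eps)
                            - shifted(-eps, eps) + shifted(-eps, -eps)) / (4 * eps * eps)
    return result

def apply(matrix, vector):
    return [sum(row[k] * vector[k] for k in range(len(vector))) for row in matrix]

# Anti-symmetric generators of so(3), acting in the (1,2), (1,3) and (2,3) planes.
GENERATORS = [
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, 0, -1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
]

def broken_directions(phi_c):
    """Vectors T_a phi^c that do not vanish, one per broken generator."""
    images = [apply(t, phi_c) for t in GENERATORS]
    return [w for w in images if any(abs(c) > 1e-12 for c in w)]

if __name__ == "__main__":
    m2, lam = 2.0, 0.5
    v = math.sqrt(m2 / lam)
    phi_c = [v, 0.0, 0.0]
    hess = hessian(lambda p: potential(p, m2, lam), phi_c)
    print("Hessian diagonal:", [round(hess[i][i], 6) for i in range(3)])
    for w in broken_directions(phi_c):
        print("V'' applied to", w, "=", [round(c, 6) for c in apply(hess, w)])
```

Two of the three generators move ϕ<sup>*c*</sup>, the third generates the stabilizer *SO*(2), and the Hessian comes out as diag(2*m*<sup>2</sup>, 0, 0) up to discretization error.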

The simplest example is *N* = 2, with potential (10.318) and *m*<sup>2</sup> > 0. With the single generator *T* = −*i*σ<sub>2</sub>, we obtain θ = ϕ<sub>2</sub>. Since *V*″ = diag(2*m*<sup>2</sup>,0), we see that the mass term −½*m*<sup>2</sup>ϕ<sub>1</sub><sup>2</sup> in (10.318) (with ϕ<sup>2</sup> = ϕ<sub>1</sub><sup>2</sup> + ϕ<sub>2</sub><sup>2</sup>) changes from the "wrong" sign −*m*<sup>2</sup> to the "right" sign +2*m*<sup>2</sup> in (10.322), whilst −½*m*<sup>2</sup>ϕ<sub>2</sub><sup>2</sup> in (10.318) disappears, so that the field θ comes out to be massless. Indeed, this is the point of the introduction of the Goldstone fields: in view of (10.325) and (10.328), the Goldstone fields do not occur in the quadratic term in (10.322) and hence they are massless, in satisfying a field equation of the form ∂<sub>μ</sub>∂<sup>μ</sup>θ<sub>*a*</sub> = ···, where ··· does not contain any term linear in any field. This proves the *classical Goldstone Theorem*:

Theorem 10.26. *Suppose that a compact Lie group G* ⊂ *SO*(*N*) *acts on N real scalar fields* ϕ = (ϕ<sub>1</sub>, ..., ϕ<sub>*N*</sub>)*, leaving the potential V in the Lagrangian* (10.317) *invariant. If G is spontaneously broken to an unbroken subgroup H* ⊂ *G (in the sense that the stability group of some point* ϕ<sup>*c*</sup> *in the G-orbit minimizing V is H), then there are at least* dim(*G*/*H*) *massless fields, i.e., there is a field transformation*

$$(\varphi\_1, \dots, \varphi\_N) \mapsto (\chi\_1, \dots, \chi\_{N-M}, \theta\_1, \dots, \theta\_M) \quad (M = \dim(G) - \dim(H)), \tag{10.329}$$

*that is invertible in a neighborhood of* ϕ = ϕ<sup>*c*</sup>*, such that the potential V*(ϕ)*, reexpressed in the fields* χ *and* θ*, has no quadratic terms in* θ*.*

The local invertibility of the field redefinition around ϕ<sup>*c*</sup> ≠ 0 is crucial; in our example, where χ ≡ χ<sub>1</sub> = ϕ<sub>1</sub> − *v* and θ<sub>*a*</sub> = (*T*<sub>*a*</sub>)<sub>*i*1</sub>ϕ<sub>*i*</sub>, this may be checked explicitly.
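The *N* = 2 example can also be checked numerically. The following is a minimal sketch, assuming (since (10.318) is not restated here) that the potential has the standard quartic form *V*(ϕ) = −½*m*<sup>2</sup>‖ϕ‖<sup>2</sup> + ¼λ‖ϕ‖<sup>4</sup> with λ > 0, so that *v* = *m*/√λ. The function names and the values of `m` and `lam` are illustrative choices, not taken from the text.

```python
import numpy as np

# Assumed quartic potential: V(phi) = -1/2 m^2 |phi|^2 + 1/4 lam |phi|^4.
# The form of V and the values of m, lam are illustrative assumptions.
def potential(phi, m=1.3, lam=0.7):
    r2 = float(np.dot(phi, phi))
    return -0.5 * m**2 * r2 + 0.25 * lam * r2**2

def hessian(f, x, h=1e-4):
    """Central finite-difference Hessian of a scalar function f at the point x."""
    n = len(x)
    hess = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n); ei[i] = h
            ej = np.zeros(n); ej[j] = h
            hess[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                          - f(x - ei + ej) + f(x - ei - ej)) / (4 * h**2)
    return hess

m, lam = 1.3, 0.7
v = m / np.sqrt(lam)            # radius of the circle of minima
phi_c = np.array([v, 0.0])      # the chosen minimum (v, 0)
mass_matrix = hessian(lambda p: potential(p, m, lam), phi_c)
# Up to discretization error this is diag(2 m^2, 0):
# the chi direction is massive, the theta direction is massless.
```

Under these assumptions the zero eigenvalue belongs to the direction *T*ϕ<sup>*c*</sup> tangent to the orbit of minima, which is the Goldstone direction of Theorem 10.26.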

An alternative proof of Theorem 10.26 uses nonlinear Goldstone fields, viz.

$$\varphi(x) = e^{\frac{1}{v}\theta\_{a}(x)T\_{a}}(\varphi^{c} + \chi(x)),\tag{10.330}$$

where the sum over *a* (implicit in the Einstein summation convention) ranges from 1 to *M*, *v* = ‖ϕ<sup>*c*</sup>‖, and the fields χ = (χ<sub>1</sub>, ..., χ<sub>*N*−*M*</sub>) are chosen orthogonal (in ℝ<sup>*N*</sup>) to each *T*<sub>*a*</sub>ϕ<sup>*c*</sup>, *a* = 1, ..., *M*, and hence to the θ<sub>*a*</sub>. Provided that the generators of *SO*(*N*) (and hence of *G* ⊂ *SO*(*N*)) have been chosen such that

$$
\langle T\_a \varphi^c, T\_b \varphi^c \rangle = v^2 \delta\_{ab}, \tag{10.331}
$$

the fields θ<sub>*a*</sub> defined by (10.330) coincide with the fields in (10.328) up to quadratic terms in χ and θ; to see this, expand the exponential and also use the fact that both ⟨*T*<sub>*a*</sub>ϕ<sup>*c*</sup>, ϕ<sup>*c*</sup>⟩ and ⟨*T*<sub>*a*</sub>ϕ<sup>*c*</sup>, χ⟩ vanish. This transformation is only well defined if *v* ≠ 0, i.e., if SSB from *G* to *H* occurs, and its existence implies the Goldstone Theorem 10.26, for by (10.330) and *G*-invariance, *V*(ϕ) is independent of θ.

The Goldstone Theorem can be derived in quantum field theory, but in the spirit of this chapter we will discuss it rigorously for quantum spin systems. Far from considering the most general case, we merely treat the simplest setting. We assume that *A* is a quasi-local C\*-algebra given by (8.130), with *H* = ℂ<sup>*n*</sup>. Furthermore:


$$
\gamma\_{g} \circ \alpha\_{(x,t)} = \alpha\_{(x,t)} \circ \gamma\_{g} \quad ((x,t) \in \mathbb{Z}^d \times \mathbb{R},\ g \in G). \tag{10.332}
$$


$$G\_a = \{ \exp(sT\_a), s \in \mathbb{R}, T\_a \in \mathfrak{g} \}. \tag{10.333}$$

5. There is an *n*-tuple ϕ = (ϕ<sub>1</sub>, ..., ϕ<sub>*n*</sub>) of local operators ϕ<sub>α</sub> ∈ *M*<sub>*n*</sub>(ℂ) that transforms under *G* by ϕ ↦ *u*<sub>*g*</sub>ϕ*u*<sub>*g*</sub><sup>∗</sup> = γ<sub>*g*</sub>(ϕ), and defines an order parameter φ<sub>*a*</sub> by

$$\phi\_a = \delta\_a \varphi \equiv \frac{d}{ds} \left( \gamma\_{\exp(sT\_a)}(\varphi) \right)\_{|s=0},\tag{10.334}$$

at least for SSB of *G*<sub>*a*</sub> (as above) in that, cf. Definition 10.6,

$$
\omega(\delta\_a \varphi) \neq 0.\tag{10.335}
$$

6. Writing *j*<sub>*a*</sub><sup>0</sup> = *iu*′(*T*<sub>*a*</sub>) ∈ *M*<sub>*n*</sub>(ℂ), it follows that δ<sub>*a*</sub>ϕ = −*i*[*j*<sub>*a*</sub><sup>0</sup>, ϕ], and hence that

$$\delta\_a \varphi(\mathbf{x}) = -i \lim\_{\Lambda \nearrow \mathbb{Z}^d} \sum\_{\mathbf{y} \in \Lambda} [j\_a^0(\mathbf{y}), \varphi(\mathbf{x})] \quad (\mathbf{x} \in \mathbb{Z}^d), \tag{10.336}$$

since by (8.132) (i.e., Einstein locality) only the term *y* = *x* will contribute. Physicists then wish to define a charge by *Q*<sub>*a*</sub> = ∑<sub>*y*∈ℤ<sup>*d*</sup></sub> *j*<sub>*a*</sub><sup>0</sup>(*y*) and write (10.336) as δ<sub>*a*</sub>ϕ(*x*) = −*i*[*Q*<sub>*a*</sub>, ϕ(*x*)], but *Q*<sub>*a*</sub> does not exist precisely in the case of SSB!

Eq. (10.336) motivates the crucial assumption for the Goldstone Theorem, viz.

$$\omega(\delta\_a \varphi(\mathbf{x}, t)) = -i \lim\_{\Lambda \nearrow \mathbb{Z}^d} \sum\_{\mathbf{y} \in \Lambda} \omega([j\_a^0(\mathbf{y}), \varphi(\mathbf{x}, t)]) \quad (\mathbf{x} \in \mathbb{Z}^d,\ t \in \mathbb{R}), \tag{10.337}$$

which incorporates the condition that the sum over *y* converge absolutely. Although (10.337) at first sight *softens* (10.336) in turning an operator equation into a numerical one, in fact (10.337) decisively *sharpens* (10.336) by involving the time-dependence of ϕ, whose propagation speed should be small enough to enable the limit in (10.336) to catch up with the limit in (10.337). As such, eq. (10.337) is satisfied with short-range forces, but the Meissner effect in superconductivity and the closely related Higgs mechanism in gauge theories (both of which circumvent the Goldstone Theorem) are possible precisely because in those cases (10.337) fails (at least in physical gauges, see also §10.10).

7. Finally, we make two assumptions just for convenience, namely

$$
\varphi\_{\alpha}(\mathbf{x})^\* = \varphi\_{\alpha}(\mathbf{x});\tag{10.338}
$$

$$
\omega(\varphi\_{\alpha}(\mathbf{x})) = 0.\tag{10.339}
$$

If these do not hold, one could simply take real and imaginary components of ϕ<sub>α</sub> and/or redefine ϕ<sub>α</sub> as ϕ̃<sub>α</sub> = ϕ<sub>α</sub> − ω(ϕ<sub>α</sub>)·1<sub>*A*</sub>, so that ω(ϕ̃<sub>α</sub>(*x*)) = 0.

The Goldstone Theorem provides information about the joint energy-momentum spectrum of the theory at hand. To define this notion, we exploit the fact that from assumption no. (3) and Corollary 9.12 we obtain a unitary representation *u*<sub>ω</sub> of the (locally compact) abelian space-time translation group *A* = ℤ<sup>*d*</sup> × ℝ on the GNS-representation space *H*<sub>ω</sub> induced by ω. The SNAG-Theorem C.114 applied to *A*, with dual *Â* = 𝕋<sup>*d*</sup> × ℝ (cf. Proposition C.108), then yields a projection-valued measure

$$e\_{\omega} : \mathcal{B}(\mathbb{R} \times \mathbb{T}^d) \to \mathcal{P}(H\_{\omega}),\tag{10.340}$$

as a map from the Borel sets in ℝ × 𝕋<sup>*d*</sup> to the projection lattice in *B*(*H*<sub>ω</sub>), such that

$$1\_{H\_{\omega}} = \int\_{\mathbb{T}^d} \int\_0^\infty de(E, k);\tag{10.341}$$

$$u\_{\omega}(\mathbf{y},t) = \int\_{\mathbb{T}^d} \int\_0^\infty de(E,k) \, e^{i(Et-\mathbf{y}\cdot\mathbf{k})} \quad (\mathbf{y} \in \mathbb{Z}^d,\ t \in \mathbb{R}).\tag{10.342}$$

Here *k* = (*k*<sub>1</sub>, ..., *k*<sub>*d*</sub>), *y*·*k* = ∑<sub>*i*=1</sub><sup>*d*</sup> *y*<sub>*i*</sub>*k*<sub>*i*</sub>, and we have reduced the integration range over *E* (which *a priori* would be ℝ) to ℝ<sup>+</sup>. Indeed, by Stone's Theorem we have *u*<sub>ω</sub>(*t*) = exp(*ith*<sub>ω</sub>), where σ(*h*<sub>ω</sub>) ⊂ [0,∞) because ω is a ground state by assumption, and the support of *e* is evidently contained in 𝕋<sup>*d*</sup> × σ(*h*<sub>ω</sub>) (cf. Definition A.16).

Definition 10.27. *The* joint energy-momentum spectrum σ(*h*<sub>ω</sub>, *p*<sub>ω</sub>) *of a spacetime invariant state* ω *(i.e.,* ω ∘ α<sub>(*x*,*t*)</sub> = ω*,* (*x*,*t*) ∈ ℤ<sup>*d*</sup> × ℝ*) is the support of the projection-valued measure e*<sub>ω</sub> *associated to the* GNS*-representation* π<sub>ω</sub>*, i.e., the smallest closed set* σ(*h*<sub>ω</sub>, *p*<sub>ω</sub>) ⊂ 𝕋<sup>*d*</sup> × ℝ *such that e*((𝕋<sup>*d*</sup> × ℝ) ∖ σ(*h*<sub>ω</sub>, *p*<sub>ω</sub>)) = 0*.*

The notation σ(*h*<sub>ω</sub>, *p*<sub>ω</sub>) is purely symbolic here, since (as opposed to the continuum case) the group ℤ<sup>*d*</sup> of spatial translations is discrete and hence has no generators *p*<sub>ω</sub>.

Since *u*<sub>ω</sub>(*x*,*t*)Ω<sub>ω</sub> = Ω<sub>ω</sub>, the origin (0,0) certainly lies in σ(*h*<sub>ω</sub>, *p*<sub>ω</sub>), with

$$e\_{\omega}(0,0) = |\Omega\_{\omega}\rangle\langle\Omega\_{\omega}|,\tag{10.343}$$

which by Theorem 9.14 is the unique ℤ<sup>*d*</sup> × ℝ-invariant state in *H*<sub>ω</sub>. Denoting this contribution to *e*<sub>ω</sub> by *e*<sub>ω</sub><sup>(0)</sup>, in many physical theories one has *e*<sub>ω</sub> = *e*<sub>ω</sub><sup>(0)</sup> + *e*<sub>ω</sub><sup>(1)</sup> + ···, where *e*<sub>ω</sub><sup>(1)</sup> is supported on the graph of some continuous function *k* ↦ ε<sub>*k*</sub> ≥ 0, i.e.,

$$\{(k, \varepsilon\_k), k \in \mathbb{T}^d\} \subset \sigma(h\_{\omega}, p\_{\omega}) \subset \mathbb{T}^d \times \mathbb{R}.\tag{10.344}$$

The joint energy-momentum spectrum may be studied in part by considering

$$\begin{split} f(\varepsilon,p) &= \sum\_{\mathbf{y} \in \mathbb{Z}^d} \int\_{-\infty}^{\infty} dt \, e^{-i\varepsilon t + ip\cdot(\mathbf{x}-\mathbf{y})} \, \omega([j\_{a}^0(\mathbf{y}), \varphi\_{\alpha}(\mathbf{x},t)]) \\ &= 2i \sum\_{\mathbf{y} \in \mathbb{Z}^d} \int\_{-\infty}^{\infty} dt \, e^{-i\varepsilon t + ip\cdot(\mathbf{x}-\mathbf{y})} \, \mathrm{Im} \langle \Omega\_{\omega}, \pi\_{\omega}(j\_{a}^0(0)) e^{ith\_{\omega}} u\_{\omega}(\mathbf{y}) \pi\_{\omega}(\varphi\_{\alpha}(0)) \Omega\_{\omega} \rangle \\ &= \int\_{\mathbb{T}^d} \int\_0^{\infty} \left( \langle \Omega\_{\omega}, \pi\_{\omega}(j\_{a}^0(0)) \, de\_{\omega}(E,k) \, \pi\_{\omega}(\varphi\_{\alpha}(0)) \Omega\_{\omega} \rangle \delta(\varepsilon - E) \delta(p - k) \right. \\ & \quad \left. - \langle \Omega\_{\omega}, \pi\_{\omega}(\varphi\_{\alpha}(0)) \, de\_{\omega}(E,k) \, \pi\_{\omega}(j\_{a}^0(0)) \Omega\_{\omega} \rangle \delta(\varepsilon + E) \delta(p + k) \right), \end{split} \tag{10.345}$$

i.e., the Fourier transform of the two-point function defined by *j*<sub>*a*</sub><sup>0</sup> and ϕ, which is a distribution on the dual group 𝕋<sup>*d*</sup> × ℝ; for the third equality we used a distributional version of the Fourier inversion formula (C.382). For example, if we replace *e*<sub>ω</sub>(*E*, *k*) by *e*<sub>ω</sub><sup>(1)</sup>(*E*, *k*), then, since *e*<sub>ω</sub><sup>(1)</sup> is absolutely continuous with respect to Haar measure *d*<sup>*d*</sup>*k* on 𝕋<sup>*d*</sup>, we see that *f*(ε, *p*) is proportional to δ(ε − ε<sub>*p*</sub>).

Theorem 10.28. *Under assumptions 1–7 (notably* (10.337) *and* SSB *of some continuous symmetry), the Hamiltonian h*<sub>ω</sub> *has continuous spectrum starting at zero and hence has no gap. If there is an excitation spectrum e*<sub>ω</sub><sup>(1)</sup> *as explained above, with*

$$
\int \langle \Omega\_{\omega}, \pi\_{\omega}(j\_a^0(0)) \, de\_{\omega}^{(1)}(E, k) \, \pi\_{\omega}(\varphi\_{\alpha}(0)) \Omega\_{\omega} \rangle \neq 0,\tag{10.346}
$$

*then the continuous function k* ↦ ε<sub>*k*</sub> *defining the spectrum satisfies* ε<sub>0</sub> = 0*.*

*Proof.* Since the sum in (10.337) converges absolutely, the Fourier transform *f̌*(*t*, *p*) of *y* ↦ ω([*j*<sub>*a*</sub><sup>0</sup>(*y*), ϕ(*x*,*t*)]) in *y* alone is continuous in *p*, and by (10.337) we have

$$
i\omega(\delta\_a \varphi(x,t)) = \check{f}(t,0). \tag{10.347}
$$

By (10.332), the left-hand side is independent of *x* and *t*, hence the Fourier transform *f*(ε,0) of the right-hand side in *t* is proportional to δ(ε). Since (10.343) does not contribute to *f* by (10.339), the calculation (10.345) shows that *f*(ε,0) = 0 if σ(*h*<sub>ω</sub>) has a gap. But *f*(ε,0) ≠ 0 by (10.335), and so σ(*h*<sub>ω</sub>) has no gap. Similarly, for the final claim note that *f*(ε,0) ∼ δ(ε − ε<sub>0</sub>) as well as *f*(ε,0) ∼ δ(ε). □
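A gapless branch of the kind described in Theorem 10.28 can be seen in a standard textbook model that is not taken from the text: the spin-½ Heisenberg ferromagnet on a ring of *L* sites, with assumed Hamiltonian *H* = −*J* ∑<sub>*j*</sub> (**S**<sub>*j*</sub>·**S**<sub>*j*+1</sub> − ¼), *J* > 0. In the sector with one flipped spin, *H* acts as the *L* × *L* matrix with *J* on the diagonal and −*J*/2 between neighbouring sites, whose eigenvalues are ε<sub>*k*</sub> = *J*(1 − cos *k*) with *k* ∈ (2π/*L*)ℤ. The sketch below (function names are illustrative) compares numerical diagonalization with this formula and records how the smallest nonzero excitation energy shrinks as *L* grows.

```python
import numpy as np

def one_magnon_matrix(L, J=1.0):
    """One-magnon block of the assumed spin-1/2 Heisenberg ferromagnet on a ring of L sites."""
    h = J * np.eye(L)
    for j in range(L):
        h[j, (j + 1) % L] -= J / 2   # flipped spin hops one site to the right
        h[j, (j - 1) % L] -= J / 2   # flipped spin hops one site to the left
    return h

def lowest_excitations(L, J=1.0):
    """Sorted eigenvalues of the one-magnon block."""
    return np.sort(np.linalg.eigvalsh(one_magnon_matrix(L, J)))

J = 1.0
gaps = {}
for L in (8, 32, 128):
    energies = lowest_excitations(L, J)
    k = 2 * np.pi * np.arange(L) / L
    exact = np.sort(J * (1 - np.cos(k)))   # epsilon_k = J (1 - cos k)
    assert np.allclose(energies, exact, atol=1e-10)
    gaps[L] = energies[1]                   # smallest nonzero excitation energy
```

In this model the branch satisfies ε<sub>0</sub> = 0, and `gaps[L]` equals *J*(1 − cos(2π/*L*)), which tends to zero as *L* → ∞, in line with the absence of a gap in the infinite system.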

#### 10.10 The Higgs mechanism

We proceed to a discussion of SSB in gauge theories, especially with an eye on the Higgs Mechanism, which plays a central role in the Standard Model of high-energy physics (whose empirical confirmation was more or less finished with the discovery of the Higgs boson at CERN, announced on July 4, 2012).

We look at the *Abelian Higgs Model*, given by the Lagrangian

$$\mathcal{L} = -\frac{1}{4} F\_A^2 + \frac{1}{2} \langle D\_\mu^A \varphi, D\_\mu^A \varphi \rangle - V(\varphi), \tag{10.348}$$

where ϕ = (ϕ<sub>1</sub>, ϕ<sub>2</sub>) is a scalar doublet, the usual electromagnetic field strength is

$$F\_{\mu\nu} = \partial\_{\mu}A\_{\nu} - \partial\_{\nu}A\_{\mu},\tag{10.349}$$

in terms of which *F*<sub>*A*</sub><sup>2</sup> = *F*<sub>μν</sub>*F*<sup>μν</sup>, and the covariant derivative is

$$D^A\_{\mu} \equiv \partial\_{\mu} - eA\_{\mu} \cdot T = \partial\_{\mu} \cdot 1\_2 + ieA\_{\mu} \cdot \sigma\_2. \tag{10.350}$$

Here *e* is some coupling constant, identified with the unit of electrical charge. We still assume that *V* only depends on ϕ<sup>2</sup> = ⟨ϕ,ϕ⟩ and hence is *SO*(2)-invariant.

The novel situation compared to (10.317) and the like is that, whereas (10.317) is invariant under *global SO*(2) transformations, the Lagrangian (10.348) is invariant under *local SO*(2) *gauge transformations* that depend on *x*, namely

$$\varphi(\mathbf{x}) \mapsto e^{\alpha(\mathbf{x}) \cdot T} \varphi(\mathbf{x}) = \begin{pmatrix} \cos \alpha(\mathbf{x}) & -\sin \alpha(\mathbf{x}) \\ \sin \alpha(\mathbf{x}) & \cos \alpha(\mathbf{x}) \end{pmatrix} \cdot \begin{pmatrix} \varphi\_1(\mathbf{x}) \\ \varphi\_2(\mathbf{x}) \end{pmatrix}; \tag{10.351}$$

$$A\_{\mu}(\mathbf{x}) \mapsto A\_{\mu}(\mathbf{x}) + \frac{1}{e} \partial\_{\mu} \alpha(\mathbf{x}). \tag{10.352}$$

We say that the *local gauge group* 𝒢 = *C*<sup>∞</sup>(ℝ<sup>*d*</sup>, *U*(1)) acts on the space of fields (*A*, ϕ) by (10.351) - (10.352). Now suppose *V* has a minimum at some constant value ϕ<sup>*c*</sup> ≠ 0. In that case, any field configuration

$$\boldsymbol{\varphi}(\mathbf{x}) = \exp(\boldsymbol{\alpha}(\mathbf{x}) \cdot \boldsymbol{T}) \boldsymbol{\varphi}^c;\tag{10.353}$$

$$A\_{\mu}(\mathbf{x}) = (1/e)\partial\_{\mu}\alpha(\mathbf{x}) \quad (\alpha \in \mathcal{G}),\tag{10.354}$$

minimizes the action. Hence the possible "vacua" of the model comprise the (infinite-dimensional) orbit 𝒱 of the gauge group through (*A* = 0, ϕ = ϕ<sup>*c*</sup>). Note that *D*<sub>μ</sub><sup>*A*</sup>ϕ = 0 for (*A*, ϕ) ∈ 𝒱, i.e., ϕ is *covariantly* constant along the vacuum orbit (whereas for global symmetries it is constant full stop). Relative to the (arbitrary) choice (0, ϕ<sup>*c*</sup>) ∈ 𝒱, we then introduce real fields χ and θ, called the *Higgs field* and the *would-be Goldstone boson*, respectively, by (10.330), which now simply reads

$$
\begin{pmatrix} \varphi\_1(\mathbf{x}) \\ \varphi\_2(\mathbf{x}) \end{pmatrix} = e^{\frac{1}{v}\theta(\mathbf{x}) \cdot T} \cdot \begin{pmatrix} v + \chi(\mathbf{x}) \\ 0 \end{pmatrix} . \tag{10.355}
$$

After this redefinition of the scalar fields, the Lagrangian (10.348) becomes

$$\mathcal{L} = -\frac{1}{4}F\_B^2 + \frac{1}{2}\partial\_\mu \chi \partial^\mu \chi + \frac{1}{2}e^2(v + \chi)^2 B\_\mu B^\mu - V(v + \chi, 0), \tag{10.356}$$

where *B*<sub>μ</sub> = *A*<sub>μ</sub> − (1/*ev*)∂<sub>μ</sub>θ, and *F*<sub>*B*</sub><sup>2</sup> = *F*<sub>μν</sub>*F*<sup>μν</sup> for *F*<sub>μν</sub> = ∂<sub>μ</sub>*B*<sub>ν</sub> − ∂<sub>ν</sub>*B*<sub>μ</sub>. This describes a vector boson *B* with mass term ½*m*<sub>*B*</sub><sup>2</sup>*B*<sub>μ</sub>*B*<sup>μ</sup>, with *m*<sub>*B*</sub><sup>2</sup> = *e*<sup>2</sup>*v*<sup>2</sup> > 0, as read off from the term ½*e*<sup>2</sup>(*v* + χ)<sup>2</sup>*B*<sub>μ</sub>*B*<sup>μ</sup> in (10.356) (as opposed to the massless vector field *A*), and a scalar field χ with mass term ½*m*<sub>χ</sub><sup>2</sup>χ<sup>2</sup>, with *m*<sub>χ</sub><sup>2</sup> = (∂<sup>2</sup>*V*/∂ϕ<sub>1</sub><sup>2</sup>)|<sub>(*v*,0)</sub> > 0 (since *V* supposedly has a minimum at ϕ<sup>*c*</sup> = (*v*, 0)).
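The step from (10.348) to (10.356) can be made explicit as follows. This is a sketch under the conventions stated above (*T* = −*i*σ<sub>2</sub>, so that *T* is antisymmetric and commutes with exp(θ*T*/*v*)); the unit vectors *e*<sub>1</sub> = (1,0) and *e*<sub>2</sub> = *Te*<sub>1</sub> = (0,1) are notation introduced here for convenience only.

$$
\begin{aligned}
D^A\_\mu \varphi &= e^{\frac{1}{v}\theta T}\left[\partial\_\mu\chi\, e\_1 + (v+\chi)\left(\tfrac{1}{v}\partial\_\mu\theta - eA\_\mu\right) T e\_1\right] = e^{\frac{1}{v}\theta T}\left[\partial\_\mu\chi\, e\_1 - e(v+\chi)B\_\mu\, e\_2\right], \\
\tfrac{1}{2}\langle D^A\_\mu\varphi, D^A\_\mu\varphi\rangle &= \tfrac{1}{2}\partial\_\mu\chi\,\partial^\mu\chi + \tfrac{1}{2}e^2(v+\chi)^2 B\_\mu B^\mu .
\end{aligned}
$$

The second line uses that exp(θ*T*/*v*) is orthogonal and that *e*<sub>1</sub> and *e*<sub>2</sub> are orthonormal. Moreover, the field strengths of *A* and *B* coincide because ∂<sub>μ</sub>∂<sub>ν</sub>θ is symmetric in μ and ν, and *V*(ϕ) = *V*(*v* + χ, 0) by *SO*(2)-invariance; together these give (10.356).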

This is the *Higgs mechanism*: the gauge field becomes massive, whilst the massless ("would-be") Goldstone boson disappears from the theory: it is (allegedly) "eaten" by the gauge field. Thus the scalar degree of freedom θ that seems lost is recovered as the longitudinal component of the massive vector field (which for a gauge field would have been an unphysical gauge degree of freedom, see below).

In the description just given, the Higgs mechanism in classical field theory is seen as a consequence of SSB. Remarkably, there is an alternative account of the Higgs mechanism, according to which it has nothing to do with SSB! Namely, we now perform a field redefinition analogous to (10.355) etc. straight away, viz.

$$
\begin{pmatrix} \varphi\_1(\mathbf{x}) \\ \varphi\_2(\mathbf{x}) \end{pmatrix} = e^{\theta(\mathbf{x}) \cdot T} \cdot \begin{pmatrix} \rho(\mathbf{x}) \\ 0 \end{pmatrix}; \tag{10.357}
$$

$$A\_{\mu} = B\_{\mu} + (1/e)\partial\_{\mu}\theta. \tag{10.358}$$

This transformation is defined and invertible in a neighbourhood of any point (ρ<sub>0</sub>, θ<sub>0</sub>, *B*<sub>0</sub>), where ρ<sub>0</sub> > 0, θ<sub>0</sub> ∈ (−π, π), and *B*<sub>0</sub> is arbitrary. Of these new fields, ρ and *B* are gauge-invariant: for the gauge transformation (10.351) becomes

$$
\theta(\mathbf{x}) \mapsto \theta(\mathbf{x}) + \alpha(\mathbf{x});\tag{10.359}
$$

$$
\rho(\mathbf{x}) \mapsto \rho(\mathbf{x}),
\tag{10.360}
$$

and in view of (10.352), *B* does not transform at all. The Lagrangian becomes

$$\mathcal{L} = -\frac{1}{4} F\_B^2 + \frac{1}{2} \partial\_{\mu} \rho \, \partial^{\mu} \rho + \frac{1}{2} e^2 \rho^2 B\_{\mu} B^{\mu} - V(\rho), \tag{10.361}$$

with *V*(ρ) ≡ *V*(ρ, 0). This is a Lagrangian without any internal symmetries at all (not even ℤ<sub>2</sub>, since ρ > 0), but of course one can still look for classical vacua that minimize the energy and hence the potential *V*(ρ). If ρ = 0 is the absolute minimum, then the above field redefinition is *a fortiori* invalidated, but if *V*′(*v*) = 0 for some *v* > 0, we proceed as before, introducing a Higgs field χ(*x*) = ρ(*x*) − *v*, and recovering the Lagrangian (10.356). This once again leads to the Higgs mechanism.
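Two small checks may help here; both are sketches, and the quartic potential in the second one is an assumption made only for illustration (the text leaves *V* general). First, *B* is gauge invariant because the shifts (10.352) and (10.359) cancel:

$$B\_\mu = A\_\mu - \tfrac{1}{e}\partial\_\mu\theta \;\mapsto\; A\_\mu + \tfrac{1}{e}\partial\_\mu\alpha - \tfrac{1}{e}\partial\_\mu(\theta + \alpha) = B\_\mu .$$

Second, for the assumed potential *V*(ρ) = −½*m*<sup>2</sup>ρ<sup>2</sup> + ¼λρ<sup>4</sup> with λ > 0, expanding (10.361) around ρ = *v* gives

$$V'(v) = -m^2 v + \lambda v^3 = 0 \;\Rightarrow\; v = m/\sqrt{\lambda}, \qquad m\_\chi^2 = V''(v) = -m^2 + 3\lambda v^2 = 2m^2, \qquad m\_B^2 = e^2 v^2 = e^2 m^2/\lambda ,$$

where the vector boson mass is read off from the term ½*e*<sup>2</sup>ρ<sup>2</sup>*B*<sub>μ</sub>*B*<sup>μ</sup> in (10.361) at ρ = *v*.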

This can be generalized to the nonabelian case; since it suffices to explain the idea, we just discuss the *SU*(2) case. In (10.348), the scalar field ϕ = (ϕ<sub>1</sub>, ϕ<sub>2</sub>) is now complex, forming an *SU*(2) doublet, the brackets ⟨·,·⟩ now denote the inner product in ℂ<sup>2</sup>, the nonabelian gauge field is *A* = *A*<sub>*a*</sub>σ<sub>*a*</sub> (where the Pauli matrices σ<sub>*a*</sub>, *a* = 1, 2, 3, form a self-adjoint basis of the Lie algebra of *SU*(2)), with associated field strength *F*<sub>μν</sub> = ∂<sub>μ</sub>*A*<sub>ν</sub> − ∂<sub>ν</sub>*A*<sub>μ</sub> + *g*[*A*<sub>μ</sub>, *A*<sub>ν</sub>] and covariant derivative *D*<sub>μ</sub><sup>*A*</sup> = ∂<sub>μ</sub> + *igA*<sub>μ</sub>.

With *F*<sub>*A*</sub><sup>2</sup> = *F*<sup>*a*</sup><sub>μν</sub>*F*<sup>μν</sup><sub>*a*</sub>, the Lagrangian (10.348) is invariant under the transformations

$$
\varphi(\mathbf{x}) \mapsto e^{i\alpha\_{a}(\mathbf{x})\sigma\_{a}} \varphi(\mathbf{x});\tag{10.362}
$$

$$A\_{\mu}(\mathbf{x}) \mapsto e^{i\alpha\_{a}(\mathbf{x})\sigma\_{a}}(A\_{\mu}(\mathbf{x}) - (i/g)\partial\_{\mu})e^{-i\alpha\_{a}(\mathbf{x})\sigma\_{a}}.\tag{10.363}$$

The definition of the gauge-invariant fields *B* and ρ à la (10.357) - (10.358) is now

$$
\begin{pmatrix} \varphi\_1(\mathbf{x}) \\ \varphi\_2(\mathbf{x}) \end{pmatrix} = e^{i\theta\_a(\mathbf{x}) \cdot \sigma\_a} \cdot \begin{pmatrix} \rho(\mathbf{x}) \\ 0 \end{pmatrix};\tag{10.364}
$$

$$A\_{\mu}(\mathbf{x}) = e^{i\theta\_{a}(\mathbf{x})\sigma\_{a}} (B\_{\mu}(\mathbf{x}) - (i/g)\partial\_{\mu})e^{-i\theta\_{a}(\mathbf{x})\sigma\_{a}},\tag{10.365}$$

which leads, *mutatis mutandis*, to the very same Lagrangian (10.361).

As a compromise between these two derivations of the Higgs mechanism, it is also possible to fix the gauge by picking the representative (ϕ, *A*) in each 𝒢-orbit for which ϕ<sub>2</sub>(*x*) = 0 and ϕ<sub>1</sub>(*x*) > 0; note that this so-called *unitary gauge* is ill-defined if ϕ<sub>1</sub>(*x*) = 0. Calling this unique representative (ρ, *B*), we are again led to (10.361).

Gauge field theories are *constrained systems*, in which the *apparent* degrees of freedom in the Lagrangian are not the *physical* ones. For free electromagnetism, the Lagrangian is ℒ(*A*) = −¼*F*<sub>μν</sub>*F*<sup>μν</sup>, with *F*<sub>μν</sub> = ∂<sub>μ</sub>*A*<sub>ν</sub> − ∂<sub>ν</sub>*A*<sub>μ</sub>. In terms of the gauge-invariant fields *E*<sub>*i*</sub> = *F*<sub>*i*0</sub> = ∂<sub>*i*</sub>*A*<sub>0</sub> − ∂<sub>0</sub>*A*<sub>*i*</sub> and **B** = ∇ × **A**, Maxwell's equations

$$\nabla \cdot \mathbf{E} = 0;\tag{10.366}$$

$$
\partial \mathbf{E}/\partial t = \nabla \times \mathbf{B};\tag{10.367}
$$

$$\frac{\partial \mathbf{B}}{\partial t} = -\nabla \times \mathbf{E};\tag{10.368}$$

$$\nabla \cdot \mathbf{B} = 0,\tag{10.369}$$

then arise as follows: eqs. (10.366) and (10.367) correspond to the Euler–Lagrange equation for *A*<sub>0</sub> and *A*<sub>*i*</sub>, respectively, whereas (10.368) and (10.369) immediately follow from the definitions of **B** and **E** in terms of *A*. The Maxwell equations are in Hamiltonian form, with canonical momenta Π<sub>μ</sub> = ∂ℒ/∂*Ȧ*<sub>μ</sub>; this yields Π<sub>*i*</sub> = −*E*<sub>*i*</sub>, as well as the *primary constraint* Π<sub>0</sub> = 0. Nonetheless, the canonical Hamiltonian

$$h = \int d^3 \mathbf{x} \left( \Pi\_{\mu}(\mathbf{x}) \dot{A}\_{\mu}(\mathbf{x}) - \mathcal{L}(\mathbf{x}) \right) = \int d^3 \mathbf{x} \left( \frac{1}{2} \mathbf{E}^2(\mathbf{x}) + \frac{1}{2} \mathbf{B}^2(\mathbf{x}) - A\_0(\mathbf{x}) \nabla \cdot \mathbf{E}(\mathbf{x}) \right),$$

is well defined. In the Hamiltonian formalism, Gauss' Law resurfaces as the *secondary constraint* stating that the primary constraint be preserved in time, viz.

$$\dot{\Pi}\_0(\mathbf{x}) = -\frac{\delta h}{\delta A\_0(\mathbf{x})} = \nabla \cdot \mathbf{E}(\mathbf{x}) \equiv 0. \tag{10.370}$$

Since

$$\frac{d}{dt}\nabla \cdot \mathbf{E}(\mathbf{x}) = -\partial\_i(\delta h/\delta A\_i(\mathbf{x})) = -\partial\_i(\Delta A\_i - \partial\_i \nabla \cdot \mathbf{A}) = 0,\tag{10.371}$$

there are no "tertiary" constraints. Thus we have canonical phase space variables (**E**, **A**) and (Π<sub>0</sub>, *A*<sub>0</sub>), subject to (10.366) and to Π<sub>0</sub>(*x*) = 0 for each *x* ∈ ℝ<sup>3</sup>, i.e.,

$$\Pi\_0(\lambda\_0) \equiv \int d^3x \,\Pi\_0(\mathbf{x})\lambda\_0(\mathbf{x}) = 0;\tag{10.372}$$

$$\Pi(\lambda) \equiv \int d^3x \nabla \cdot \mathbf{E}(\mathbf{x}) \lambda(\mathbf{x}) = 0,\tag{10.373}$$

for all (reasonable) functions λ<sub>0</sub> and λ on ℝ<sup>3</sup>. The constraints (10.372) - (10.373) are *first class* in the sense of Dirac, which means that their Poisson brackets are equal to existing constraints (or zero). In the Hamiltonian formalism, the role of the space-time dependent gauge transformations of the Lagrangian theory is played by the canonical transformations generated by the first class constraints, i.e.,

$$\delta\_{\lambda\_0} A\_0(\mathbf{x}) = \{\Pi\_0(\lambda\_0), A\_0(\mathbf{x})\} = \lambda\_0(\mathbf{x});\tag{10.374}$$

$$
\delta\_{\lambda\_0} A\_i(\mathbf{x}) = \delta\_{\lambda\_0} E\_i(\mathbf{x}) = 0;\tag{10.375}
$$

$$
\delta\_{\lambda} \mathbf{A}(\mathbf{x}) = \nabla \lambda(\mathbf{x});\tag{10.376}
$$

$$
\delta\_\lambda \mathbf{E}(\mathbf{x}) = 0;\tag{10.377}
$$

$$
\delta\_{\lambda} A\_0(x) = 0. \tag{10.378}
$$

The holy grail of the Hamiltonian formalism is to find variables that are both *gauge invariant* and *unconstrained*. In our case, *A*<sub>μ</sub> = (*A*<sub>0</sub>, **A**) are unconstrained but gauge variant, whilst Π<sub>μ</sub> = (Π<sub>0</sub>, −**E**) are gauge invariant but constrained! Now write some vector field **V** as **V** = **V**<sup>*L*</sup> + **V**<sup>*T*</sup>, where **V**<sup>*L*</sup> = Δ<sup>−1</sup>∇(∇·**V**) is the longitudinal component, so that *V*<sub>*i*</sub><sup>*T*</sup> = (δ<sub>*ij*</sub> − Δ<sup>−1</sup>∂<sub>*i*</sub>∂<sub>*j*</sub>)*V*<sub>*j*</sub> is the transverse part. Then the physical variables of free electromagnetism are **A**<sup>*T*</sup> and **E**<sup>*T*</sup>. The physical Hamiltonian

$$h = \frac{1}{2} \int d^3 \mathbf{x} \left( \mathbf{E}^T \cdot \mathbf{E}^T - \mathbf{A}^T \cdot \Delta \mathbf{A}^T \right), \tag{10.379}$$

then, is well defined on the physical (or reduced) phase space, which is the subset of all (*A*<sub>μ</sub>, Π<sub>μ</sub>) where the constraints (10.373) hold, modulo gauge equivalence.
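The longitudinal/transverse splitting is easiest to see in Fourier space, where Δ<sup>−1</sup>∂<sub>*i*</sub>∂<sub>*j*</sub> acts on the mode with wave vector *k* ≠ 0 as multiplication by *k*<sub>*i*</sub>*k*<sub>*j*</sub>/|*k*|<sup>2</sup>. The following minimal sketch applies the two projectors to a single Fourier amplitude; the function name and the sample values of `k` and `V_hat` are arbitrary illustrative choices.

```python
import numpy as np

def split_longitudinal_transverse(V_hat, k):
    """Split the Fourier amplitude V_hat(k) into parts parallel and orthogonal to k (k != 0)."""
    k = np.asarray(k, dtype=float)
    P_L = np.outer(k, k) / np.dot(k, k)   # Fourier symbol of Delta^{-1} d_i d_j
    P_T = np.eye(len(k)) - P_L            # Fourier symbol of delta_ij - Delta^{-1} d_i d_j
    return P_L @ V_hat, P_T @ V_hat

k = np.array([1.0, -2.0, 0.5])                     # arbitrary nonzero wave vector
V_hat = np.array([0.3 + 1.0j, -0.7, 2.0 - 0.4j])   # arbitrary complex amplitude
V_L, V_T = split_longitudinal_transverse(V_hat, k)
```

Here `V_T` satisfies *k*·**V**<sup>*T*</sup> = 0, the Fourier-space form of ∇·**V**<sup>*T*</sup> = 0, while `V_L` is parallel to *k*; a gauge shift (10.376) only changes the longitudinal part of **A**.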

After this preparation, we now revisit the abelian Higgs model as a constrained Hamiltonian system. It is convenient to combine the two real scalar fields ϕ<sub>1</sub> and ϕ<sub>2</sub> into a single complex scalar field ϕ = (ϕ<sub>1</sub> + *i*ϕ<sub>2</sub>)/√2, and treat ϕ and its complex conjugate ϕ̄ as independent variables. The Lagrangian (10.348) then becomes

$$\mathcal{L} = -\frac{1}{4}F\_A^2 + \overline{D\_\mu^A \varphi} \cdot D\_\mu^A \varphi - V(\varphi, \overline{\varphi}),\tag{10.380}$$

with *D*<sub>μ</sub><sup>*A*</sup>ϕ = (∂<sub>μ</sub> − *ieA*<sub>μ</sub>)ϕ, etc. The conjugate momenta Π<sub>μ</sub> to *A*<sub>μ</sub> are the same as for free electromagnetism, i.e., Π<sub>0</sub> = 0 and Π<sub>*i*</sub> = −*E*<sub>*i*</sub>, and for ϕ we obtain

$$
\pi = \partial \mathcal{L} / \partial \dot{\varphi} = \overline{D\_0^A \varphi};\tag{10.381}
$$

$$
\overline{\pi} = \partial \mathcal{L} / \partial \dot{\overline{\varphi}} = D\_0^A \varphi. \tag{10.382}
$$

The associated Hamiltonian *h* is equal to

$$\int d^3x \left( \frac{1}{2} \mathbf{E}^2 + \frac{1}{2} \mathbf{B}^2 - A\_0 (\nabla \cdot \mathbf{E} - j\_0) + \overline{\pi} \pi + \overline{D\_i^A \varphi} \cdot D\_i^A \varphi + V(\varphi, \overline{\varphi}) \right), \tag{10.383}$$

where *j*<sub>0</sub> = *ie*(πϕ − π̄ϕ̄) is the zeroth component of the Noether current. Hence the primary constraint remains Π<sub>0</sub> = 0, but the secondary constraint picks up an additional term and becomes ∇·**E** = *j*<sub>0</sub> (which remains Gauss' law!). The physical (i.e., gauge invariant and unconstrained) variables can be computed as

$$\varphi\_{\mathbf{A}} = e^{ie\Delta^{-1}\nabla \cdot \mathbf{A}} \varphi, \; \overline{\varphi}\_{\mathbf{A}} = e^{-ie\Delta^{-1}\nabla \cdot \mathbf{A}} \overline{\varphi};\tag{10.384}$$

$$\pi\_{\mathbf{A}} = e^{-i e \Delta^{-1} \nabla \cdot \mathbf{A}} \pi, \; \overline{\pi}\_{\mathbf{A}} = e^{i e \Delta^{-1} \nabla \cdot \mathbf{A}} \overline{\pi}, \tag{10.385}$$

plus the same transverse fields **A**<sup>*T*</sup> and **E**<sup>*T*</sup> as in free electromagnetism. In terms of the transverse covariant derivative *D*<sub>*i*</sub><sup>*T*</sup> = ∂<sub>*i*</sub> − *ieA*<sub>*i*</sub><sup>*T*</sup>, the physical Hamiltonian *h* is

$$\int d^3 \mathbf{x} \left( \frac{1}{2} (\mathbf{E}^T \cdot \mathbf{E}^T - \mathbf{A}^T \cdot \Delta \mathbf{A}^T - j\_0^{\mathbf{A}} \Delta^{-1} j\_0^{\mathbf{A}}) + \overline{\pi}\_{\mathbf{A}} \pi\_{\mathbf{A}} + \overline{D\_i^T \varphi\_{\mathbf{A}}} \cdot D\_i^T \varphi\_{\mathbf{A}} + V(\varphi\_{\mathbf{A}}, \overline{\varphi}\_{\mathbf{A}}) \right) . \tag{10.386}$$

The third term in (10.386) is the Coulomb energy, in which the charge density

$$j\_0^{\mathbf{A}} = ie(\pi\_{\mathbf{A}} \varphi\_{\mathbf{A}} - \overline{\pi}\_{\mathbf{A}} \overline{\varphi}\_{\mathbf{A}}) \tag{10.387}$$

is the same as *j*<sup>0</sup> (since the latter is gauge invariant). Remarkably, the physical field variables carry a residual *global U*(1)-symmetry, viz.

$$
\varphi\_{\mathbf{A}} \mapsto \exp(i\alpha)\varphi\_{\mathbf{A}};\tag{10.388}
$$

$$
\pi\_{\mathbf{A}} \mapsto \exp(-i\alpha)\pi\_{\mathbf{A}};\tag{10.389}
$$

$$
\overline{\varphi}\_{\mathbf{A}} \mapsto \exp(-i\alpha)\overline{\varphi}\_{\mathbf{A}};\tag{10.390}
$$

$$
\overline{\pi}\_{\mathbf{A}} \mapsto \exp(i\alpha)\overline{\pi}\_{\mathbf{A}},\tag{10.391}
$$

and no change for **A**<sup>*T*</sup> and **E**<sup>*T*</sup>, under which the Hamiltonian (10.386) is invariant.

If *V* has a minimum at ϕ = ϕ̄ = *v*, we recover the Higgs mechanism: redefining

$$\varphi\_{\mathbf{A}} = \exp(i\theta/v)(v + \chi),\tag{10.392}$$

and its complex conjugate, and reintroducing the longitudinal components

$$A\_i^L = -(1/ev)\partial\_i \theta; \; E\_i^L = -ev\Delta^{-1}\partial\_i \pi\_\theta,\tag{10.393}$$

of the gauge field and its conjugate momentum, the Hamiltonian (10.386) becomes

$$\frac{1}{2}\int d^3x\left(\mathbf{E}^2 + \mathbf{B}^2 + \pi\_\chi^2 + \partial\_i\chi\partial\_i\chi + \frac{(\nabla\cdot\mathbf{E})^2}{e^2v^2} + e^2v^2\mathbf{A}^2 + V(v + \chi)\right),\tag{10.394}$$

where **A** = **A**<sup>*T*</sup> + **A**<sup>*L*</sup> and **E** = **E**<sup>*T*</sup> + **E**<sup>*L*</sup>. This describes a massive vector field, and the would-be Goldstone boson θ has disappeared, as befits the Higgs mechanism!

It is fair to say that the Higgs mechanism in quantum field theory (and, more generally, the notion of SSB in gauge theories) is poorly understood. Indeed, the entire quantization of gauge theories is not well understood, except at the perturbative level or on a lattice. The problems already come out in the abelian case with *d* = 3. The main culprit is Gauss' Law ∇·**E** = *j*<sub>0</sub>. One would naively expect this constraint to remain valid in quantum field theory as an operator equation, and this is indeed the case in so-called physical gauges like the Coulomb gauge (i.e. ∂<sub>*i*</sub>*A*<sub>*i*</sub> = 0). If we now look at condition (10.337) in §10.9, which for *G* = *U*(1) and for example δϕ<sub>1</sub> = ϕ<sub>2</sub> and δϕ<sub>2</sub> = −ϕ<sub>1</sub> for a charged field ϕ = (ϕ<sub>1</sub> + *i*ϕ<sub>2</sub>)/√2, or δϕ = *i*ϕ, reads

$$\lim\_{\Lambda \nearrow \mathbb{R}^3} \int\_{\Lambda} d^3 \mathbf{y} \, \omega([j\_0(\mathbf{y}, 0), \varphi\_\alpha(\mathbf{x}, t)]) = -i \omega(\delta \varphi\_\alpha(\mathbf{x}, t)), \tag{10.395}$$

then it is clear that (10.395) can only hold if charged fields are nonlocal. For by Gauss' Law the commutator [*j*<sub>0</sub>(*y*,0), ϕ<sub>α</sub>(*x*,*t*)] equals [∇·**E**(*y*,0), ϕ<sub>α</sub>(*x*,*t*)], and by Gauss'(!) Theorem in vector calculus, all contributions to the left-hand side of (10.395) come from terms [*E*<sub>*i*</sub>(*y*,0), ϕ<sub>α</sub>(*x*,*t*)], with *y* ∈ ∂Λ (i.e., the boundary of Λ). These must remain nonzero as Λ ↗ ℝ<sup>3</sup>, at least if (10.395) holds. On the other hand, such nonlocality must be enforced by massless fields, which idea leads to one of the very few rigorous results about the Higgs mechanism (in the continuum):

Theorem 10.29. *In the Coulomb gauge the following conditions are equivalent:*


*Hence (contrapositively),* SSB *of U*(1) *by the state* ω *is only possible if* A *is massive. In that case, the Fourier transform of the two-point function* 0|ϕα(*x*, *x*0)*j a* <sup>0</sup>(*y*, *y*0)|0 *(cf. the proof of the Goldstone Theorem 10.28 in* §*10.9) has a pole at the mass of* A*.*

This theorem indeed yields the Higgs mechanism for say the abelian Higgs model in a specific physical gauge: note that the idea that the would-be Goldstone boson is eaten by the gauge field is already suggested by Gauss' Law, through which (minus) the canonical momentum E to A acquires *j*<sup>0</sup> as its longitudinal component; that is, the very same field that creates the Goldstone boson from the ground state.

In covariant gauges, all fields remain local, but (10.395) is rescued by the gauge-fixing term added to the Lagrangian. For example, adding ℒ<sub>gf</sub> = −(1/2ξ)(∂<sub>μ</sub>*A*<sup>μ</sup>)<sup>2</sup> to (10.348) leads to an equation of motion ∂<sub>μ</sub>*F*<sup>μ</sup><sub>ν</sub> = *j*<sub>ν</sub> − ∂<sub>ν</sub>∂<sub>μ</sub>*A*<sup>μ</sup>, so that (discarding all surface terms by locality) one obtains

$$-i\,\omega(\delta\varphi_\alpha(\mathbf{x},t)) = \int_{\mathbb{R}^3} d^3 \mathbf{y} \, \omega([\partial_0^2 A_0(\mathbf{y}, 0), \varphi_\alpha(\mathbf{x}, t)]). \tag{10.396}$$
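The step from the equation of motion to (10.396) may be sketched as follows. This assumes the equation of motion in the form quoted above, read at ν = 0, and a metric signature (+,−,−,−), so that *A*<sup>0</sup> = *A*<sub>0</sub>; both are assumptions made here for illustration:

$$j_0 = \partial_\mu F^{\mu}{}_{0} + \partial_0 \partial_\mu A^\mu = \partial_0^2 A_0 + \partial_i \left( F^{i}{}_{0} + \partial_0 A^i \right).$$

Inside the commutator with ϕ<sub>α</sub>(*x*, *t*), the total spatial divergence integrates to a surface term at infinity, which vanishes for local fields. Only the term with ∂<sub>0</sub><sup>2</sup>*A*<sub>0</sub> survives, and (10.395) turns into (10.396).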

In the proof of the Goldstone Theorem, the massless Goldstone bosons do emerge, but they turn out to lie in some "unphysical subspace" of *H*<sub>ω</sub> (which, for local gauges, is not a Hilbert space but has zero- and negative-norm states).

## Notes

In a philosophical context, the notion of *emergence* is usually traced to J.S. Mill (1843), who drew attention to 'a distinction so radical, and of so much importance, as to require a chapter to itself', namely the one between what Mill calls the principle of the 'Composition of Causes', according to which the joint effect of several causes is identical with the sum of their separate effects, and the negation of this principle. For example, in the context of his overall materialism, Mill believed that although all 'organised bodies' are composed of material parts,

'the phenomena of life, which result from the juxtaposition of those parts in a certain manner, bear no analogy to any of the effects which would be produced by the action of the component substances considered as mere physical agents. To whatever degree we might imagine our knowledge of the properties of the several ingredients of a living body to be extended and perfected, it is certain that no mere summing up of the separate actions of those elements will ever amount to the action of the living body itself.' Mill (1952 [1843], p. 243)

Mill launched what is now called *British Emergentism* (Stephan, 1992; McLaughlin, 2008; O'Connor & Wong, 2012), a school of thought which seems to have ended with C.D. Broad, who has our sympathy over Mill because of the doubt he expresses in our quotation in the preamble. Among the British Emergentists, the most modern views seem to have been those of S. Alexander, who, as paraphrased in O'Connor & Wong (2012), was committed to a view of emergence as

'the appearance of novel qualities and associated, high-level causal patterns which cannot be directly expressed in terms of the more fundamental entities and principles. But these patterns do not supplement, much less supersede, the fundamental interactions. Rather, they are macroscopic patterns running through those very microscopic interactions. Emergent qualities are something truly new (. . . ), but the world's fundamental dynamics remain unchanged.'

Alexander's idea that emergent qualities 'admit no explanation' and had 'to be accepted with the "natural piety" of the investigator' foreshadowed the later notion of *explanatory emergence*. Indeed, philosophers distinguish between *ontological* and *epistemological* reduction or emergence, but ontological emergence seems a relic from the days of vitalism and other immature understandings of physics and (bio)chemistry (including the formation of chemical compounds, which Broad and some of his contemporaries still saw as an example of emergence in the strongest possible sense, i.e., falling outside the scope of the laws of physics). Recent literature, including the present chapter, is concerned with epistemological emergence, of which explanatory emergence is a branch. For example, Hempel wrote:

'The concept of *emergence* has been used to characterize certain phenomena as 'novel', and this not merely in the psychological sense of being unexpected, but in the theoretical sense of being unexplainable, or unpredictable, on the basis of information concerning the spatial parts or other constituents of the systems in which the phenomena occur, and which in this context are often referred to as "wholes".' (Hempel, 1965, p. 62)

See also Batterman (2002), Bedau & Humphreys (2008), Norton (2012), Silberstein (2002), Wayne & Arciszewski (2009), and many other surveys of emergence.

#### §10.1. Spontaneous symmetry breaking: The double well

The facts we use about the double-well Hamiltonian may be found in Garg (2000) or Landau & Lifshitz (1977) at a heuristic level (but with correct conclusions), or, rigorously, in Reed & Simon (1978), Simon (1985), Helffer (1988), and Hislop & Sigal (1996). Theorem 10.2 is Theorem XIII.47 in Reed & Simon (1978).

#### §10.2. Spontaneous symmetry breaking: The flea

The flea perturbation and its effect on the ground state were first described in Jona-Lasinio, Martinelli, & Scoppola (1981a,b), who used methods from stochastic mechanics. See also Claverie & Jona-Lasinio (1986). Using more conventional methods, their results were reconfirmed and analyzed further by e.g. Combes, Duclos, & Seiler (1983), Graffi, Grecchi, & Jona-Lasinio (1984), Helffer & Sjöstrand (1985), Simon (1985), Helffer (1988), and Cesi (1989). The "Flea on the Elephant" terminology used by Simon (1985) motivated the title of Landsman & Reuvers (2013), who, as will be explained in the next chapter, identified the proper host animal as a cat. All pictures in this section are taken from the latter paper (and were prepared by the second author). For the Eyring–Kramers formula see Berglund (2011) for mathematicians or Hänggi, Talkner, & Borkovec (1990) for physicists.

#### §10.3. Spontaneous symmetry breaking in quantum spin systems

The translation-non-invariant ground states mentioned after Proposition 10.5 are discussed e.g. in Example 6.2.56 in Bratteli & Robinson (1997). See also Liu & Emch (2005), which was an important source for this section, and Ruetsche (2011) for a discussion of the definition of SSB through non-implementability. For order parameters see e.g. Sewell (2002), §3.3. A proof of Proposition 10.8 may be found in Bratteli & Robinson (1997), Proposition 6.2.15.

#### §10.4. Spontaneous symmetry breaking for short-range forces

The idea of SSB goes back to Heisenberg (1928). The C\*-algebraic approach in quantum spin systems with short-range forces is reviewed in Bratteli & Robinson (1997); see also Nachtergaele (2007). Theorem 10.10 is due to Araki (1974); see also Simon (1993), Theorem IV.5.6, and Bratteli & Robinson (1997), Theorem 6.2.18. In Definition 10.9, Araki required Ω<sub>ω</sub> to be separating for π<sub>ω</sub>(*A*) instead of ω to be α<sub>*t*</sub>-invariant, but in the presence of (10.53) and hence (10.53) these conditions are equivalent. The fact that (for short-range forces) global Gibbs states defined by (10.43) satisfy the KMS condition follows from Theorem 10.10, but this was the starting point of Haag, Hugenholtz, & Winnink (1967); see Winnink (1972).

Uniqueness of KMS states for one-dimensional quantum spin systems with short-range forces at any positive temperature (which also holds for the classical case, e.g. the one-dimensional Ising model) has been proved by Araki (1975). See also Mattis (1965) and Altland & Simons (2010) for some of the underlying physical intuition.

#### §10.5. Ground state(s) of the quantum Ising chain

Theorem 10.11.1 was first established in Pfeuty (1970) by explicit calculation, based on Lieb, Schultz, & Mattis (1961). For more information on the quantum Ising model (also in higher dimension) see e.g. Karevski (2006), Sachdev (2011), Suzuki et al (2013), and Dutta et al (2015). Uniqueness of the ground state of the quantum Ising model with *B* ≠ 0 holds in any dimension *d*, as first shown by Campanino, Klein & Perez (1991) on the basis of Perron–Frobenius type arguments similar to those for Schrödinger operators. The singular case *B* = 0 leads to a violation of the strict positivity conditions necessary to apply the Perron–Frobenius Theorem, and this case indeed features a degenerate ground state even when *N* < ∞.

The overall picture of SSB described in this section arose from the work of Horsch & von der Linden (1988), Kaplan, Horsch, & von der Linden (1989), Kaplan, von der Linden, & Horsch (1990), and especially Koma & Tasaki (1993, 1994). See also van Wezel (2007, 2008), van Wezel & van den Brink (2007), and Fraser (2016).

The analogy between the quantum Ising chain and the double-well potential may not be surprising physically, since the latter was originally derived from the former: in potassium dihydrogen phosphate, i.e. KH<sub>2</sub>PO<sub>4</sub>, each proton of the hydrogen bond would reside in one of the two minima of an effective double-well potential originating in the oxygen atoms, if it were not for tunneling, parametrized by the field *B*, which at small values yields a symmetric ground state (De Gennes, 1963).

#### §10.6. Exact solution of the quantum Ising chain: *N* < ∞

The general set-up to this solution is due to Lieb, Schultz, & Mattis (1961), and was adapted to the quantum Ising chain by Pfeuty (1970), with further details by Karevski (2006). The complex solution *q*<sub>0</sub> was already noted by Lieb et al. The energy splitting in higher dimensions does not seem to be known, but Koma & Tasaki (1994, eq. (1.5)) expect similar behaviour as in *d* = 1.

#### §10.7. Exact solution of the quantum Ising chain: *N* = ∞

The solution described in this section is due to Araki & Matsui (1985), where further details may be found; this is a highlight of modern mathematical physics! Theorem 10.20 is due to Araki (1987), although such results have a long history going back to Shale & Stinespring (1964, 1965). For a very clear exposition see Ruijsenaars (1987). See also Evans & Kawahigashi (1998), Chapter 6.

The reason the one-sided chain Λ = ℕ is problematic is that although the bosonic algebra ⊗<sub>*j*∈ℕ</sub>*M*<sub>2</sub>(ℂ) and its fermionic counterpart CAR(ℓ<sup>2</sup>(ℕ)) are well defined, and are isomorphic through the Jordan–Wigner transformation (10.102)–(10.103), the limiting dynamics has no simple form on either *A* or *F*, because the Fourier transform of ℓ<sup>2</sup>(ℕ) is the Hardy space *H*<sup>2</sup>(−π, π) of *L*<sup>2</sup>-functions with positive Fourier coefficients, instead of the usual *L*<sup>2</sup>(−π, π). Unlike on *L*<sup>2</sup>, the energies sgn *k* of the fermionic quasiparticles do not define a multiplication operator on *H*<sup>2</sup>.

#### §10.8. Spontaneous symmetry breaking in mean-field theories

The Poisson structure on *S*(*B*) was introduced by Bona (1988) and more generally by Duffield & Werner (1992a); see also Bona (2000). Theorem 10.22 and Corollary 10.23 are due to Duffield & Werner (1992a). The symplectic leaves of the given Poisson structure on *S*(*B*) (for which notion see e.g. Marsden & Ratiu (1994) or Landsman (1998a)) were determined by Duffield & Werner (1992a): two states ρ and σ lie in the same symplectic leaf of *S*(*B*) iff there is a unitary *u* ∈ *B* such that ρ(*a*) = σ(*uau*<sup>∗</sup>) for all *a* ∈ *B*. If ρ and σ are pure, this is the case iff the GNS-representations π<sub>ρ</sub>(*B*) and π<sub>σ</sub>(*B*) are unitarily equivalent, cf. Thm. 10.2.6 in Kadison & Ringrose (1986). In general the implication holds only in one direction: if ρ and σ lie in the same leaf, then they have unitarily equivalent GNS-representations.


Our survey of equilibrium states of homogeneous mean-field models is based on Fannes, Spohn, & Verbeure (1980) and Bona (1989). For rigorous results on the Curie–Weiss model see Chayes et al (2008) and Ioffe & Levit (2013). Numerical evidence for the restoration of Butterfield's Principle may be found in Botet, Julien & Pfeuty (1982) and Botet & Julien (1982), which are up to *N* ∼ 150, and Vidal et al (2004), which reaches *N* = 1000. Note that experimental samples have *N* < 10.

In the context of the BCS model of superconductivity (in the strong coupling limit), the Hamiltonian *ĥ*<sub>θ</sub> in (10.287) or *h*<sub>ω</sub> in (10.294) is called the *Bogoliubov–Haag Hamiltonian*, after Bogoliubov (1958) and Haag (1962). Further contributions to mean-field theories include Thirring & Wehrl (1967), Thirring (1968), Hepp (1972), Hepp & Lieb (1973), van Hemmen (1978), Rieckers (1984), Morchio & Strocchi (1987), Duffner & Rieckers (1988), Bona (1988, 1989, 2000), Unnerstall (1990a, 1990b), and Sewell (2002). For a nice proof of Theorem 10.25, which originates in Fannes, Spohn, & Verbeure (1980) and Bona (1989), see Gerisch (1993).

Even in the absence of a global KMS condition for ω̂<sup>β</sup>, one is justified in interpreting the primary states (ω<sup>β</sup><sub>θ</sub>)<sup>∞</sup> as pure thermodynamic phases of the given infinite quantum system, whose thermodynamics is described by the "phase space" *S*(*M*<sub>*n*</sub>(ℂ)). Though somewhat against the spirit of Bohrification (according to which the commutative C\*-algebra *C*(*M*<sub>*n*</sub>(ℂ)) is the right one to look at), the argument can be strengthened by enlarging *A* to *A* ⊗ *C*(*M*<sub>*n*</sub>(ℂ)) (where the choice of the tensor product does not matter, since *C*(*M*<sub>*n*</sub>(ℂ)) is commutative and hence nuclear, see §C.13). This larger C\*-algebra was introduced by Bona (1990), who proved:

Theorem 10.30. *1. There is a unique time-evolution* α *on A* ⊗ *C*(*M*<sub>*n*</sub>(ℂ)) *such that for any primary permutation-invariant state* ω *on A and a* ∈ *A one (strongly) has*

$$\lim_{N \to \infty} \pi_{\omega} \left( e^{ith_{\Lambda_N}} a e^{-ith_{\Lambda_N}} \right) = \pi_{\omega}(\alpha_t(a)). \tag{10.397}$$

*2. The states* ω̂<sup>β</sup> *and* ω<sup>β</sup><sub>θ</sub> *in* (10.286)*, which are defined on A, extend to the tensor product A* ⊗ *C*(*M*<sub>*n*</sub>(ℂ)) *as* ω̂<sup>β</sup> ⊗ μ<sub>β</sub> *and* ω<sup>β</sup><sub>θ</sub> ⊗ δ<sub>θ</sub>*, respectively, and as such satisfy the* KMS *condition at inverse temperature* β *with respect to the dynamics* α*.*

#### §10.9. The Goldstone Theorem

There is a large amount of literature on the Goldstone Theorem, both heuristic and rigorous. The former started with Goldstone, Salam, & Weinberg (1962), whereas the latter originates in Kastler, Robinson, & Swieca (1966); see also Buchholz et al (1992). For a survey, see Strocchi (2008, 2012), whose approach (based on Morchio & Strocchi, 1987) we follow. See also Berzi (1979, 1981), Landau, Perez, & Wreszinski (1981), Fannes, Pule, & Verbeure (1982), and Wreszinski (1987).

#### §10.10. The Higgs mechanism

The original reference is Higgs (1964ab). Our discussion is based on Lusanna & Valtancoli (1996ab) and Struyve (2011), both of whom derive the physical variables in the abelian Higgs model. See also Rubakov (2002), Strocchi (2008), where Theorem 10.29 may be found, and Stöltzner (2014) for some history and sociology.

## Chapter 11 The measurement problem

The measurement problem of quantum mechanics was probably born in 1926:

'Thus Schrödinger's quantum mechanics gives a very definite answer to the question of the outcome of a collision; however, this does not involve any causal relationship. One obtains *no* answer to the question "what is the state after the collision," but only to the question "how probable is a specific outcome of the collision" (in which the quantum-mechanical law of [conservation of] energy must of course be satisfied). This raises the entire problem of determinism. From the standpoint of our quantum mechanics, there is no quantity that could causally establish the outcome of a collision in each individual case; however, so far we are not aware of any experimental clue to the effect that there are internal properties of atoms that enforce some particular outcome. Should we hope to discover such properties that determine individual outcomes later (perhaps phases of the internal atomic motions)? Or should we believe that the agreement between theory and experiment concerning our inability to give conditions for a causal course of events is some pre-established harmony that is based on the non-existence of such conditions? I myself tend to relinquish determinism in the atomic world. But this is [also] a philosophical question, for which physical arguments alone are not decisive.' (Born, 1926a, p. 866; translation by the author)

In other words, quantum mechanics stipulates that the state after some collision (or measurement) is ψ = ∑<sub>*n*</sub> *c*<sub>*n*</sub>ψ<sub>*n*</sub>, whereas experiment demonstrates that in fact the final state is just one of the ψ<sub>*n*</sub>, with (Born) probability |*c*<sub>*n*</sub>|<sup>2</sup>. Quantum mechanics, then, seems unable to account for single outcomes of experiments and has to satisfy physicists with merely probabilistic predictions. This, in a nutshell, is the measurement problem, although very substantial analysis is needed to flesh it out.

Giving up determinism was soon incorporated in the Copenhagen Interpretation of Bohr and Heisenberg (cf. the Introduction) and more broadly became part of what might be called "*orthodoxy*", which represents the apparent (but not actual) consensus among Bohr, Heisenberg, Pauli, Born, Jordan, Dirac, von Neumann, and many others, which they supposedly reached around 1930 after the formal completion of quantum mechanics. This "orthodoxy", which later gave rise to the unfortunate "shut up and calculate" attitude most physicists seem to have (especially towards the measurement problem), should be distinguished from the Copenhagen Interpretation. For example, von Neumann never endorsed the doctrine of classical concepts, which in the above attitude has been replaced by the different and far more superficial idea that it is the entire goal of physics to explain experiments.

#### 11.1 The rise of orthodoxy

Even within the strict Copenhagen Interpretation, there were sharp differences between Bohr and Heisenberg, beyond the one concerning classical concepts reviewed in the Introduction. However, it seems that they agreed about the following point made by Bohr in his Como lecture concerning measurement:

'According to the quantum theory, just the impossibility of neglecting the interaction with the agency of measurement means that every observation introduces a new uncontrollable element.' (Bohr, 1928, p. 584)

This placed measurement squarely outside quantum mechanics for the second time: the first time was in the insistence that the measurement device ("if it is to serve its purpose") had to be described classically (cf. the Introduction), and now we also learn that the interaction between the quantum object undergoing measurement and the apparatus in question is "uncontrollable", *despite the fact that Bohr and Heisenberg regarded quantum mechanics as a complete theory*: their argument was apparently that precisely the classical nature of the apparatus makes the interaction uncontrollable. This in turn justified the classical description of the device, in that registration of a measurement result ought to be "objective", so that reading it out by performing a measurement on the apparatus, so to speak, should not introduce any further disturbance and hence uncontrollability (or so the argument goes).

Consistent with Bohr's point, a more detailed conceptual analysis of the measurement process was given by Heisenberg (1958, pp. 46–47, 54–55), who consistently refers to the quantum state or wave-function as the "probability function":

'Therefore, the theoretical interpretation of an experiment requires three distinct steps:


(. . . ) After [the] interaction [with the measuring device] has taken place, the probability function contains the objective element of tendency and the subjective element of incomplete knowledge, even if it has been a "pure case" before [i.e., it has become a mixture]. It is for this reason that the result of the observation cannot generally be predicted with certainty; what can be predicted is the probability of a certain result of the observation, and this statement about the probability can be checked by repeating the experiment many times. (. . . ) The observation itself [i.e., the act of registration of the result by the mind of the observer] changes the probability function discontinuously; it selects of all possible events the actual one that has taken place. Since through the observation our knowledge of the system has changed discontinuously, its mathematical representation also has undergone the discontinuous change and we speak of a "quantum jump."'

Here we find the typical Copenhagen view of measurement as a two-step process:


Note that Heisenberg's last comment puts him squarely into the camp of what is now called "QBism" (i.e., *Quantum Bayesianism*, see §11.2 below)!

Von Neumann (1932, §VI.1) gave a more formal (and highly influential) presentation of the (alleged) two stages of the measurement process:

'In the discussion so far we have treated the relation of quantum mechanics to the various causal and statistical methods of describing nature. In the course of this we found a peculiar dual nature of the quantum mechanical procedure which could not be satisfactorily explained. Namely, we found that on the one hand a state φ is transformed into the state φ′ under the action of an energy operator H in the time interval 0 ≤ τ ≤ *t*:

$$\frac{\partial}{\partial \tau} \phi_\tau = -\frac{2\pi i}{h} \mathsf{H} \phi_\tau \qquad (0 \le \tau \le t),$$

so if we write φ<sub>0</sub> = φ, φ<sub>*t*</sub> = φ′, then φ′ = *e*<sup>−(2π*i*/*h*)*t*H</sup>φ, which is purely causal. A mixture U is correspondingly transformed into

$$\mathsf{U}' = e^{-\frac{2\pi i}{h} t \mathsf{H}}\, \mathsf{U}\, e^{+\frac{2\pi i}{h} t \mathsf{H}}.$$

Therefore, as a consequence of the causal change of φ into φ′ the [pure] states U = P<sub>[φ]</sub> [= |φ⟩⟨φ|] go over into the [pure] states U′ = P<sub>[φ′]</sub> (process 2 in V.1.). On the other hand, the state φ—which may measure a quantity with discrete spectrum, distinct eigenvalues and eigenfunctions φ<sub>1</sub>, φ<sub>2</sub>, . . .—undergoes in a measurement a non-causal change in which each of the states φ<sub>1</sub>, φ<sub>2</sub>, . . . can result, and in fact does result with the respective probabilities |⟨φ, φ<sub>1</sub>⟩|<sup>2</sup>, |⟨φ, φ<sub>2</sub>⟩|<sup>2</sup>, . . . . That is, the mixture

$$\mathsf{U}' = \sum_{n=1}^{\infty} |\langle \phi, \phi_n \rangle|^2\, \mathsf{P}_{[\phi_n]}$$

obtains (...) (process 1 in V.1.). Since the [pure] states [i.e. P<sub>[φ]</sub>] go over into mixtures, the process is not causal. The difference between these two processes U → U′ is a very fundamental one: aside from their different behaviors in regard to the principle of causality, they are also different in that the former is (thermodynamically) reversible, while the latter is not.' (pp. 417–418 in von Neumann (1955); translation: R.T. Beyer)

All this concerns merely the first stage of the measurement, in which a pure state is transformed into a mixed one. The second stage, in which a single outcome is obtained, is already alluded to above (though clouded by von Neumann's ensemble language), but is described (in prose) later on through what is now called a *von Neumann chain*: one redefines system plus apparatus as the system, and couples it to a new apparatus, etc. This chain supposedly ends with the "ego" of the "individual" whose "intellectual inner life" is finally responsible for a single outcome.
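A two-term sketch may make the first stage concrete. Take φ = *c*<sub>1</sub>φ<sub>1</sub> + *c*<sub>2</sub>φ<sub>2</sub> with orthonormal φ<sub>1</sub>, φ<sub>2</sub> and |*c*<sub>1</sub>|<sup>2</sup> + |*c*<sub>2</sub>|<sup>2</sup> = 1; this special case and the use of the purity Tr(U<sup>2</sup>) are illustrative additions, not part of von Neumann's text. Then

$$\mathsf{P}_{[\phi]} = \sum_{n,m=1}^{2} c_n \overline{c_m}\, |\phi_n\rangle\langle\phi_m|, \qquad \mathsf{U}' = |c_1|^2\, \mathsf{P}_{[\phi_1]} + |c_2|^2\, \mathsf{P}_{[\phi_2]},$$

$$\mathrm{Tr}\,\big(\mathsf{P}_{[\phi]}^2\big) = 1, \qquad \mathrm{Tr}\,\big(\mathsf{U}'^2\big) = |c_1|^4 + |c_2|^4 = 1 - 2|c_1|^2 |c_2|^2.$$

Conjugation by a unitary operator leaves Tr(U<sup>2</sup>) unchanged, so no process 2 can carry P<sub>[φ]</sub> into U′ unless one of the two coefficients vanishes.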

It is very remarkable that von Neumann nowhere seems to use the central Copenhagen dogma that the apparatus be described classically (cf. the Introduction), especially since the mathematics of operator algebras he was inventing at almost exactly the same time is tailor-made for incorporating this dogma (which fact indeed forms the motivation for the present book). One clue for his lack of enthusiasm may come from the very end of his book (i.e., §VI.3), where he challenges 'an explanation often proposed to account for the statistical character of the process 1', namely the idea that (the non-unitary) process 1 might have its origin in an initial mixed state of the apparatus. Indeed, even if the apparatus as a quantum-mechanical system is in a pure state (as any system should be ontologically), its description as a classical system generally renders its state mixed—and the same conclusion may be drawn on epistemic grounds, arguing that the state of macroscopic or otherwise complicated systems cannot be known exactly. Many writings by the Copenhagen school, then, suggest that the alleged unanalyzable nature of the measurement and the randomness of its outcome should be attributed to the classical description of the apparatus and its ensuing mixed state, including our earlier quotation (cf. §8.4) from Heisenberg (1958) on the origin of probabilities in quantum mechanics:

'these uncertainties (. . . ) are simply a consequence of the fact that we describe the experiment in terms of classical physics' (Heisenberg, 1958, p. 53)

To counter this argument, von Neumann argues that physics requires the (Born) probabilities for the various outcomes to depend only on the initial state φ of the quantum system undergoing measurement (as opposed to the state of the apparatus, be it classical or quantum), whereas any "process 2" (i.e. unitary) time evolution would merely push the coefficients *w*<sub>*n*</sub> in the (alleged) mixed apparatus state into the role of probabilities for the possible outcomes. However, 'the *w*<sub>*n*</sub> are characteristic of the observer alone (and therefore independent of φ)', and hence

'the non-causal nature of the process 1. is not produced by any incomplete knowledge of the state of the observer.' (von Neumann, 1955, p. 439).

Von Neumann's argument became the mother of all "insolubility theorems" for the measurement problem, some of which will be reviewed in §11.3 below.

Pauli (1933, §9) also includes some comments on measurement and the interpretation of quantum mechanics in general. These display a bizarre hybrid between the ideas of Bohr and von Neumann, somehow mediated by Heisenberg. Thus Pauli endorses (even starts with) some notion of Complementarity, but he relates this to the mathematical formalism rather than to the doctrine of classical concepts (which he nowhere invokes). Similarly, his treatment of measurement on the one hand follows the disturbance ideology of Bohr and Heisenberg (but without grounding this in the classical description of the apparatus), whilst technically he quotes and follows von Neumann, claiming that measurement leads to mixtures which subsequently reduce to one term through '*ein besonderer, naturgesetzlich nicht im Voraus determinierter Akt*' (i.e., a special act that is not determined in advance by laws of nature). A rather more systematic review of early measurement theory was written by London & Bauer (1939), whose opening is highly promising and almost poetic:

'The majority of introductions to quantum mechanics follow a rather dogmatic path from the moment that they reach the statistical interpretation of the theory. In general they are content to show, by more or less intuitive considerations, how the actual measuring devices always introduce an element of indeterminism, as this interpretation demands. However, care is rarely taken to verify explicitly that the formalism of the theory, applied to that special process which constitutes the measurement, truly implies a transition of the system under study to a state of affairs less fully determined than before. A certain uneasiness arises. One does not see exactly with what right and up to what point one may, in spite of this loss of determinism, attribute to the system an appropriate state of its own. Physicists are to some extent sleepwalkers, who try to avoid such issues and try to concentrate on concrete problems. But it is exactly these questions of principle which nevertheless interest nonphysicists and all who wish to understand what modern physics says about the analysis of the act of observation itself.' (London & Bauer, 1939, pp. 218-219)

Yet the authors mainly repeat von Neumann's analysis (confirming its lofty status):

'The interaction with the apparatus does not put the object into a new pure state. Alone, it does not confer to the object a new wave function. On the contrary, it actually gives nothing but a statistical mixture: It leads to one mixture for the object and one mixture for the apparatus. For either system regarded individually there results uncertainty, incomplete knowledge. Yet nothing prevents our reducing this uncertainty by further observation.

And this is our opportunity. So far we have only coupled one apparatus with one object. But a coupling even with a measuring device is not yet a measurement. A measurement is achieved only when the position of the pointer has been observed. It is precisely the increase of knowledge, acquired by the observation, that gives the observer the right to choose among the different components of the mixture predicted by the theory, to reject those which are not observed, and to attribute thenceforth to the object a new wave function, that of the pure case which he has found. We note the essential role played by the consciousness of the observer in this transition from the mixture to the pure state. Without his effective intervention, one would never obtain a new ψ function.' (*ibid.*, p. 251)

Accordingly, at the end of the golden era of quantum mechanics, the view of measurement as a two-stage process in which a pure state is first transformed into a mixture in a more or less scientific way, upon which unanalyzable and possibly mental phenomena bring about a single outcome, was firmly established, although—the point deserves to be repeated—in their formal treatments neither von Neumann nor London & Bauer incorporated the key claim Bohr and Heisenberg made about measurement, namely that the corresponding apparatus *must* be described classically.

Opponents of the Copenhagen Interpretation (the most prominent among whom were Einstein and Schrödinger) were well aware of this tension between formalism and ideology, which in the form of *Schrödinger's Cat* even reached immortality (!):

'One may also construct highly burlesque cases. A cat is confined in a box of steel together with the following hellish machine (which one should secure against a direct attack by the cat): A Geiger counter contains a tiny amount of radioactive material, *so* little that during one hour *possibly* one of its atoms decays, but equally likely also none does; if it does, then the counter is triggered and activates, via a relay, a little hammer which breaks a small container of hydrocyanic acid. Having left this system to itself for one hour, one will say that the cat is still alive *if* meanwhile no atom has decayed. The first decay of an atom would have poisoned her. The ψ-function of the entire system would express this in such a way that in it the living and the dead cat would be mixed or spread out on equal terms. What is typical about these cases is that an uncertainty which is originally limited to the atomic domain has been transformed into a coarse-grained uncertainty, which may then be *decided* by direct observation. This prevents us from regarding a "faded model" as an image of reality in such a naive way. As such [this model] contains nothing that is unclear or contradictory. There is a difference between a moved or poorly focused photograph and a record of clouds and fog banks.' (Schrödinger, 1935, p. 812; translation by the author)

The last sentence is particularly powerful, contrasting Schrödinger's (as well as Einstein's) view that physics should describe some sharply defined reality (of which quantum mechanics at best produces blurred pictures) with the Copenhagen view, according to which reality itself lacks focus (with quantum mechanics providing the best possible picture of it). This contrast confirms our idea that Schrödinger's Cat metaphor specifically draws attention to the problems that arise from the Copenhagen "duality postulate" that macroscopic systems (such as measurement devices and cats) admit both a classical and a quantum-mechanical description.

#### 11.2 The rise of modernity: Swiss approach and Decoherence

Despite Schrödinger's Cat, the measurement problem was not an active field of research until Wigner (1963) rekindled interest in the topic. Even so, his paper mainly reiterated von Neumann's views—which already had been repeated by London and Bauer—including his omission of the doctrine of classical concepts. In particular, it continued to promulgate the suggestion that measurement is a two-step process for which the clarification of the first step (i.e. of turning a pure state into a mixture) would already be a major part of the solution of the measurement problem.

Wigner's paper inspired for example the "Swiss" approach to the measurement problem, which was remarkable in being the first serious mathematical attempt to take into account the Bohr–Heisenberg dogma that the apparatus be described classically, whilst also paying tribute to von Neumann in insisting on mathematical rigour. Indeed, the Swiss approach relies on the formalism of operator algebras, which also marks a conceptual break with all earlier—and indeed most later—approaches in taking the observables rather than the states as a starting point. The aim of the Swiss approach is to show that relative to a suitable class of observables, the pure state

$$\rho = |\Psi\rangle\langle\Psi|,\quad \Psi = \sum_n c_n \Psi_n,$$

coincides with the corresponding mixture without the off-diagonal terms, i.e.,

$$\rho' = \sum_{n} |c_n|^2 |\Psi_n\rangle\langle\Psi_n|.$$
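As a minimal numerical sketch of what it means for the two states to coincide "relative to a suitable class of observables", the following Python fragment compares $\rho$ and $\rho'$ for a two-term superposition. The amplitudes, the helper names `outer` and `expectation`, and the two test observables are illustrative assumptions and are not taken from the Swiss literature. Observables that are diagonal in the basis $(\Psi_n)$ cannot tell the two states apart, whereas an observable with off-diagonal matrix elements can.

```python
from math import sqrt

def outer(psi):
    """Return the density matrix |psi><psi| for a list of amplitudes."""
    return [[a * complex(b).conjugate() for b in psi] for a in psi]

def expectation(rho, obs):
    """Return Tr(rho obs) for square matrices given as nested lists."""
    n = len(rho)
    return sum(rho[i][j] * obs[j][i] for i in range(n) for j in range(n))

# Illustrative amplitudes c_1, c_2 with |c_1|^2 + |c_2|^2 = 1 (an assumption).
c = [sqrt(0.3), sqrt(0.7)]

rho_pure = outer(c)                          # rho  = |Psi><Psi|
rho_mixed = [[abs(c[0]) ** 2, 0.0],          # rho' = rho without its
             [0.0, abs(c[1]) ** 2]]          #        off-diagonal terms

diagonal_obs = [[1.0, 0.0], [0.0, -1.0]]     # diagonal in the basis (Psi_n)
off_diagonal_obs = [[0.0, 1.0], [1.0, 0.0]]  # couples Psi_1 and Psi_2

pure_on_diagonal = expectation(rho_pure, diagonal_obs).real
mixed_on_diagonal = expectation(rho_mixed, diagonal_obs)
pure_interference = expectation(rho_pure, off_diagonal_obs).real
mixed_interference = expectation(rho_mixed, off_diagonal_obs)
```

Here `pure_on_diagonal` and `mixed_on_diagonal` agree, while `pure_interference` equals $2\,\mathrm{Re}(c_1\overline{c_2})$ and `mixed_interference` vanishes. The sketch only illustrates the algebraic statement; it says nothing about which observables are physically accessible.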

Thus the ambition of this approach is limited, in that no attempt is made to explain (at least the appearance of) single outcomes, except by appealing to the ignorance interpretation of probability (in vain, see below). The alleged equivalence between pure states and mixtures can typically be achieved if the apparatus is infinite and the measurement time is infinite, too. The infinite character of the apparatus (here seen as an idealization of a macroscopic device, as is standard in quantum statistical mechanics), is no guarantee for its classicality, but it is certainly a step in the right direction (cf. Chapter 8). Thus two closely related problems must be overcome:


To explain the last point, we quote Leggett (though somewhat out of context):

'Now, following Schrödinger, let us consider a thought experiment in which the quantum-mechanical description of the final state, as obtained by appropriate solution of the time dependent Schrödinger equation, contains simultaneously nonzero probability amplitudes for two or more states of the universe that are, by some reasonable criterion, macroscopically distinct (in Schrödinger's example, this would be "cat alive" and "cat dead"). Of course, just about everyone, including me, would accept that because of, inter alia, the effects of decoherence, it is likely to be impossible, at least for the foreseeable future, to experimentally demonstrate the interference of such states. (On the other hand, as the late John Bell was fond of pointing out, the foreseeable future is not a very well-defined concept. In fact, as late as 1999, not a few people were confidently arguing that because of the inevitable effects of decoherence, the projected experiments to demonstrate interference at the level of flux qubits would never work. In this case, the foreseeable future lasted approximately one year. As Bell used to emphasize, the answers to fundamental interpretive questions should not depend on the accident of what is or is not currently technologically feasible.) But the crucial point is that the formalism of quantum mechanics itself has changed not one whit between the microscopic and macroscopic levels. Are we then entitled to embrace, at the macrolevel, an interpretation that was forbidden at the microlevel, simply because the evidence against it is no longer available? I would argue very strongly that we are not, and would therefore draw the conclusion: also at the macrolevel, when the quantum-mechanical description assigns simultaneously nonzero [probabilities] to two or more macroscopically distinct possibilities, then it is not the case that each system of the relevant ensemble realizes either one possibility or the other.' (Leggett, in Schlosshauer, 2011, p. 155)

This argument of Leggett's (which is a special case of Earman's Principle) was originally targeted at decoherence, but it also applies *verbatim* to the Swiss approach (which is closely related to decoherence, as both heavily rely on limits and superselection rules—which are absolute in the former and dynamically induced in the latter). In an even earlier hunch of Earman's Principle, Bell— this time aiming directly at the Swiss approach—in fact made a related point about its reliance on the *t* → ∞ limit (in that even at extremely large but finite time the state remains pure).

Jumping to the modern era, a striking point of continuity with the 1920s and 1930s is the idea that the measurement procedure (and hence the measurement problem) consists of two stages; only the terminology and the scope have changed:

'There are two distinct measurement problems in quantum mechanics: what Pitowsky has called a "big" measurement problem and a "small" measurement problem. The "big" measurement problem is the problem of explaining how measurements can have definite outcomes, given the unitary dynamics of the theory: it is the problem of explaining how individual measurement outcomes come about dynamically. The "small" measurement problem is the problem of accounting for our familiar experience of a classical, or Boolean, macroworld, given the non-Boolean character of the underlying quantum event space: it is the problem of explaining the dynamical emergence of an effectively classical probability.' (Bub, in Schlosshauer, 2011, pp. 145–146)

Clearly, the "small" measurement problem is modern parlance for the problem of how to turn a superposition into a mixture, upon which the "big" problem—if it is noticed at all—still concerns the old issue of selecting *one* term from this mixture.

Furthermore, the measurement problem seems to have acquired increased scope and importance, as exemplified by the following quotations:

'One of the most ancient philosophical questions (Heidegger thought it was *the* question) is this: why is there something rather than nothing? In terms of events rather than substances, the question would be: how come anything happens at all? That question is the measurement problem.' (Fine, in Schlosshauer, 2011, p. 146)

'The measurement problem has been called "the reality problem" by Philip Pearle. This is a better name for it. We perceive objects in the world as being in definite states. A door is either open or shut, a given ball either is in a given box or it is not. The wave function, however, can have superpositions of these things, suggesting that the door can be simultaneously open and shut at the same time, and that the ball can be both in the box and not in the box at the same time. The reality problem is that there is a discrepancy between the version of reality we perceive, and the version presented to us by the most obvious interpretation of the wave function.' (Hardy, in Schlosshauer, 2011, p. 153)

'Fundamentally, the measurement problem is the problem of connecting probability with truth in the quantum world, that is to say, it is the problem of how to relate quantum probabilities to the objective occurrence and non-occurrence of events. The problem arises because there appears to be a difficulty in reconciling the objectivity of a particular measurement outcome with the entangled state at the end of a measurement.' (Bub, *ibid.*, p. 145)

More technically, the measurement problem has come to be seen as a special case of the problem of explaining at least the *appearance* of the classical world from quantum theory. If the measurement problem is seen from the Copenhagen perspective this is eminently reasonable, as both problems involve the dual description of either the apparatus or the world around us as both classical and quantum (and its possible failure). In this context, an alleged solution to the "small" problem, such as Decoherence, is often also seen as this explanation (as if there were no issue about the derivation of the laws of classical physics, including the dynamical ones).

*A propos*, another characteristic feature of the modern era is undoubtedly the dominance of *Decoherence* (if only over the Swiss approach), for example:

'I think the whole discussion about whether measurements in quantum mechanics are indeed problematic somewhat misses the point. Measurement interactions are only one of many examples of quantum interactions that lead to superpositions of macroscopically distinct states. Nature has been producing macroscopic superpositions for millions of years, well before any quantum physicist cared to artificially engineer such a situation. The key concept here is decoherence. Environmental interactions tend to produce superpositions of classically distinct states. This raises the issue of how one could describe a classical regime in quantum mechanics, quite irrespective of the existence of measuring apparatuses. (...)

If decoherence and its applications had been developed early in the history of quantum theory, then the idea that measurements play a special role in the theory might not have risen to such prominence, and the foundations of quantum mechanics would have focused instead on the problem of how to derive a classical regime within the theory.' (Bacciagaluppi, in Schlosshauer, 2011, p. 143)

Mathematically, decoherence boils down to the idea of adding one more link to the von Neumann chain (see §11.1) beyond *S*+*A* (i.e. the system and the apparatus). However, there is a fundamental conceptual as well as technical difference between Decoherence and older approaches that took such a step: whereas previously (e.g., in the hands of von Neumann, London & Bauer, and Wigner) the chain *converged towards the observer*, in Decoherence it *diverges away from the observer*. Namely, the third and final link is now taken to be the *environment*.

This notion is often taken in a fairly literal sense in agreement with the intuitive meaning of the word, but it may also (we would even say: preferably) refer to internal degrees of freedom of the apparatus, as in the Spehner–Haake model in §11.4. Either way, the "environment" is usually treated as an infinite system (necessitating a limit like $N \to \infty$), which (in simple models where the pointer has discrete spectrum) has the consequence that the post-measurement state $\sum_n c_n \psi_n \otimes \varphi_n \otimes \chi_n$ (in which the $\chi_n$ are mutually orthogonal) is reached only in the limit $N \to \infty$ of infinitely many degrees of freedom and, in addition, only in the limit $t \to \infty$ of infinite time. In that case, the restriction of the above state to $S+A$ (i.e. the trace of the corresponding density operator over the degrees of freedom of the environment) is mixed, which means that the quantum-mechanical interference between the states $\psi_n \otimes \varphi_n$ for different values of $n$ has become "delocalized" to the environment, and accordingly is deemed irrelevant if the latter is not observed (i.e. omitted from the description).
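The role of the orthogonality of the environment states and of the limit $N \to \infty$ can be made concrete in a small numerical sketch. The toy model below is an assumption made purely for illustration (it is not the Spehner–Haake model): the system-plus-apparatus states are represented by two orthonormal basis vectors, and the environment consists of `n_qubits` qubits, each of which is rotated by an angle `theta` in the second branch. The overlap of the two environment states is then $\cos^N\theta$, and tracing out the environment multiplies the off-diagonal entry of the reduced density matrix by this overlap.

```python
from math import cos, sin, sqrt, pi

def kron(u, v):
    """Tensor product of two vectors given as flat lists."""
    return [a * b for a in u for b in v]

def environment_state(theta, n_qubits):
    """Product of n_qubits copies of the qubit state (cos theta, sin theta)."""
    state = [1.0]
    for _ in range(n_qubits):
        state = kron(state, [cos(theta), sin(theta)])
    return state

def reduced_density(c, chi):
    """Trace out the environment from the vector sum_n c[n] e_n (x) chi[n].

    Entry (m, n) equals c[m] * conj(c[n]) * <chi[n], chi[m]>.
    """
    dim = len(c)
    return [[sum(c[m] * chi[m][k] * complex(c[n] * chi[n][k]).conjugate()
                 for k in range(len(chi[m])))
             for n in range(dim)]
            for m in range(dim)]

# Illustrative amplitudes (an assumption), normalized to one.
c = [sqrt(0.3), sqrt(0.7)]

def coherence(theta, n_qubits):
    """Modulus of the off-diagonal entry of the reduced state on S+A."""
    chi = [environment_state(0.0, n_qubits),
           environment_state(theta, n_qubits)]
    return abs(reduced_density(c, chi)[0][1])

orthogonal_case = coherence(pi / 2, 1)   # chi_1 and chi_2 orthogonal
finite_case = coherence(0.5, 8)          # small but nonzero overlap
```

In this sketch `orthogonal_case` vanishes up to rounding, so the reduced state is the mixture, whereas `finite_case` equals $|c_1 c_2| \cos^8(0.5)$, which is small but not zero. At any finite number of environment qubits the interference terms are merely suppressed, which is the feature of such models that the criticism based on limits targets.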

Unfortunately, in so far as it claims to provide a solution to the measurement problem, Decoherence is an unmitigated disaster:


Thus Decoherence is parasitic on some interpretation of quantum mechanics that solves the measurement problem, which in turn is typically strengthened by it. In this context, the most popular of these has been the Everett (i.e., Many-Worlds) Interpretation, which, after decades of obscurity or even derision, suddenly started to be greeted with a flourish of trumpets in the wake of the popularity of Decoherence. However, even if such extravagant interpretations are coherent, these should in our opinion be a very last resort, acceptable only if truly everything else has failed.

On the positive side, Decoherence has led to the important idea of *einselection* (for *environment-induced superselection*), where a pure state ψ of some system (possibly plus apparatus) is "einselected" if it remains pure after coupling to the environment and subsequent restriction. The hope (or rather program), then, is to show that classical states are classical precisely because they are robust in this way.

Finally, it may be appropriate to close this historical introduction to the measurement problem by mentioning another modern approach, namely outright *denial*:

'I remember giving a talk at a meeting at the London School of Economics seven or so years ago. In the audience was an Oxford philosophy professor, and I suppose he didn't much like my brash cowboy dismissal of a good bit of his life's work. When the question session came around, he took me to task with the most proper and polite scorn I had ever heard (I guess that's what they do). "Excuse me. You seem to have made an important point in your talk, and I want to make sure that I have not misunderstood anything. Are you saying that you have solved the measurement problem? This problem that has plagued quantum mechanics for seventy-five years? The message of your talk is that, using quantum information theory, you have finally solved it?" (Funny the way the words could be put together as a question, but have no intended usage but as a statement.) I don't know that I did anything but turn the screw on him a bit further, but I remember my answer. "No, not me; I haven't done anything. What I am saying is that a "measurement problem" never existed in the first place. (...)

The "measurement problem" is purely an artefact of a wrong-headed view of what quantum states and/or quantum probabilities ought to be. (...) quantum states are not real things from a Quantum Bayesian view (...) but a personal judgment, a quantified degree of belief. A quantum state is a set of numbers an agent uses to guide the gambles he might take on the consequences of his potential interactions with a quantum system. It has no more substantiality than that. Aren't epistemic states real things? Well ... yes, in a way. They are as real as the people who hold them. But no one would consider a person to be a property of the quantum system he happens to be contemplating. And one shouldn't think of a quantum state in that way either—one shouldn't think of it as a property of the quantum system to which it is assigned. Take the source of the paradox away, we say, and the paradox itself will go away.' (Fuchs, in Schlosshauer, 2011, pp. 146–147)

These words have been quoted at some length, because the view that "physics is information" and its alleged corollary that all foundational problems are solved by Bayesian reasoning (perhaps with a quantum flavour) is becoming increasingly popular. Physicists are now seen as punters (or, in academic parlance, "agents") who in smoky offices bet on the outcomes of experiments, and hence use (quantum) Dutch Book arguments to justify some sort of strictly epistemic (quantum) probability calculus. However, the ideology of "*QBism*" thus expressed appears to have adopted precisely the weakest ingredients of the Copenhagen Interpretation—viz. the idea that the wave-function is just a catalogue of the probabilities for possible outcomes of measurements whose details are supposedly beyond our grasp, cf. the Introduction—at the expense of its one strong component, namely the doctrine of classical concepts. Although there may have been pragmatic reasons for this attitude in the 1920s, (mathematical) physics has moved forward since then, enabling much more detailed analysis and hence justifying considerably greater ambition in understanding the measurement process than Bohr and Heisenberg *cum suis* had.

In any case, the fact that one competent author regards the measurement problem as the key to reality whilst another flatly denies even its very existence should give pause for thought. As in the Bohr–Einstein debate, different perspectives on reality and on the task of physics seem to play a role here, culminating in contrasting views of quantum-mechanical states: the more "reality" one attributes to states, the more serious the measurement problem is. Or, contrapositively, the more operationalist one's attitude, the further the problem disappears behind the horizon.

#### 11.3 Insolubility theorems

Since in §11.4 we will "propose the impossible", namely miraculously solving the measurement problem within unitary quantum mechanics, it is helpful to review the arguments why this is generally felt to be impossible. Such arguments take the form of so-called *insolubility theorems*. As already mentioned, such theorems ultimately go back to von Neumann: especially those that prove the impossibility of explaining his process 1 (i.e. the transition from a pure state to a mixture) from process 2 (unitary time evolution according to the Schrödinger equation). Another kind of insolubility theorem shows that single outcomes are impossible from process 2.

It might be argued that both kinds of theorem add little to the basic mathematical intuition behind the measurement problem, which is as follows (it goes without saying that we disagree with this traditional description of measurement, see below). Let $s \in B(H_S)$ be the observable being measured (where $H_S$ is some Hilbert space associated to a quantum object $S$ undergoing measurement) and let $a \in B(H_A)$ be a "pointer observable" correlated to $S$ (where $H_A$ is a second Hilbert space). In particular, the measurement apparatus $A$ is described quantum mechanically. For the moment we assume both Hilbert spaces to be finite-dimensional and both operators to be non-degenerate, even having the same spectrum $\{\lambda_1, \ldots, \lambda_n\}$; this of course implies that $\dim(H_S) = \dim(H_A) = n$. Thus $H_S$ has a basis $(\upsilon_i^{(s)})$ of eigenvectors of $s$ and likewise $H_A$ has a basis $(\upsilon_i^{(a)})$ of eigenvectors of $a$, with $s\upsilon_i^{(s)} = \lambda_i \upsilon_i^{(s)}$ and $a\upsilon_i^{(a)} = \lambda_i \upsilon_i^{(a)}$ ($i = 1, \ldots, n$). The (erroneous) argument, then, is as follows:

1. Measurement should establish a correlation between values of $s$ of $S$ and values of $a$ of $A$, which with the above labeling implies that for each $i$ the initial system state $\upsilon_i^{(s)}$ should push the pointer from some initial state $\psi_0^{(A)}$ into a final (post-measurement) state $\upsilon_i^{(a)}$. Hence the dynamics, described by some unitary operator $u \in B(H_S \otimes H_A)$, should be such that

$$
u(\upsilon_i^{(s)} \otimes \psi_0^{(A)}) = \upsilon_i^{(s)} \otimes \upsilon_i^{(a)} \equiv \Phi_i. \tag{11.1}
$$


$$
u |\Psi_0^{(S)}\rangle \langle \Psi_0^{(S)}| u^* = |\Phi\rangle\langle\Phi| \neq \sum_i |c_i|^2 |\Phi_i\rangle\langle\Phi_i|. \tag{11.2}
$$

As we already discussed, for some authors the measurement problem is the clash between nos. 1 and 3 (this is the "small" problem), whereas for others it is the conflict between nos. 1 and 2 (i.e. the "big" one). Either way, the goal of insolubility theorems is to show that the problem is not a consequence of idealizations in primitive arguments like the one just given, but remains even under very general assumptions. In particular, both the purity of the initial system as well as apparatus states (and hence of their tensor product), and the exact system-apparatus correlation assumed (including the premise of point spectra and finite-dimensional Hilbert spaces), can be considerably relaxed. To illustrate the kind of discussion, we present one example of an insolubility proof along the former lines and one along the latter. These proofs even remain valid if the notion of an observable itself is relaxed, too, namely from a self-adjoint operator to a POVM (see (2.178)), but we will not discuss this utmost generality (if only because it would not circumvent our critique below). It should be noted that insolubility theorems tacitly assume that the mathematical objects in the quantum-mechanical formalism describe all there is physically.
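The primitive argument around (11.1) and (11.2) can be checked numerically in the smallest case $n = 2$. The sketch below is only an illustration under stated assumptions: indices run over 0 and 1 rather than 1 and 2, the pointer's initial state is taken to be its first eigenvector, and the unitary is chosen to be the permutation that flips the pointer exactly when the system index is 1. This is one of many unitaries satisfying (11.1). The helper names `measure_interaction`, `projector` and `purity` are hypothetical.

```python
from math import sqrt

def measure_interaction(vec):
    """Apply u on H_S (x) H_A (both 2-dimensional), using basis index 2*i + j
    for v_i^(s) (x) v_j^(a). The pointer index is flipped iff i == 1, so
    that u(v_i^(s) (x) v_0^(a)) = v_i^(s) (x) v_i^(a), as required by (11.1)."""
    out = [0.0] * 4
    for i in range(2):
        for j in range(2):
            out[2 * i + (j ^ i)] += vec[2 * i + j]
    return out

def projector(vec):
    """Return the rank-one density matrix |vec><vec|."""
    return [[a * complex(b).conjugate() for b in vec] for a in vec]

def purity(rho):
    """Return Tr(rho^2), which equals 1 precisely for pure states."""
    n = len(rho)
    return sum((rho[i][j] * rho[j][i]).real
               for i in range(n) for j in range(n))

# Illustrative amplitudes (an assumption), normalized to one.
c = [sqrt(0.3), sqrt(0.7)]

# (c_0 v_0 + c_1 v_1) (x) v_0: superposed system, pointer in its ready state.
initial = [c[0], 0.0, c[1], 0.0]
final = measure_interaction(initial)     # linearity gives c_0 v_0(x)v_0 + c_1 v_1(x)v_1

rho_final = projector(final)             # left-hand side of (11.2)
rho_mixture = [[0.0] * 4 for _ in range(4)]
rho_mixture[0][0] = abs(c[0]) ** 2       # weight of v_0 (x) v_0
rho_mixture[3][3] = abs(c[1]) ** 2       # weight of v_1 (x) v_1
```

The unitarily evolved state has `purity(rho_final)` equal to 1, whereas the mixture on the right-hand side of (11.2) has purity $|c_0|^4 + |c_1|^4 < 1$, so the two density operators cannot coincide.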

In the first direction, we have Theorem 11.2 below, which we may summarize as the *problem of statistics*: there is a contradiction between the following postulates:


Here the second and third postulates may be consequences of the first, but even so it is useful to list them separately, since denying or circumventing nos. 1, 2, and 3 is typically done in completely different ways (see the end of this section).

Formally, let $s = s^* \in B(H_S)$ be an arbitrary self-adjoint operator on an arbitrary (separable) Hilbert space $H_S$, with associated spectral projections $e_\Delta^{(s)} \in \mathcal{P}(H_S)$, $\Delta \subset \sigma(s)$, and likewise $a \in B(H_A)$. It is convenient (and entails no genuine loss of generality) to still assume that $\sigma(s) = \sigma(a)$. Recall that the Born measure $\mu_{\rho_S}^{(s)}$ on the spectrum $\sigma(s)$ induced by some density operator $\rho_S \in \mathcal{D}(H_S)$ is given by

$$
\mu_{\rho_S}^{(s)}(\Delta) = \mathrm{Tr}\left(\rho_S e_\Delta^{(s)}\right) = \omega_S\left(e_\Delta^{(s)}\right) = \mu_{\omega_S}^{(s)}(\Delta), \tag{11.3}
$$

cf. (4.9), where $\omega_S$ is the state associated to $\rho_S$ by (2.33), and no notational confusion between $\mu_{\rho_S}^{(s)}$ and $\mu_{\omega_S}^{(s)}$ should arise (they are the same thing). Likewise for $a$.

Definition 11.1. *1. Let* $H$ *be a Hilbert space and let* $b \in B(H)_{\mathrm{sa}}$*. Two (normal) states* $\omega, \omega'$ *on* $B(H)$ *are called* $b$-distinguishable *if* $\mu_\omega^{(b)} \neq \mu_{\omega'}^{(b)}$*; in other words,*

	- $1_{H_S} \otimes a$*-distinguishability of the two states* $u(\rho_S \otimes \rho_A)u^*$ *and* $u(\rho'_S \otimes \rho_A)u^*$*.*

For example, in the case of a discrete spectrum (for simplicity), if $\lambda_1 \neq \lambda_2$ in $\sigma(b)$, then any two unit eigenvectors $\upsilon_i^{(b)}$ ($i = 1, 2$) give rise to $b$-distinguishable vector states $\rho_i = |\upsilon_i^{(b)}\rangle\langle\upsilon_i^{(b)}|$. If $\psi = c_1\upsilon_1^{(b)} + c_2\upsilon_2^{(b)}$ with $|c_1|^2 + |c_2|^2 = 1$ and $c_1 \neq 0, 1$, then also the trio $(\rho_1, \rho_2, e_\psi)$ is pairwise $b$-distinguishable. If, on the other hand, $\lambda \in \sigma(b)$ is degenerate, then $e_\psi$ and $e_{\psi'}$ fail to be $b$-distinguishable whenever $\psi, \psi' \in H_\lambda$.
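These examples are easy to verify numerically. The following sketch assumes, for illustration only, a two-dimensional Hilbert space in which $b$ is diagonal in the computational basis, so that its Born measure is read off from the diagonal of the density matrix. The function names `born_measure` and `distinguishable` and the chosen eigenvalues are hypothetical.

```python
from math import sqrt

def projector(psi):
    """Return the rank-one density matrix |psi><psi|."""
    return [[a * complex(b).conjugate() for b in psi] for a in psi]

def born_measure(rho, spectrum):
    """Born probabilities for an observable b that is diagonal in the
    computational basis with eigenvalues `spectrum` (a repeated value
    encodes a degenerate eigenvalue)."""
    measure = {}
    for i, lam in enumerate(spectrum):
        measure[lam] = measure.get(lam, 0.0) + rho[i][i].real
    return measure

def distinguishable(rho1, rho2, spectrum, tol=1e-12):
    """True iff the two Born measures for b differ on some eigenvalue."""
    m1 = born_measure(rho1, spectrum)
    m2 = born_measure(rho2, spectrum)
    return any(abs(m1[lam] - m2[lam]) > tol for lam in m1)

# Illustrative amplitudes (an assumption), normalized to one.
c1, c2 = sqrt(0.3), sqrt(0.7)
rho_1 = projector([1.0, 0.0])        # eigenstate for the first eigenvalue
rho_2 = projector([0.0, 1.0])        # eigenstate for the second eigenvalue
e_psi = projector([c1, c2])          # superposition c1 v_1 + c2 v_2

nondegenerate = [1.0, 2.0]           # lambda_1 != lambda_2
degenerate = [1.0, 1.0]              # a single degenerate eigenvalue

pairwise = [distinguishable(rho_1, rho_2, nondegenerate),
            distinguishable(rho_1, e_psi, nondegenerate),
            distinguishable(rho_2, e_psi, nondegenerate)]
collapsed = distinguishable(rho_1, rho_2, degenerate)
```

With the nondegenerate spectrum all three entries of `pairwise` are `True`, while `collapsed` is `False` in the degenerate case. Since only the diagonal of the density matrix enters, states such as `projector([c1, c2])` and `projector([c1, -c2])` are not $b$-distinguishable either: the Born measure for $b$ is blind to relative phases.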

Clause 2 of Definition 11.1—which incorporates a vast number of at least theoretical scenarios—is a considerable weakening of the scheme (11.1), while clause 3 sharpens the second, implying that measurement transfers all Born probabilities for the object to the apparatus, probabilistically making the latter a mirror image of the former. Clause 4 firstly takes care of continuous spectra; if σ(*a*) is discrete, one may simply partition it by its points (a partition of σ(*a*) is sometimes called a *reading scale*). The "objectification" terminology is questionable (if not outright misleading), as it is motivated by the ignorance interpretation of mixtures (see below), but we follow the literature in using it. In what follows, we exclude the trivial cases where σ(*s*) consists of a single point, and/or σ(*a*) is partitioned by itself.

Theorem 11.2. *For any nontrivial object observable* $s$ *and partitioning of* $\sigma(a)$*, there exists no measurement scheme* $(\rho_A, u)$ *for* $s$ *whose final state* $u(\rho_S \otimes \rho_A)u^*$ *objectifies* $a$ *for any initial system state* $\rho_S$ *(let alone one that preserves probabilities).*

*Proof.* Since we will not use this theorem (except for pointing out that it attacks a straw man), we just prove it in the special case where $\sigma(a)$ is discrete and partitioned by its points, and also the spectral decomposition $\rho_A = \sum_n p_n e_n$ of the initial apparatus state is unique, cf. (B.490). For any unit vector $\upsilon^{(s)} \in H_S$ we then have

$$
u(e_{\upsilon^{(s)}} \otimes \rho_A)u^* = \sum_n p_n u(e_{\upsilon^{(s)}} \otimes e_n)u^*. \tag{11.4}
$$

Take $\lambda_1 \neq \lambda_2$ in $\sigma(s)$, with associated eigenvectors $\upsilon_1^{(s)}$ and $\upsilon_2^{(s)}$. If $e_n = |\alpha_n\rangle\langle\alpha_n|$, for unit vectors $\alpha_n \in H_A$, then objectification of $a$ requires that each of the vectors

$$
u(\upsilon_1^{(s)} \otimes \alpha_n), \quad u(\upsilon_2^{(s)} \otimes \alpha_n), \quad u((c_1 \upsilon_1^{(s)} + c_2 \upsilon_2^{(s)}) \otimes \alpha_n),
$$

with $|c_1|^2 + |c_2|^2 = 1$ and $c_1 \neq 0, 1$, must be an eigenvector of $1_{H_S} \otimes a$. This is only possible if the first two vectors (and hence the third) lie in the same eigenspace for $1_{H_S} \otimes a$, but in that case condition no. 2 in Definition 11.1 is violated, since the three given initial system states are pairwise $s$-distinguishable whereas the corresponding outcome states just listed evidently fail to be $1_{H_S} \otimes a$-distinguishable. □

Insolubility theorems of the second kind describe the *problem of outcomes*, according to which clauses 1., 2., and 3. of the problem of statistics also contradict:

*4'. Measurements have determinate outcomes.*

Technical statements to this effect are even more straightforward than those formalizing the problem of statistics. We keep *HS* and *s* ∈ *B*(*HS*) as they were, but this time, *HA* may refer to the rest of the Universe outside the quantum object described by *HS* (which includes the pointer, of course). Here is the key assumption.

Definition 11.3. *Let* $s \in B(H_S)_{\mathrm{sa}}$ *be an object observable with partition* $\sigma(s) = \bigsqcup_{i \in I} \Delta_i$ *of its spectrum (if* $\sigma(s) = \{\lambda_1, \ldots\}$ *is discrete, one may take* $\Delta_i = \{\lambda_i\}$*), and let* $H_A$ *be a second Hilbert space. A* sound measurement scheme *consists of:*

• *A collection* $(S_i)_{i \in I}$ *of* outcome spaces*, i.e. subsets of the (normal) state space,*

$$S_i \subset S_n(H_S \otimes H_A) \cong \mathcal{D}(H_S \otimes H_A), \tag{11.5}$$

*for which there is* $0 \le \eta < 1/2$ *such that for* $i \neq j$*, one has*

$$2\sqrt{1-\eta} \le \|\omega_i - \omega_j\| \le 2 \quad (\omega_i \in S_i,\ \omega_j \in S_j). \tag{11.6}$$

• *A pair* $(\rho_A, u)$*, where* $\rho_A$ *is a density operator on* $B(H_A)$ *and* $u$ *is a unitary on* $H_S \otimes H_A$*, such that for each* $i \in I$ *and each unit vector* $\upsilon_i^{(s)} \in H_{\Delta_i}$ *(i.e.,* $e_{\Delta_i}\upsilon_i^{(s)} = \upsilon_i^{(s)}$*), the state* $u(e_{\upsilon_i^{(s)}} \otimes \rho_A)u^*$ *(i.e. the outcome of the measurement) lies in* $S_i$*.*

In (11.6) the first bound (which for small $\eta$ is $\approx (2 - \eta) \le \cdots$) is the key one, as the last one $\le 2$ is always satisfied and has been included for clarity. In particular, since $\eta < 1/2$ implies $2\sqrt{1-\eta} > \sqrt{2}$,

$$\|\omega_i - \omega_j\| > \sqrt{2}. \tag{11.7}$$

Note that (11.6) implies that the $S_i$ must be disjoint, since assuming $\omega \in S_i$ gives $\|\omega - \omega_j\| \ge 2\sqrt{1-\eta}$ for all $\omega_j \in S_j$, whereas $\omega \in S_j$ allows one to take $\omega_j = \omega$ in this inequality, leading to the contradiction $0 \ge 2\sqrt{1-\eta}$. Note that in terms of density operators we have

$$\|\omega_i - \omega_j\| = \|\rho_i - \rho_j\|_1, \tag{11.8}$$

where $\omega_i(a) = \mathrm{Tr}(\rho_i a)$, cf. (B.481) and Theorem B.146. If $\omega_i$ and $\omega_j$ are pure, induced by unit vectors $\psi_i$ and $\psi_j$ in $H_S \otimes H_A$, then by (C.637), eq. (11.6) comes down to

$$0 \le |\langle \psi_i, \psi_j \rangle|^2 \le \eta. \tag{11.9}$$

For example, in the von Neumann measurement scheme (11.1), the outcome space $S_i$ just consists of the vector state defined by $\upsilon_i^{(s)} \otimes \upsilon_i^{(a)}$, hence (11.6) holds with $\eta = 0$.

Theorem 11.4. *For any nontrivial object observable* $s$ *and partitioning of* $\sigma(s)$*, any sound measurement scheme* $((S_i), \eta, \rho_A, u)$ *admits initial states* $\upsilon \in H_S$ *such that* $u(e_\upsilon \otimes \rho_A)u^*$ *(i.e. the post-measurement state) does not lie in any outcome space* $S_i$*.*

*Proof.* Let $\upsilon = (\upsilon_i + \upsilon_j)/\sqrt{2}$, where $i \neq j$ and for the moment $\upsilon_i$ and $\upsilon_j$ are merely orthonormal vectors in $H_S$. For each $i = 1, 2$ we then compute:

$$\begin{aligned}
\left\| u(e_\upsilon \otimes \rho_A)u^* - u(e_{\upsilon_i} \otimes \rho_A)u^* \right\|_1^{(H_S \otimes H_A)} &= \left\| e_\upsilon \otimes \rho_A - e_{\upsilon_i} \otimes \rho_A \right\|_1^{(H_S \otimes H_A)} \\
&= \left\| e_\upsilon - e_{\upsilon_i} \right\|_1^{(H_S)} \\
&= \left\| \omega_\upsilon - \omega_{\upsilon_i} \right\| \\
&= 2\sqrt{1 - |\langle \upsilon, \upsilon_i \rangle|^2} \\
&= \sqrt{2},
\end{aligned} \tag{11.10}$$

where $\|\cdot\|_1^{(H)}$ denotes the trace norm relative to $H$. Now take $\upsilon_i = \upsilon_i^{(s)}$ as in Definition 11.3. Since $\omega_i \equiv u(e_{\upsilon_i^{(s)}} \otimes \rho_A)u^* \in S_i$ by definition of a sound measurement, it follows from (11.7) and (11.10) that $\omega \equiv u(e_\upsilon \otimes \rho_A)u^*$ cannot lie in any outcome space $S_k$, since that would require $\|\omega - \omega_l\| > \sqrt{2}$ for all $l \neq k$, whereas (11.10) shows that this inequality fails for at least two values of $l$, viz. $l = i$ and $l = j \neq i$. □
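The numerical heart of this proof is the distance $\sqrt{2}$ between the superposition and either of its components, which can be checked in a few lines. The sketch below is an illustration under simplifying assumptions: it works in a two-dimensional $H_S$ only, i.e. it verifies the last three equalities of the computation above and leaves aside the unitary $u$ and the apparatus state $\rho_A$, which drop out of the trace norm. The helper names `trace_norm_2x2`, `state_distance` and `required_separation` are hypothetical.

```python
from math import sqrt

def projector(psi):
    """Return the rank-one density matrix |psi><psi|."""
    return [[a * complex(b).conjugate() for b in psi] for a in psi]

def trace_norm_2x2(m):
    """Trace norm of a Hermitian 2x2 matrix: the sum of the absolute
    values of its two (real) eigenvalues, computed in closed form."""
    a, d = m[0][0].real, m[1][1].real
    b = m[0][1]
    mean = (a + d) / 2
    radius = sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return abs(mean + radius) + abs(mean - radius)

def state_distance(psi, phi):
    """Trace-norm distance between the vector states of psi and phi."""
    p, q = projector(psi), projector(phi)
    diff = [[p[i][j] - q[i][j] for j in range(2)] for i in range(2)]
    return trace_norm_2x2(diff)

def required_separation(eta):
    """Lower bound 2*sqrt(1 - eta) from (11.6), for 0 <= eta < 1/2."""
    return 2 * sqrt(1 - eta)

v_i = [1.0, 0.0]
v_j = [0.0, 1.0]
v = [(x + y) / sqrt(2) for x, y in zip(v_i, v_j)]   # (v_i + v_j) / sqrt(2)

distance_i = state_distance(v, v_i)
distance_j = state_distance(v, v_j)
```

Both `distance_i` and `distance_j` come out as $\sqrt{2}$ up to rounding, which is strictly below `required_separation(eta)` for every admissible `eta`. The superposed state is therefore too close to the states in two different outcome spaces to belong to any one of them.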

In order to circumvent Theorems 11.2 and 11.4, one should deny at least one of their explicit premises. Moreover, we note that postulate no. 3 (i.e. linearity of time-evolution) is always implicitly used in the form of the following counterfactual:

*If* $\psi_n$ *were* the initial state, then *for each* $n$ it *would* evolve (linearly) according to the Schrödinger equation with *given* Hamiltonian $h$. *If* the initial state *were* $\sum_n c_n \psi_n$, also then it *would* evolve according to the *same* Hamiltonian $h$.

This counterfactual should be added as a tacit assumption to all insolubility proofs (and also to informal statements of the measurement problem). As such, it may reasonably be denied (see §11.4), and such a denial puts assumption no. 4 in the *problem of statistics* in perspective, namely by denying the possibility that identical initial states can always be prepared in such a way that they evolve through exactly the same Hamiltonian. This leaves room for the following denials of some premise:


Current programs for solving the measurement problem neatly fall into this scheme:


Leaving most of these to the literature, we now turn to the instability approach (¬4).

#### 11.4 The Flea on Schrödinger's Cat

The conclusion of this lengthy historical and technical introduction is that there are (at least) two different formulations of the measurement problem, whose insolubility is expressed by Theorems 11.2 and 11.4, respectively (leaving apart lavish opportunities for disagreement about the precise formulation of the underlying assumptions, and not even speaking about the outright dismissal of the whole issue as a *Scheinproblem*). Thus the problem in question is evidently of a different kind from say the famous open conjectures in mathematics (like the Riemann hypothesis), where it is clear what the theorem is that needs to be proved. Nonetheless, despite its undeniable philosophical aspects, we see the measurement problem as a genuine physics problem concerned with the discrepancy between (quantum) theory and experiment, to be addressed by mathematical, physical, and philosophical analysis.

Well aware that different people typically draw different lessons from history, we will now, in the interest of motivating our approach to follow, draw our own (necessarily subjective) conclusions from the history of the measurement problem.


Fig. 11.1 *The waves crashed between the towering cliff of Scylla and the jagged rocks of Charybdis*. Colour lithograph by Gino D'Antonio. Reprinted with permission from Look and Learn Ltd.

On the other hand, both the Copenhagen Interpretation and the Swiss approach seem to have gone too far in the opposite direction: the former because it simply assumed (without providing any justification) that measurements have outcomes as soon as the apparatus is described classically, the latter in treating apparatuses as strictly infinite, and hence falling victim to Earman's Principle. The right approach, then, must be to define measurement as in the Copenhagen Interpretation, i.e. using a classical description of the apparatus whilst realizing it is ontologically a quantum system, and thusly navigate between Scylla (who treats measurement devices as arbitrary *quantum* systems) and Charybdis (who is too enthusiastic in taking infinite limits and hence in using a *classical* description).

3. Some kind of reality has to be attributed to the state of the system (though this reality cannot be "absolute", as in classical physics). In the algebraic approach to quantum theory adopted throughout the present book, the starting point is provided by the observables, relative to which states are defined. Since the doctrine of classical concepts drives us to switch between quantum-mechanical and classical descriptions, the reality of the quantum state is therefore *perspectival*. However, their perspectival nature does not make states less real; they say everything there is to say (at least by quantum theory) about some given level of description (which may be said to be chosen by the observer, and hence is intersubjective).

Thus the measurement problem arises in the way Schrödinger (rather than von Neumann) described it, although a precise framework has to be added to his poetry.

A framework that is precise both conceptually and mathematically is offered by *asymptotic emergence*, which we already encountered in our discussion of SSB in the previous chapter (see especially its preamble). To repeat the main points, we speak of asymptotic emergence if the following three conditions are all satisfied:


The root of the measurement problem (and hence the relevance of asymptotic emergence), then, lies in Bohr's requirement that the outcomes of measurements on systems defined within L be recorded in (at least the language of) H, so that, crucially, *measurement according to* L *is a notion external to* L (if only partly), in particular involving the relationship between L and H. None of the insolubility proofs of the measurement problem take this into account (although due to Butterfield's Principle these proofs remain relevant in a secondary way). The typical feature of H that would be emergent in the above sense if the measurement problem were unresolved is that every physical system subject to the theory H is ontologically in a pure state; in Schrödinger's words quoted in §11.1: in H, sharply focused photographs of states are always possible (and hence any uncertainty or chance is due to ignorance, as in classical physics). Now, whatever the ontological nature of states in L, the states they induce in H should be real in the above sense, i.e., pure. But this is precisely what does *not* seem to be the case in typical measurement situations (e.g., Schrödinger's Cat), where the post-measurement state on L induces a *mixed* state on H. Just as in the case of SSB, this violates Butterfield's Principle, which in the case at hand states that since H is an idealization of L, any physical effect in H must be foreshadowed in L: *as* L *approaches* H, sharp measurement outcomes (defined as pure states in H) must arise from at least approximate single measurement outcomes (i.e. "singly-peaked wave-functions") in *the relevant asymptotic regime of* L (since only these wave-functions give rise to pure classical states on H).

As noted before in the setting of SSB: violating Butterfield's Principle means violating Earman's Principle, which in turn leads to a violation of the link between theory and reality. It is worth spelling this out for the measurement problem:


It may now seem that invoking Butterfield's Principle has reduced the measurement problem to the usual one(s) described in the preceding sections. But look at the small print: in the Copenhagen Interpretation, single measurement outcomes only appear in some limiting "classical" regime of quantum mechanics.

"Deep inside" quantum mechanics, there is no need at all for the typical superposition ∑*<sup>n</sup> cn*ψ*<sup>n</sup>* to collapse into one of the states ψ*<sup>n</sup>* (unless one conflates the physical measurement problem with the philosophical problem of value indefiniteness). The external and asymptotic nature of measurement outcomes *causes* the measurement problem, but, as we shall see, at the same time it provides the key for its *solution*, since the collapse mechanism we propose is only effective asymptotically (so that it operates where it should and does not act where it should not). More precisely, by taking into account perturbations of the Hamiltonian that are tiny and ineffective in the quantum regime, but become hugely destabilizing in the classical regime (even before the actual limit), the wave-function of the apparatus will collapse.

Summarizing the preceding discussion, "our" measurement problem states that:

• *Certain* pure *post-measurement states of an (ontologically quantum-mechanical!) apparatus coupled to a microscopic quantum object induce* mixed *states on the apparatus (and on the composite)* once the apparatus is described classically*.*

This is a precise version of Schrödinger's Cat problem (rather than von Neumann's purely quantum-mechanical measurement problem), making it clear that at heart the problem does not lie with the (dis)appearance of interference terms (which is a red herring) but with the inability of quantum mechanics to predict single outcomes.

We now show by means of a simple example what it means to describe an ontologically quantum-mechanical apparatus classically, and outline the scenario we envisage for the solution of the measurement problem on the basis of this example. The *Spehner–Haake model* of the apparatus described below is too simple to be realistic, but nonetheless it may serve its purpose (as Bohr would say). The model involves a double-well potential like (10.11), modified however by a little basin in the middle, as shown below (including ground states for one large and one small value of ħ). Also here, SSB will play a crucial role, so please recall §10.1.

Fig. 11.2 Double-well potential with basin; ground states ψ<sup>(0)</sup><sub>ħ=0.5</sub> and ψ<sup>(0)</sup><sub>ħ=0.01</sub>.

Consider *N*′ ≡ *N* + 1 non-interacting particles, each with mass *m*, moving on the real line under the influence of a one-particle potential *V* (note that although the zeroth particle will be handled slightly differently from the others, it is not the pointer!). In terms of the canonical coordinates (**p**′, **q**′) = (*p*<sub>0</sub>, ..., *p<sub>N</sub>*, *q*<sub>0</sub>, ..., *q<sub>N</sub>*) ∈ ℝ<sup>2*N*′</sup> on the phase space *X* = *T*<sup>∗</sup>ℝ<sup>*N*′</sup> the classical Hamiltonian is

$$h(\mathbf{p}', \mathbf{q}') = \sum\_{n'=0}^{N} \left( \frac{p\_{n'}^2}{2m} + V(q\_{n'}) \right). \tag{11.11}$$

Now perform a canonical transformation to center of mass and relative coordinates

$$P = \sum\_{n'=0}^{N} p\_{n'} \qquad \qquad \qquad \mathcal{Q} = \frac{1}{N'} \sum\_{n'=0}^{N} q\_{n'}; \tag{11.12}$$

$$\pi\_n = \sqrt{N'}p\_n - \frac{1}{\sqrt{N'}}\sum\_{n'=0}^{N} p\_{n'} \qquad \rho\_n = \frac{1}{\sqrt{N'}}(q\_n - q\_0) \ (n = 1, \dots, N); \tag{11.13}$$

the center of mass (*P*,*Q*) will be the pointer. The inverse transformation is given by

$$p\_0 = \frac{P}{N'} - \frac{1}{\sqrt{N'}} \sum\_{n=1}^{N} \pi\_n;\tag{11.14}$$

$$p\_n = \frac{P}{N'} + \frac{1}{\sqrt{N'}} \pi\_n;\tag{11.15}$$

$$q\_0 = \mathcal{Q} - \frac{1}{\sqrt{N'}} \sum\_{n=1}^{N} \rho\_n;\tag{11.16}$$

$$q\_n = \mathcal{Q} + \sqrt{N'} \rho\_n - \frac{1}{\sqrt{N'}} \sum\_{k=1}^{N} \rho\_k. \tag{11.17}$$

Granted that {*p<sub>n</sub>*, *q<sub>k</sub>*} = δ<sub>*nk*</sub>, {*p<sub>n</sub>*, *p<sub>k</sub>*} = 0, and {*q<sub>n</sub>*, *q<sub>k</sub>*} = 0, we then duly have {*P*, *Q*} = 1 and {π<sub>*n*</sub>, ρ<sub>*k*</sub>} = δ<sub>*nk*</sub>, with all other elementary Poisson brackets vanishing.
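Since the transformation (11.12)-(11.13) is linear, the stated Poisson brackets can be verified mechanically. The sketch below is a hedged illustration rather than part of the model: it represents each new coordinate by its coefficients with respect to the old ones and uses the sign convention {*p<sub>n</sub>*, *q<sub>k</sub>*} = δ<sub>*nk*</sub> from the text. The function names (`new_coordinates`, `bracket`, `expected_bracket`, `is_canonical`) are assumptions introduced here.

```python
import math


def new_coordinates(N):
    """Coefficient rows for P, Q, pi_n, rho_n as in (11.12)-(11.13).

    Each coordinate is a linear function sum_i a[i]*p_i + sum_i b[i]*q_i
    of the old coordinates (i = 0, ..., N) and is stored as the pair (a, b).
    """
    n_prime = N + 1
    root = math.sqrt(n_prime)
    zero = [0.0] * n_prime
    coords = {
        "P": ([1.0] * n_prime, list(zero)),
        "Q": (list(zero), [1.0 / n_prime] * n_prime),
    }
    for n in range(1, N + 1):
        a = [-1.0 / root] * n_prime   # -(1/sqrt(N')) * sum of all p_i
        a[n] += root                  # + sqrt(N') * p_n
        b = list(zero)
        b[n] = 1.0 / root             # (q_n - q_0) / sqrt(N')
        b[0] = -1.0 / root
        coords[f"pi_{n}"] = (a, list(zero))
        coords[f"rho_{n}"] = (list(zero), b)
    return coords


def bracket(f, g):
    """Poisson bracket of two linear functions, with {p_n, q_k} = delta_nk."""
    (a, b), (c, d) = f, g
    return sum(ai * di - bi * ci for ai, bi, ci, di in zip(a, b, c, d))


def expected_bracket(name_f, name_g):
    """Canonical value: {P, Q} = 1, {pi_n, rho_n} = 1, antisymmetric, else 0."""
    conjugate = {"P": "Q"}
    if name_f.startswith("pi_"):
        conjugate[name_f] = "rho_" + name_f[3:]
    if name_g.startswith("pi_"):
        conjugate[name_g] = "rho_" + name_g[3:]
    if conjugate.get(name_f) == name_g:
        return 1.0
    if conjugate.get(name_g) == name_f:
        return -1.0
    return 0.0


def is_canonical(N, tol=1e-12):
    """Check all elementary brackets of the new coordinates for given N."""
    coords = new_coordinates(N)
    return all(
        abs(bracket(f, g) - expected_bracket(nf, ng)) <= tol
        for nf, f in coords.items()
        for ng, g in coords.items()
    )


print(is_canonical(1), is_canonical(6))
```

Working with coefficient vectors avoids any symbolic algebra package; the price is that the check only covers linear functions, which is all that is needed here.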

In terms of the new coordinates, the classical Hamiltonian (11.11) reads

$$h(P, \mathcal{Q}, \boldsymbol{\pi}, \boldsymbol{\rho}) = h\_{A}(P, \mathcal{Q}) + h\_{AE}(\mathcal{Q}, \boldsymbol{\rho}) + h\_{E}(\boldsymbol{\pi}),\tag{11.18}$$

where **π** = (π<sub>1</sub>, ..., π<sub>*N*</sub>), **ρ** = (ρ<sub>1</sub>, ..., ρ<sub>*N*</sub>), and the three partial Hamiltonians are

$$h\_A(P, \mathcal{Q}) = \frac{P^2}{2M} + N^\prime V(\mathcal{Q});\tag{11.19}$$

$$h\_E(\boldsymbol{\pi}) = \frac{1}{2M} \left( \sum\_{n=1}^{N} \pi\_n^2 + \left( \sum\_{n=1}^{N} \pi\_n \right)^2 \right);\tag{11.20}$$

$$h\_{AE}(\mathcal{Q}, \boldsymbol{\rho}) = \sum\_{k=1}^{\infty} \frac{1}{k!} f\_k(\boldsymbol{\rho}) V^{(k)}(\mathcal{Q}),\tag{11.21}$$

where *M* = *N*′*m* is the total mass of the system, for simplicity we assumed *V* to be analytic (it will even be taken to be polynomial), and we abbreviated

$$f\_k(\boldsymbol{\rho}) = \left(-\frac{1}{\sqrt{N'}} \sum\_{l=1}^N \boldsymbol{\rho}\_l\right)^k + \sum\_{n=1}^N \left(\sqrt{N'}\boldsymbol{\rho}\_n - \frac{1}{\sqrt{N'}} \sum\_{l=1}^N \boldsymbol{\rho}\_l\right)^k. \tag{11.22}$$

Note that *f*<sub>1</sub>(**ρ**) = 0, so that to lowest order (i.e. *k* = 2) we have

$$h\_{AE}(\mathcal{Q}, \boldsymbol{\rho}) = \left(\tfrac{1}{2}N \sum\_{n=1}^{N} \rho\_{n}^{2} - \sum\_{k \neq l}^{N} \rho\_{k} \rho\_{l}\right) V''(\mathcal{Q}) + \cdots \tag{11.23}$$
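The decomposition (11.18) and the two statements about *f*<sub>1</sub> and *f*<sub>2</sub> can be tested numerically for a polynomial potential, whose Taylor series in (11.21) terminates. The following sketch is a sanity check under stated assumptions, not a derivation: the quartic *V*, the mass *m*, and the random phase-space point are hypothetical, *M* is taken to be (*N* + 1)*m*, and the last comparison reads the sum over *k* ≠ *l* in (11.23) as a sum over unordered pairs.

```python
import math
import random


def f_k(rho, k):
    """The coefficient f_k(rho) of (11.22); N is len(rho), N' = N + 1."""
    root = math.sqrt(len(rho) + 1)
    shift = sum(rho) / root
    return (-shift) ** k + sum((root * r - shift) ** k for r in rho)


def poly_eval(coeffs, x):
    """Evaluate sum_i coeffs[i] * x**i."""
    return sum(c * x ** i for i, c in enumerate(coeffs))


def poly_derivative(coeffs):
    """Coefficients of the derivative of the polynomial."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def old_coordinates(P, Q, pi, rho):
    """Inverse transformation (11.14)-(11.17)."""
    n_prime = len(pi) + 1
    root = math.sqrt(n_prime)
    p = [P / n_prime - sum(pi) / root] + [P / n_prime + x / root for x in pi]
    q = [Q - sum(rho) / root] + [Q + root * r - sum(rho) / root for r in rho]
    return p, q


def h_original(p, q, V, m):
    """Hamiltonian (11.11) for a polynomial potential V."""
    return sum(x * x / (2 * m) + poly_eval(V, y) for x, y in zip(p, q))


def h_split(P, Q, pi, rho, V, m):
    """Sum h_A + h_AE + h_E of (11.19)-(11.21), assuming M = N' * m."""
    n_prime = len(pi) + 1
    M = n_prime * m
    h_A = P * P / (2 * M) + n_prime * poly_eval(V, Q)
    h_E = (sum(x * x for x in pi) + sum(pi) ** 2) / (2 * M)
    h_AE, dV, k = 0.0, list(V), 0
    while len(dV) > 1:                 # finite Taylor series of a polynomial
        dV, k = poly_derivative(dV), k + 1
        h_AE += f_k(rho, k) * poly_eval(dV, Q) / math.factorial(k)
    return h_A + h_AE + h_E


random.seed(0)
N, m = 4, 1.5
V = [0.0, 0.1, -1.0, 0.0, 1.0]         # hypothetical V(q) = q^4 - q^2 + 0.1 q
P, Q = random.uniform(-1, 1), random.uniform(-1, 1)
pi = [random.uniform(-1, 1) for _ in range(N)]
rho = [random.uniform(-1, 1) for _ in range(N)]

p, q = old_coordinates(P, Q, pi, rho)
lhs, rhs = h_original(p, q, V, m), h_split(P, Q, pi, rho, V, m)
pairs = sum(rho[k] * rho[l] for k in range(N) for l in range(k + 1, N))
half_f2 = 0.5 * N * sum(r * r for r in rho) - pairs
print(lhs, rhs, f_k(rho, 1), f_k(rho, 2) / 2, half_f2)
```

The first two printed values should coincide, the third should vanish up to rounding, and the last two should agree. A single random point is of course no proof, but it catches sign and normalization slips in the coordinate change.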

We pass to the corresponding quantum-mechanical Hamiltonians in the usual way, and couple a two-level quantum system to the apparatus through the Hamiltonian

$$h\_{\rm SA} = \mu \cdot \sigma\_{\rm 3} \otimes P,\tag{11.24}$$

where the object observable *s* = σ<sub>3</sub>, acting on *H<sub>S</sub>* = ℂ<sup>2</sup>, is to be measured. The idea is that *h<sub>A</sub>* is the Hamiltonian of a pointer that registers outcomes by localization on the real line, *h<sub>E</sub>* is the (free) Hamiltonian of the "environment", realized as the internal degrees of freedom of the total apparatus that are not used in recording the outcome of the measurement, and *h<sub>AE</sub>* describes the pointer-environment interaction. The classical description of the apparatus then involves two approximations:


The measurement of *s* is now expected to unfold according to the following scenario:


Thus the classical description of the apparatus is at the same time the root of the measurement problem and the key to its solution: it creates the problem because at first sight a Schrödinger Cat state has the wrong classical limit (namely a mixture), but it also solves it, because precisely in the classical limit Cat states are destabilized even by the tiniest (asymmetric) perturbations and collapse to the "right" states.

The "flea" perturbation might itself be a genuine random process, perhaps ultimately of quantum origin. In that case, the measurement merely amplifies the randomness that was already inherent in the flea by transferring it to the apparatus.

Alternatively, the flea might be fundamentally deterministic (though it may nonetheless be modeled stochastically for pragmatic reasons). In principle, this would open the door to a restoration of determinism: for the flea now transfers its *determinism* (rather than its *randomness*) to the apparatus. The mistaken impression that quantum theory implies the irreducible randomness of nature then arises because although measurement outcomes are determined, they are unpredictable "for all practical purposes", even in a way that (because of the exponential sensitivity to the flea in 1/ħ or *N*) dwarfs the unpredictability of classical chaotic systems.

Either way, the flea perturbation would naturally be different at each different run of an experiment under otherwise identical initial conditions, which motivates our critique of the counterfactual discussed after the proof of Theorem 11.4.

The location of the flea plays a similar role to the position variable in Bohmian mechanics, i.e., it is essentially a hidden variable. Recall the notions of *Outcome Independence* (OI) and *Parameter Independence* (PI), reviewed in §6.5. Briefly, the conjunction of OI and PI is equivalent to Bell's locality condition, and if the latter is satisfied, then the Bell inequalities hold. Since these are violated by quantum mechanics, any hidden variable theory compatible with quantum mechanics must violate OI or PI. Deterministic hidden variable theories necessarily satisfy OI, in which case Bell's Theorem or the Free Will Theorem shows that they must violate PI in order to be compatible with quantum mechanics. A violation of PI leads to possible superluminal signaling only if the hidden variable *z* can be controlled. If the wave-function ψ is regarded as the hidden variable, then quantum theory itself satisfies PI but violates OI (since ψ *can* be prepared, the other way round would be disastrous). *Qua* deterministic hidden variable theory, Bohmian mechanics satisfies OI, and hence it violates PI; for the GRW collapse theory it is the other way round.

The fate of the flea therefore depends on the nature of the perturbation: if it is deterministic, the theory behaves like Bohmian mechanics in this respect and hence violates PI, whereas stochastic perturbations typically violate OI (and possibly also PI). Either way, no conflict with the said theorems arises. Moreover, in the Colbeck–Renner Theorem, assumption CP fails for the flea scenario (assuming, in view of its limitation to finite-dimensional Hilbert spaces, that the theorem is applicable at all!).

Besides such issues, others remain to be resolved, of which we just mention two:


#### Notes

## §11.1. The rise of orthodoxy

The literature on the measurement problem is vast. Apart from the annotated reprint volume Wheeler & Zurek (1983), relatively recent surveys and books include Bell (1990b), Maudlin (1995), Busch, Lahti, & Mittelstaedt (1996), Bassi & Ghirardi (2003), Mittelstaedt (2004), Wallace (2012), Allahverdyan, Balian, & Nieuwenhuizen (2013), and Busch, Lahti, Pellonpää, & Ylinen (2016). In modal interpretations of quantum mechanics, the measurement problem is (dubiously) conflated with the far milder problem of value indefiniteness, see e.g. Bub (1997).

## §11.2. The rise of modernity: Swiss approach and Decoherence

The Swiss approach to the measurement problem was initiated by Jauch (1964), to be continued by e.g. Hepp (1972), Emch & Whitten-Wolfe (1976), and recently also by Hepp's former student Fröhlich; see e.g. Fröhlich & Schubnel (2013) and Blanchard, Fröhlich & Schubnel (2016). In addition, see Landsman (1991, 1995), now seen as naive, as well as Breuer, Amann & Landsman (1993) and Sewell (2005).

Key early papers on decoherence were Zeh (1970), Zurek (1981), and Joos & Zeh (1985), and standard reviews are Zurek (2003), Joos et al (2003), and Schlosshauer (2007). Penetrating critiques include Janssen (2008) and Tanona (2013). See also Camilleri (2009a) and Freire (2009) for some history.

A defence of QBism may be found in Caves, Fuchs, & Schack (2002b).

## §11.3. Insolubility theorems

Insolubility theorems of the first kind go back to von Neumann (1932) and, in his wake, Wigner (1963) and Fine (1970). Theorem 11.2 is (in even more general form) due to Busch & Shimony (1996); with slightly different assumptions, the special case proved in the main text is due to Brown (1986). The monographs by Busch, Lahti, & Mittelstaedt (1996) and Mittelstaedt (2004) contain detailed discussions of theorems of this kind. See also Bacciagaluppi (2014).

The formulation of the problem of statistics and the problem of outcomes is taken from Maudlin (1995). Theorem 11.4 is due to Bassi & Ghirardi (2003), although here it is presented in a form inspired by Grübl (2003).

For Bohmian mechanics see e.g. Goldstein (2013) and Bricmont (2016). A recent review of the GRW program and related dynamical collapse theories is Bassi et al (2013). Nowadays, the *locus classicus* for Many Worlds is Wallace (2012).

The time-evolution counterfactual discussed in the main text was inspired by the problem of free will, see the quotation of Dennett at the beginning of §6.3.

## §11.4. The Flea on Schrödinger's Cat

The approach to the measurement problem discussed here has its roots in Landsman & Reuvers (2013) and Landsman (2013), whose model at the time only involved the apparatus. This was criticized in van Heugten & Wolters (2016), many of whose points may be addressed by turning to the Spehner–Haake model, introduced by Spehner & Haake (2008). The ABN-model of Allahverdyan, Balian, & Nieuwenhuizen (2013) gives a similar picture; for a comparison see Spehner (2009).

## Chapter 12 Topos theory and quantum logic

The topos-theoretic approach to quantum mechanics (also known as *quantum toposophy*) has the same origin as the quantum logic programme initiated by Birkhoff and von Neumann, namely the feeling that classical logic is inappropriate for quantum theory and needs to be replaced by something else. For example, Schrödinger's Cat serves as an "intuition pump" for this feeling (at least in the naive view, dispensed with in Chapter 11, that it is neither alive nor dead). However, we feel that the quantum logic proposed by Birkhoff and von Neumann is:


Thus it would be preferable to have a quantum logic with exactly the opposite features, i.e., one that is distributive but drops the law of excluded middle: this suggests the use of *intuitionistic logic*. It is interesting to note that Birkhoff and von Neumann (who had earlier corresponded with Brouwer about possible intuitionistic aspects of game theory, notably chess) actually considered intuitionistic logic, but rejected it:

'The models for propositional calculi which have been considered in the preceding sections are also interesting from the standpoint of pure logic. Their nature is determined by quasiphysical and technical reasoning, different from the introspective and philosophical considerations which have had to guide logicians hitherto. Hence it is interesting to compare the modifications which they introduce into Boolean algebra, with those which logicians on "intuitionist" and related grounds have tried introducing. The main difference seems to be that whereas logicians have usually assumed that properties L71–L73 [i.e. (*a*′)′ = *a*, *a* ∩ *a*′ = ⊥, *a* ∪ *a*′ = ⊤, and *a* ⊂ *b* implies *a*′ ⊃ *b*′] of negation were the ones least able to withstand a critical analysis, the study of mechanics points to the *distributive identities* as the weakest link in the algebra of logic. (. . . ) Our conclusion agrees perhaps more with those critiques of logic, which find most objectionable the assumption that *a*′ ∪ *b* = ⊤ implies *a* ⊂ *b* (or, dually, the assumption that *a* ∩ *b*′ = ⊥ implies *b* ⊃ *a*—the assumption that to deduce an absurdity from the conjunction of *a* and not *b*, justifies one in inferring that *a* implies *b*).' (Birkhoff & von Neumann, 1936, p. 837).

As already made clear, then, our view is exactly the opposite. It is perhaps more striking that our position on (quantum) logic also differs from Bohr's:

'All departures from common language and ordinary logic are entirely avoided by reserving the word "phenomenon" solely for reference to unambiguously communicable information, in the account of which the word "measurement" is used in its plain meaning of standardized comparison.' (Bohr, 1996, p. 393)

Rather than *postulate* the logical structure of quantum mechanics, our goal is to *derive* it from our Bohrification ideology, more specifically, from the poset 𝒞(*A*) of all unital commutative C\*-subalgebras of a unital C\*-algebra *A*, ordered by inclusion. One may think of this poset as a mathematical home for Bohr's notion of *Complementarity*, in that each *C* ∈ 𝒞(*A*) represents some classical or experimental context, which has been decoupled from the others, *except for the inclusion relations, which relate* compatible *experiments* (in general there seem to be no preferred *pairs* of complementary subalgebras *C*, *C*′ ∈ 𝒞(*A*) that jointly generate *A*, although Bohr typically seems to have had such pairs in mind, e.g. position and momentum).

Quantum toposophy also accommodates the feeling that quantum mechanics is so radical that not just the actors of classical mechanics, but its whole stage must be replaced. This need is well expressed by the following quotation from Grothendieck, who created topos theory (but never witnessed its application to quantum theory):

'Passer de la mécanique de Newton à celle d'Einstein doit être un peu, pour le mathématicien, comme de passer du bon vieux dialecte provençal à l'argot parisien dernier cri. Par contre, passer à la mécanique quantique, j'imagine, c'est passer du français au chinois.' (Grothendieck, 1986, p. 61).<sup>1</sup>

Indeed, topos theory replaces even set theory, seen as the stage of classical mathematics and physics, by some other stage: each topos provides a "universe of discourse" in which to do mathematics. One major difference with set theory, then, is that logic in most toposes (including the ones we will use) is . . . intuitionistic!

This chapter presupposes familiarity with §C.11 on the logical side of the Gelfand isomorphism for commutative C\*-algebras, Appendix D on lattice theory and logic, and Appendix E on topos theory. Since this material is off the beaten track, as in Chapter 6 it may be helpful to provide a very brief guided tour through this chapter.

In §12.1 we first define the "quantum mechanical" topos T(*A*) that will act as the mathematical stage for the remainder of the chapter; it depends on some given (unital) C\*-algebra *A* only via the poset 𝒞(*A*). We then define C\*-algebras internal to any topos T (in which the natural numbers and hence the rationals can be defined), which notion we then apply to T = T(*A*), so as to define an internal C\*-algebra <u>*A*</u>, which turns out to be *commutative*. Following an interlude on constructive Gelfand spectra in §12.2, in §12.3 we then compute the internal Gelfand spectrum of <u>*A*</u> for *A* = *M<sub>n</sub>*(ℂ), and derive our intuitionistic logic of quantum mechanics from this, given by eqs. (12.95) - (12.96) and (12.103) - (12.107). We also discuss its (Kripke) semantics. In §12.4 we generalize these computations to arbitrary (unital) C\*-algebras *A*, culminating in Corollary 12.22. Finally, in §12.5 we relate this material both to the Kochen–Specker Theorem (which provided the original motivation for quantum toposophy) and to an attempt at ontology called "Daseinisation."

<sup>1</sup> 'For a mathematician, switching from Newton's mechanics to Einstein's must to some extent be like switching from a good old provincial dialect to Paris slang. In contrast, I imagine that switching to quantum mechanics amounts to switching to Chinese.' Translation by the author.

#### 12.1 C\*-algebras in a topos

Let *A* be a unital C\*-algebra (in Sets), with associated poset 𝒞(*A*) of all unital commutative C\*-subalgebras *C* ⊂ *A* ordered by inclusion. Regarding 𝒞(*A*) as a (posetal) category, in which there is a unique arrow *C* → *D* iff *C* ⊆ *D* and there are no other arrows, we obtain the topos T(*A*) of functors <u>*F*</u> : 𝒞(*A*) → Sets (*F* underlined!), i.e.,

$$\mathsf{T}(A) = [\mathcal{C}(A), \mathsf{Sets}].\tag{12.1}$$

Since for any poset *X* we have an isomorphism of categories [*X*, Sets] ≃ Sh(*X*), where *X* is endowed with the Alexandrov topology, see (E.84), we may alternatively write

$$\mathsf{T}(A) \simeq \mathsf{Sh}(\mathcal{C}(A)).\tag{12.2}$$

This alternative description will turn out to be very useful in computing the Gelfand spectrum of the internal commutative C\*-algebra <u>*A*</u> to be defined shortly. Since we occasionally switch between T(*A*) and the topos Sets, we underline objects (i.e., functors <u>*F*</u> : 𝒞(*A*) → Sets) of the former. In order to do some kind of Analysis in T(*A*), we need real numbers. In many toposes this is a tricky concept, but:

Proposition 12.1. *In* T(*A*)*, the* Dedekind reals *are given by the constant functor*

$$
\underline{\mathbb{R}}\_0: C \mapsto \mathbb{R}, \tag{12.3}
$$

*where C* ∈ 𝒞(*A*)*, with associated frame given by the functor*

$$\mathcal{O}(\underline{\mathbb{R}})\_0 : C \mapsto \mathcal{O}((\uparrow C) \times \mathbb{R}). \tag{12.4}$$

Similarly, we have complex numbers <u>ℂ</u> and their frame 𝒪(<u>ℂ</u>) in T(*A*).

*Proof.* In a general sheaf topos Sh(*X*), the Dedekind real numbers object is the sheaf (E.150), with frame (E.149). The point now is that each continuous function *f* ∈ *C*(𝒞(*A*), ℝ) on *X* = 𝒞(*A*) with the Alexandrov topology is locally constant.

To see this, suppose *C* ≤ *D* in *U*, and take *V* ⊆ ℝ open with *f*(*C*) ∈ *V*. Then *C* ∈ *f*<sup>−1</sup>(*V*) and *f*<sup>−1</sup>(*V*) is open by continuity of *f*. But the smallest open set containing *C* is ↑*C*, which contains *D*, so that *f*(*D*) ∈ *V*. Taking *V* = (*f*(*C*) − ε, ∞) gives the inequality *f*(*D*) > *f*(*C*) − ε for all ε > 0, whence *f*(*D*) ≥ *f*(*C*), whereas *V* = (−∞, *f*(*C*) + ε) yields *f*(*D*) ≤ *f*(*C*). Hence *f*(*C*) = *f*(*D*).

Thus we obtain (12.3) - (12.4) as special cases of (E.150) - (E.149). □
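The key step of the proof, that continuity in the Alexandrov topology forces *f*(*C*) = *f*(*D*) whenever *C* ≤ *D*, can be illustrated by brute force on a tiny poset. The sketch below is a toy model only: the three-element poset (a bottom element below two incomparable ones) is a hypothetical stand-in for 𝒞(*A*), and continuity of a finitely-valued function is tested via preimages of subsets of its value set, which suffices because any such subset is cut out by some open subset of ℝ.

```python
from itertools import chain, combinations, product

# Hypothetical finite poset: "bot" lies below the incomparable "a" and "b".
ELEMENTS = ("bot", "a", "b")
LEQ = {(x, x) for x in ELEMENTS} | {("bot", "a"), ("bot", "b")}


def up(x):
    """The principal upper set of x, i.e. the smallest Alexandrov-open set containing x."""
    return {y for y in ELEMENTS if (x, y) in LEQ}


def is_open(subset):
    """Alexandrov-open means upward closed."""
    return all(up(x) <= set(subset) for x in subset)


def subsets(values):
    """All subsets of a finite collection, as tuples."""
    vals = list(values)
    return chain.from_iterable(combinations(vals, r) for r in range(len(vals) + 1))


def is_continuous(f):
    """Preimages of all subsets of the (finite) value set must be open."""
    return all(
        is_open({x for x in ELEMENTS if f[x] in w})
        for w in subsets(set(f.values()))
    )


def is_constant_along_order(f):
    """f(C) = f(D) whenever C <= D."""
    return all(f[x] == f[y] for (x, y) in LEQ)


functions = [
    dict(zip(ELEMENTS, vals))
    for vals in product((0.0, 1.0), repeat=len(ELEMENTS))
]
continuous = [f for f in functions if is_continuous(f)]
print(len(functions), len(continuous))
```

Because the toy poset has a bottom element, the only continuous functions found are the constant ones, in line with the constant functor appearing in Proposition 12.1.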

Other objects of interest in T(*A*) that we will steadily use are:


$$
\underline{\Omega}\_0(C) = \text{Upper}(C);\tag{12.5}
$$

$$\underline{\Omega}\_1(C \subseteq D) = (-) \cap (\uparrow D),\tag{12.6}$$

where Upper(*C*) is the set of all upper sets above *C* (i.e., *S* ∈ Upper(*C*) iff *S* ⊂ 𝒞(*A*) such that: (i) *C* ⊆ *D* for each *D* ∈ *S*, and (ii) *D* ∈ *S* and *D* ⊆ *E* imply *E* ∈ *S*).

• The *subobject classifier* *t* : <u>1</u> → <u>Ω</u>, which is a natural transformation whose components *t<sub>C</sub>* are given, according to (E.88), as

$$t\_C(\*) = \uparrow C,\tag{12.7}$$

i.e., the set of all *D* ⊇ *C* in 𝒞(*A*); this is the maximal element of Upper(*C*).
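To make (12.5)-(12.7) concrete, one may enumerate the upper sets above each element of a small poset and apply the restriction map. The following sketch is illustrative only: the three-element poset is a hypothetical stand-in for 𝒞(*A*), and the names `upper_sets_above` and `restrict` are introduced here for the maps in (12.5) and (12.6).

```python
from itertools import combinations

# Hypothetical finite poset: "bot" lies below the incomparable "a" and "b".
ELEMENTS = ("bot", "a", "b")
LEQ = {(x, x) for x in ELEMENTS} | {("bot", "a"), ("bot", "b")}


def up(c):
    """The set of all d with c <= d; by (12.7) this is the value t_C(*)."""
    return frozenset(d for d in ELEMENTS if (c, d) in LEQ)


def upper_sets_above(c):
    """Upper(c): upward closed subsets of up(c), as in (12.5)."""
    base = sorted(up(c))
    found = []
    for r in range(len(base) + 1):
        for combo in combinations(base, r):
            s = frozenset(combo)
            if all(up(d) <= s for d in s):   # condition (ii): upward closed
                found.append(s)
    return found


def restrict(s, d):
    """The map Upper(c) -> Upper(d) for c <= d, given by (12.6)."""
    return s & up(d)


for c in ELEMENTS:
    print(c, [sorted(s) for s in upper_sets_above(c)])
```

In this toy case Upper(bot) has five elements and Upper(a) has two, so the truth values available at the bottom stage are genuinely richer than the two classical ones.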

Furthermore, exponentials in T(*A*) have the following straightforward description:

$$\underline{F}^{\underline{G}}\_0(C) = \text{Nat}(\underline{G}\_{\uparrow C}, \underline{F}\_{\uparrow C}) \quad (C \in \mathcal{C}(A)),\tag{12.8}$$

where <u>*F*</u><sub>↑*C*</sub> is the restriction of the functor <u>*F*</u> : 𝒞(*A*) → Sets to ↑*C* ⊆ 𝒞(*A*), and Nat(−,−) denotes the set of natural transformations between the functors in question. In particular, since ℂ · 1 is the bottom element of the poset 𝒞(*A*), one has

$$\underline{F}^{\underline{G}}(\mathbb{C}\cdot 1) = \text{Nat}(\underline{G}, \underline{F}).\tag{12.9}$$

One way to derive (12.8) is to start from general sheaf toposes Sh(*X*), where

$$F\_0^G(U) = \text{Nat}(G\_{|U}, F\_{|U}),\tag{12.10}$$

both restricted to O(*U*) (i.e. defined on each open *V* ⊆ *U* instead of all *V* ∈ O(*X*)), and use (E.84). Combining these observations, one has

$$\underline{\Omega}^{\underline{F}}(C) \cong \mathrm{Sub}(\underline{F}\_{\uparrow C}),\tag{12.11}$$

i.e., the set of subfunctors of <u>*F*</u><sub>↑*C*</sub>. In particular, like in (12.9), we find

$$\underline{\Omega}^{\underline{F}}(\mathbb{C}\cdot 1) \cong \operatorname{Hom}(\underline{F}, \underline{\Omega}) \cong \operatorname{Sub}(\underline{F}),\tag{12.12}$$

the set of subfunctors of <u>*F*</u> itself. Recall that, as explained after Lemma E.16, a subfunctor <u>*Z*</u> ∈ Sub(<u>*F*</u>) is a functor <u>*Z*</u> : 𝒞(*A*) → Sets for which <u>*Z*</u><sub>0</sub>(*C*) ⊆ <u>*F*</u><sub>0</sub>(*C*) for all *C* ∈ 𝒞(*A*) and <u>*Z*</u><sub>1</sub> is the restriction of <u>*F*</u><sub>1</sub>. If *C* ⊆ *D*, then the set-theoretic map <u>Ω</u><sup><u>*F*</u></sup>(*C*) → <u>Ω</u><sup><u>*F*</u></sup>(*D*) defined by <u>Ω</u><sup><u>*F*</u></sup>, identified with a map Sub(<u>*F*</u><sub>↑*C*</sub>) → Sub(<u>*F*</u><sub>↑*D*</sub>), is simply given by restricting a given subfunctor of <u>*F*</u><sub>↑*C*</sub> to ↑*D*.

Using either the internal language of a topos (see §E.5) or direct object-arrow constructions, one can copy standard definitions in set theory so as to define mathematical objects "internal" to any given topos, *as long as these definitions make sense in first-order intuitionistic logic* (which roughly speaking means that they are "constructive", in not using the axiom of choice or the law of the excluded middle).

As a case in point, let us now define *internal C\*-algebras* in T(*A*) (this may be done even more generally in any topos T in which at least the natural numbers ℕ, and hence the rationals ℚ, are defined). Vector spaces (over ℝ or ℂ) and (commutative) \*-algebras may be defined in T(*A*) through straightforward object-arrow translations of the usual constructions in Sets, i.e., one has an object <u>*A*</u> and arrows:

$$\cdot : \underline{\mathbb{C}} \times \underline{A} \to \underline{A} \quad (\text{scalar multiplication}); \tag{12.13}$$

$$+ : \underline{A} \times \underline{A} \to \underline{A} \quad (\text{addition});\tag{12.14}$$

$$\times : \underline{A} \times \underline{A} \to \underline{A} \quad (\text{multiplication});\tag{12.15}$$

$$\* : \underline{A} \to \underline{A} \quad (\text{involution}), \tag{12.16}$$

subject to the usual axioms. Syntactically, an (internal) *unit* in <u>*A*</u> is a constant

$$1\_{\underline{A}} : \underline{1} \to \underline{A},$$

with <u>1</u> the terminal object in T(*A*), such that

$$\left(\underline{A} \stackrel{\cong}{\longrightarrow} \underline{1} \times \underline{A} \stackrel{1\_{\underline{A}} \times \operatorname{id}\_{\underline{A}}}{\longrightarrow} \underline{A} \times \underline{A} \stackrel{\times}{\to} \underline{A}\right) = \left(\underline{A} \stackrel{\operatorname{id}\_{\underline{A}}}{\longrightarrow} \underline{A}\right). \tag{12.17}$$

The notions of norm and completeness are less easily defined internally, and hence one starts reinterpreting the notion of a *seminorm* in Sets as a subset

$$N \subset A \times \mathbb{Q}^+,\tag{12.18}$$

for which

$$(a,q) \in N \text{ iff } \|a\| < q. \tag{12.19}$$

In our topos T(*A*), we interpret *N* ⊂ *A* × ℚ<sup>+</sup> as a subfunctor <u>*N*</u> → <u>*A*</u> × <u>ℚ</u><sup>+</sup> (or, equivalently by λ-conversion (E.153), as an arrow <u>1</u> → <u>Ω</u><sup><u>*A*</u>×<u>ℚ</u><sup>+</sup></sup>), subject to the axioms:

$$\forall_p \left( p > 0 \to (0, p) \in \underline{N} \right); \tag{12.20}$$

$$\forall_a \exists_q \left( q > 0 \land (a, q) \in \underline{N} \right); \tag{12.21}$$

$$\forall_a \forall_p \left( (a, p) \in \underline{N} \to (a^*, p) \in \underline{N} \right); \tag{12.22}$$

$$\forall_a \forall_q \left( (a, q) \in \underline{N} \leftrightarrow \exists_p \left( p < q \land (a, p) \in \underline{N} \right) \right); \tag{12.23}$$

$$\forall_a \forall_b \forall_p \forall_q \left( (a, p) \in \underline{N} \land (b, q) \in \underline{N} \to (a + b, p + q) \in \underline{N} \right); \tag{12.24}$$

$$\forall_a \forall_b \forall_p \forall_q \left( (a, p) \in \underline{N} \land (b, q) \in \underline{N} \to (a \cdot b, p \cdot q) \in \underline{N} \right); \tag{12.25}$$

$$\forall_a \forall_p \forall_q \forall_z \left( (a, p) \in \underline{N} \land |z| < q \to (z \cdot a, p \cdot q) \in \underline{N} \right). \tag{12.26}$$

Here *a*, *b* are variables of type *A*, *p* and *q* are variables of type Q, *z* is a variable of type C, 0 is the zero constant in *A*, etc. For a unital \*-algebra (whose internal definition we leave to the reader), with unit denoted by 1<sub>*A*</sub> as usual, we also require

$$\Vdash \forall\_a \forall\_p p > 1 \to (1\_A, p) \in \underline{N}. \tag{12.27}$$

If the seminorm relation furthermore satisfies

$$(a^\* \cdot a, q^2) \in \underline{N} \leftrightarrow (a, q) \in \underline{N} \tag{12.28}$$

for all *a* ∈ *A* and *q* ∈ Q<sup>+</sup>, then *A* is said to be a *pre-semi-C\*-algebra*.
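To make these axioms concrete, the following Python sketch checks (12.20) - (12.28) on a finite sample in Sets, for the algebra of real pairs with the sup norm, encoding the seminorm as the relation (*a*, *q*) ∈ *N* iff ‖*a*‖ < *q*. The sample values, the helper names and the restriction to real entries (so that *a*\* = *a* and (12.22) is trivial) are assumptions made for illustration only; a finite check is of course no proof.

```python
from fractions import Fraction as Fr
from itertools import product

def norm(a):
    """Sup norm of a tuple of rationals."""
    return max(abs(x) for x in a)

def in_N(a, q):
    """The seminorm relation: (a, q) is in N iff ||a|| < q."""
    return norm(a) < q

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def mul(a, b):
    return tuple(x * y for x, y in zip(a, b))

def scale(z, a):
    return tuple(z * x for x in a)

# Illustrative finite samples (assumptions, not taken from the text).
values = [Fr(-3, 2), Fr(0), Fr(1, 3), Fr(2)]
elements = list(product(values, repeat=2))                 # sample of A = R^2
rationals = [Fr(1, 4), Fr(1, 2), Fr(1), Fr(5, 2), Fr(7)]   # sample of Q^+
zero, one = (Fr(0), Fr(0)), (Fr(1), Fr(1))

def check_axioms():
    ok = all(in_N(zero, p) for p in rationals)                    # (12.20)
    ok = ok and all(in_N(a, norm(a) + 1) for a in elements)       # (12.21)
    ok = ok and all(in_N(one, p) for p in rationals if p > 1)     # (12.27)
    for a, q in product(elements, rationals):
        if in_N(a, q):
            p = (norm(a) + q) / 2            # witness p < q for (12.23)
            ok = ok and p < q and in_N(a, p)
        ok = ok and (in_N(mul(a, a), q * q) == in_N(a, q))        # (12.28)
    for a, b, p, q in product(elements, elements, rationals, rationals):
        if in_N(a, p) and in_N(b, q):
            ok = ok and in_N(add(a, b), p + q)                    # (12.24)
            ok = ok and in_N(mul(a, b), p * q)                    # (12.25)
    for a, p, q, z in product(elements, rationals, rationals, values):
        if in_N(a, p) and abs(z) < q:
            ok = ok and in_N(scale(z, a), p * q)                  # (12.26)
    return ok

print(check_axioms())
```

Working with the relation *N* rather than with a real-valued norm function is the point of the exercise: only rational comparisons are used, which is what makes the definition available in a topos.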

To proceed to a C\*-algebra, one requires *a* = 0 whenever (*a*, *q*) ∈ *N* for all *q* in Q<sup>+</sup>, making the seminorm into a norm, and subsequently this normed space should be complete. The latter condition is quite complicated, since in a topos one has no Cauchy sequences in the usual sense, because *A* may not have global elements (in the sense of arrows 1 → *A*). Indeed, our algebra *A* defined below only has trivial global elements, namely multiples of the unit operator.

Hence one needs a generalization of Cauchy sequences in the general spirit of topos theory, where global elements are replaced by general elements.

Definition 12.2. *With* N *the natural numbers object in* T(*A*) *(which is simply the constant functor C* ↦ N*), a* Cauchy approximation *in A is an arrow s* : N → Ω<sup>*A*</sup> *(or, equivalently, by* λ*-conversion* (E.153)*, an arrow* χ : N × *A* → Ω*, which in turn is the same as a subobject S of* N × *A) such that:*

$$
\forall\_n \exists\_a a \in s\_n;
\tag{12.29}$$

$$\forall\_k \exists\_m \forall\_n \forall\_{n'} (n > m, n' > m, a \in \text{s}\_n, a' \in \text{s}\_{n'}) \to (a - a', 1/k) \in \underline{N}. \tag{12.30}$$

*Here (for brevity) the first three commas (but not the last!) stand for* ∧*, and a* ∈ *s<sub>n</sub> denotes* (*n*, *a*) ∈ *S, where S is the above subobject of* N × *A classified by* χ *(we use the notation explained in item 9 at the end of* §*E.5, where the variable x* : *X is now the pair* (*n*, *a*) *of type* N × *A). Moreover, a Cauchy approximation* converges *to b if:*

$$\forall_k \exists_m \forall_n \left( n > m, a \in s_n \right) \to (a - b, 1/k) \in \underline{N}, \tag{12.31}$$

*and we call A* complete *if each Cauchy approximation in A converges.*

*Finally, a* C\*-algebra in T(*A*) *(and similarly in any topos with natural numbers) is a complete pre-semi-C\*-algebra in which the semi-norm is a norm.*

Homomorphisms and isomorphisms between such (internal) C\*-algebras may be defined in the usual way, bijections in set theory being replaced by isomorphisms of objects. We only consider internal C\*-algebras with unit, so that we may define internal categories CA<sup>1</sup> (and CCA1) of (commutative) unital C\*-algebras in T(*A*) in the obvious way (where the homomorphisms are required to preserve the unit).

We now come to the basic construction that underlies "quantum toposophy".

Theorem 12.3. *Let A be a unital C\*-algebra. Define a functor A* ∈ T(*A*) *by*

$$
\underline{A} : \mathcal{C}(A) \to \mathbf{Sets};\tag{12.32}
$$

$$
\underline{A}_0(C) = C; \tag{12.33}
$$

$$\underline{A}\_{1}(C \subseteq D) = (C \hookrightarrow D). \tag{12.34}$$

*Then A is an internal unital* commutative *C\*-algebra under pointwise operations.*

Here *A* is meant to be an "ordinary" unital C\*-algebra, i.e., defined in Sets. Note that the symbol *C* in (12.33) changes character from left to right: on the left-hand side it is a *point* in C (*A*), whereas on the right-hand side it is a *subset* of *A*. Nonetheless, one might describe *A* as the *tautological functor* in [C (*A*),Sets].

The pointwise operations in *A* are the obvious natural transformations that are ultimately defined by the corresponding operations in each commutative C\*-algebra *C*. For example, addition + : *A* × *A* → *A* is a natural transformation with components +<sub>*C*</sub> : *C* × *C* → *C* defined in *C*, etc. Commutativity of *A* then trivially follows from commutativity of each *commutative* C\*-subalgebra *C*.

As already mentioned, the unit 1<sub>*A*</sub> is syntactically a constant 1<sub>*A*</sub> : 1 → *A*, whose components (1<sub>*A*</sub>)<sub>*C*</sub> : ∗ → *C* are just the units 1<sub>*C*</sub> in each *C* (recall that elements of our poset C (*A*) were defined as *unital* commutative C\*-subalgebras of *A*!).

Finally, we regard the (semi) norm *N* as a subobject of *A* × R<sup>+</sup> (or *A* × Q<sup>+</sup>), hence as a natural transformation, with components *N<sub>C</sub>* ⊂ *C* × R<sup>+</sup> defined by

$$(c, q) \in \underline{\mathbf{N}}\_{\mathcal{C}} \text{ iff } ||c|| < q,\tag{12.35}$$

where ‖·‖ is the norm in *C* (which of course is inherited from *A*).

*Proof.* The proof is a straightforward verification, except perhaps for completeness. First, the above subobject *S* of N × *A*, realized as a subfunctor as usual, looks as follows: for each *C* ∈ C (*A*) we have a subset *S<sub>C</sub>* ⊂ N × *C*, regarded as a sequence (*C<sub>n</sub>*) of subsets of *C* through the identification (*n*, *c*) ∈ *S<sub>C</sub>* iff *c* ∈ *C<sub>n</sub>*, such that *C<sub>n</sub>* ⊂ *D<sub>n</sub>* whenever *C* ⊂ *D*. Unfolding axiom (12.29) using the Kripke–Joyal semantics rules listed at the end of §E.5, we find that this axiom holds iff:

$$\forall\_{C \in \mathcal{C}(A)} \forall\_{n \in \mathbb{N}} \exists\_{c \in C} \forall\_{D \supseteq C} c \in D\_n,\tag{12.36}$$

which is satisfied iff each of the above subsets *C<sub>n</sub>* ⊆ *C* is non-empty. By a similar analysis, axiom (12.30) is satisfied iff for each ε > 0 there is *m* ∈ N such that for all *n*, *n*′ > *m* and all *c* ∈ *C<sub>n</sub>*, *c*′ ∈ *C<sub>n′</sub>* one has ‖*c* − *c*′‖ < ε in *C*. This simply means that any choice (*c<sub>n</sub>*) where *c<sub>n</sub>* ∈ *C<sub>n</sub>* is a Cauchy sequence in *C*. Accordingly, *A* is complete provided each such sequence converges, i.e., iff each *C* ∈ C (*A*) is complete. Since these *C*'s are C\*-subalgebras of *A*, this is simply true by construction. □

In a similar way, one easily proves the following generalization of Theorem 12.3:

Theorem 12.4. *Let* C *be a small category. Any internal C\*-algebra in the associated presheaf topos* [C<sup>op</sup>, Sets] *is given by a contravariant functor A* : C → CA*, where* CA *is the category that has C\*-algebras as objects and homomorphisms as arrows. Moreover, A is unital/commutative iff each C\*-algebra A*(*C*) *is unital/commutative.*

It should be mentioned that internal C\*-algebras on sheaf toposes T = Sh(*X*) are not covered by this theorem (except in the somewhat degenerate case we use, namely *X* = C (*A*) with the Alexandrov topology). As a case in point, we just mention the beautiful fact that internal C\*-algebras in Sh(*X*) correspond to continuous bundles of C\*-algebras over *X* (in Sets).

#### 12.2 The Gelfand spectrum in constructive mathematics

In this chapter we rely on a particular construction of the frame O(Σ(*A*)) (cf. §C.11) that can be generalized to topos theory (in which the Gelfand spectrum Σ(*A*) of an internal commutative C\*-algebra *A* is a locale). We start with some lattice lore.

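Definition 12.5. *Let L be a* distributive *lattice with top* ⊤ *and bottom* ⊥*.*

*1. A* down set *in L is a subset U* ⊆ *L such that x* ∈ *U and y* ≤ *x imply y* ∈ *U. The poset of down sets in L, ordered by inclusion, is called* D(*L*)*.*

*2. An* ideal *in L is a nonempty down set I such that x*, *y* ∈ *I implies x* ∨ *y* ∈ *I. The poset of ideals in L, ordered by inclusion, is called* Idl(*L*)*.*

*3. The relation x* ≪ *y holds iff there exists z* ∈ *L such that x* ∧ *z* = ⊥ *and y* ∨ *z* = ⊤*. Note that x* ≪ *y implies x* ≤ *y, since*

$$x = x \land (y \lor z) = (x \land y) \lor (x \land z) = x \land y \le y. \tag{12.37}$$

*4. An ideal I* ∈ Idl(*L*) *is* regular *if the condition I* ⊇ {*y* ∈ *L* | *y* ≪ *x*} *implies x* ∈ *I. The poset of regular ideals in L, ordered by inclusion, is called* RIdl(*L*)*, i.e.,*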

$$\text{RIdl}(L) = \{ I \in \text{Idl}(L) \mid (\forall_{y \in L}\, y \ll x \Rightarrow y \in I) \Rightarrow x \in I \}. \tag{12.38}$$

The posets D(*L*), Idl(*L*) and RIdl(*L*) are easily seen to be frames. Any ideal *I* ∈ Idl(*L*) can be *regularized*, i.e., turned into a regular ideal 𝒜(*I*), by means of the restriction to Idl(*L*) ⊂ D(*L*) of the "closure" map 𝒜 : D(*L*) → D(*L*) defined by

$$\mathcal{A}(I) = \{ x \in L \mid \forall_{y \in L}\, y \ll x \Rightarrow y \in I \}. \tag{12.39}$$

In terms of 𝒜, the canonical map *x* ↦ ↓*x* from *L* to Idl(*L*) "regularizes" to a map

$$f: L \to \text{RIdl}(L);\tag{12.40}$$

$$
x \mapsto \mathcal{A}(\downarrow x).\tag{12.41}
$$

For *I* ∈ RIdl(*L*) we obviously have 𝒜(*I*) = *I*, and hence we may write

$$\text{RIdl}(L) = \{ I \in \text{Idl}(L) \mid \mathcal{A}(I) = I \}. \tag{12.42}$$
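As a small illustration of regularization, the sketch below computes the closure map (12.39) on the three-element chain 0 < 1 < 2. It assumes that *y* ≪ *x* means that some *z* satisfies *y* ∧ *z* = ⊥ and *x* ∨ *z* = ⊤ (the reading used in the proof of Lemma 12.7) and that ideals are nonempty down sets closed under ∨; the chain and all function names are hypothetical choices made for this example.

```python
from itertools import combinations

# The chain 0 < 1 < 2 as a distributive lattice (illustrative choice).
L = [0, 1, 2]
BOTTOM, TOP = 0, 2
meet, join = min, max

def well_inside(y, x):
    """y << x iff some z has y ^ z = bottom and x v z = top (assumed reading)."""
    return any(meet(y, z) == BOTTOM and join(x, z) == TOP for z in L)

def is_ideal(I):
    """Nonempty down set that is closed under binary joins."""
    down = all(y in I for x in I for y in L if y <= x)
    joins = all(join(x, y) in I for x in I for y in I)
    return bool(I) and down and joins

def closure(I):
    """The map (12.39): all x such that every y << x lies in I."""
    return frozenset(x for x in L
                     if all(y in I for y in L if well_inside(y, x)))

subsets = [frozenset(c) for r in range(len(L) + 1)
           for c in combinations(L, r)]
ideals = [I for I in subsets if is_ideal(I)]
regular = [I for I in ideals if closure(I) == I]

print([sorted(I) for I in ideals])
print([sorted(I) for I in regular])
```

In this chain the only element well inside 1 is 0, so the ideal {0} is not regular: its regularization already contains 1. Only {0, 1} and the whole lattice survive as regular ideals.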

Definition 12.6. *1. A frame* O(*X*) *with top element* ⊤ *is called* compact *if every subset S* ⊂ O(*X*) *with* ⋁*S* = ⊤ *has a finite subset F* ⊂ *S with* ⋁*F* = ⊤*.*

*2. A frame* O(*X*) *is called* regular *if each V* ∈ O(*X*) *satisfies*

$$V = \bigvee \{ U \in \mathcal{O}(X) \mid U \ll V \}. \tag{12.43}$$

When O(*X*) is the topology of some space *X*, the frame O(*X*) is compact (regular) iff *X* is compact (regular) as a space. Furthermore, *X* is compact and Hausdorff iff it is compact and regular, and hence the Gelfand spectrum Σ(*A*) of a commutative unital C\*-algebra *A* will be a compact and regular frame; see Theorem 12.8 below.

Recall that the self-adjoint part *A*<sub>sa</sub> of any C\*-algebra *A* is partially ordered by putting *a* ≤ *b* iff *b* − *a* ∈ *A*<sup>+</sup>, cf. §C.7. This partial order is, of course, inherited by the positive cone *A*<sup>+</sup> ⊂ *A*<sub>sa</sub>. If *A* is commutative, this partial ordering makes *A*<sub>sa</sub> a lattice; for example, if *A* = *C*(*X*) the lattice operations are *a* ∨ *b* = max{*a*, *b*} and *a* ∧ *b* = min{*a*, *b*} (taken pointwise). In general, one may then compute ∨ and ∧ from the Gelfand isomorphism *A* ≅ *C*(*X*), but they are intrinsically defined via ≤.

Let *A* be a commutative unital C\*-algebra. For *a*, *b* ∈ *A*<sup>+</sup>, define *a* ≼ *b* iff there exists *n* ∈ N such that *a* ≤ *nb*. Define *a* ≈ *b* iff *a* ≼ *b* and *b* ≼ *a*. This is an equivalence relation. Moreover, ≈ is a *congruence*, that is, an equivalence relation ∼ on a lattice *L* that is compatible with ∧ and ∨ in the sense that *x* ∼ *y* and *x*′ ∼ *y*′ imply *x* ∧ *x*′ ∼ *y* ∧ *y*′ and *x* ∨ *x*′ ∼ *y* ∨ *y*′. Given some congruence ∼ on *L*, one may define ∧ and ∨ on *L*/∼ by [*x*] ∧ [*y*] = [*x* ∧ *y*] and [*x*] ∨ [*y*] = [*x* ∨ *y*], respectively, so that the set-theoretic quotient *L*/∼ inherits the lattice structure of *L* and hence is a lattice in its own right.

This quotient construction by a congruence preserves distributivity, so that

$$L_A = A^+ / \approx \tag{12.44}$$

is a distributive lattice. We will use the elements D<sub>*a*</sub> ≡ [*a*<sup>+</sup>] of *L<sub>A</sub>* (indexed by *a* ∈ *A*<sub>sa</sub>), where [*a*<sup>+</sup>] is the equivalence class in *L<sub>A</sub>* of the positive part *a*<sup>+</sup> in the canonical decomposition *a* = *a*<sup>+</sup> − *a*<sup>−</sup>, with *a*<sup>±</sup> ≥ 0 and *a*<sup>+</sup>*a*<sup>−</sup> = 0; lattice-theoretically, one has *a*<sup>+</sup> = *a* ∨ 0 and *a*<sup>−</sup> = −(*a* ∧ 0). This gives a lattice homomorphism *A*<sub>sa</sub> → *L<sub>A</sub>*, *a* ↦ D<sub>*a*</sub>, whose restriction to *A*<sup>+</sup> is just the canonical projection *A*<sup>+</sup> → *L<sub>A</sub>*. These D<sub>*a*</sub> satisfy:

$$\mathsf{D}_1 = \top;\tag{12.45}$$

$$\mathbf{D}\_a \wedge \mathbf{D}\_{-a} = \bot;\tag{12.46}$$

$$\mathsf{D}\_{a} = \bot \; (a \le 0);\tag{12.47}$$

$$\mathsf{D}\_{a+b} \leqslant \mathsf{D}\_{a} \lor \mathsf{D}\_{b};\tag{12.48}$$

$$\mathsf{D}\_{a} \wedge \mathsf{D}\_{b} \leqslant \mathsf{D}\_{ab};\tag{12.49}$$

$$
\mathsf{D}\_{ab} \leqslant \mathsf{D}\_{a} \lor \mathsf{D}\_{-b}, \tag{12.50}
$$

where the inequalities may also be written as equalities, since *x* ≤ *y* iff *x* = *x* ∧ *y*. These relations are easy to check for *A* = *C*(*X*), and hence they are true for any *A*. The elements D<sub>*a*</sub> obviously exhaust *L<sub>A</sub>*, and eqs. (12.45) - (12.50) imply:

$$a \le b \implies \mathsf{D}\_a \leqslant \mathsf{D}\_b;\tag{12.51}$$

$$\mathbf{D}\_a = \mathbf{D}\_{a^+} ; \tag{12.52}$$

$$\mathsf{D}\_{na} = \mathsf{D}\_{a} \ (n \in \mathbb{N});\tag{12.53}$$

$$\mathsf{D}_{ab} = (\mathsf{D}_a \wedge \mathsf{D}_b) \vee (\mathsf{D}_{-a} \wedge \mathsf{D}_{-b}),\tag{12.54}$$

$$\mathsf{D}\_{a} \wedge \mathsf{D}\_{b} = \mathsf{D}\_{a \wedge b}. \tag{12.55}$$
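For a finite set *X*, two positive functions on *X* are equivalent under ≈ exactly when they have the same support, so D<sub>*a*</sub> may be modelled as the set of points where *a* is strictly positive. Under that assumption (which is specific to finite *X*), the following Python sketch checks the relations above by brute force on integer-valued functions on three points. The encoding and all names are illustrative, not part of the text.

```python
from itertools import product

POINTS = range(3)                              # X = {0, 1, 2}
TOP, BOTTOM = frozenset(POINTS), frozenset()

def D(a):
    """Model of D_a for finite X: the set where a is strictly positive."""
    return frozenset(i for i in POINTS if a[i] > 0)

def neg(a):
    return tuple(-x for x in a)

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def mul(a, b):
    return tuple(x * y for x, y in zip(a, b))

def meet(a, b):
    return tuple(min(x, y) for x, y in zip(a, b))

def positive_part(a):
    return tuple(max(x, 0) for x in a)

functions = list(product(range(-2, 3), repeat=len(POINTS)))
ONE = (1,) * len(POINTS)

def relations_hold():
    ok = D(ONE) == TOP                                           # (12.45)
    for a in functions:
        ok = ok and (D(a) & D(neg(a))) == BOTTOM                 # (12.46)
        if all(x <= 0 for x in a):
            ok = ok and D(a) == BOTTOM                           # (12.47)
        ok = ok and D(a) == D(positive_part(a))                  # (12.52)
        ok = ok and all(D(tuple(n * x for x in a)) == D(a)
                        for n in (1, 2, 3))                      # (12.53), n >= 1
    for a, b in product(functions, repeat=2):
        ok = ok and D(add(a, b)) <= (D(a) | D(b))                # (12.48)
        ok = ok and (D(a) & D(b)) <= D(mul(a, b))                # (12.49)
        ok = ok and D(mul(a, b)) <= (D(a) | D(neg(b)))           # (12.50)
        if all(x <= y for x, y in zip(a, b)):
            ok = ok and D(a) <= D(b)                             # (12.51)
        # ab is positive exactly where a and b have the same strict sign
        ok = ok and D(mul(a, b)) == (D(a) & D(b)) | (D(neg(a)) & D(neg(b)))
        ok = ok and (D(a) & D(b)) == D(meet(a, b))               # (12.55)
    return ok

print(relations_hold())
```

Here `<=` on frozensets is inclusion, which plays the role of the lattice order on *L<sub>A</sub>*.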

For the Gelfand spectrum we need the frame RIdl(*L<sub>A</sub>*), and hence the relation ≪.

Lemma 12.7. *For all* D<sub>*a*</sub>, D<sub>*b*</sub> ∈ *L<sub>A</sub>, we have (with both q* ∈ Q<sup>+</sup> *and q* ∈ R<sup>+</sup>*):*

$$
\mathsf{D}_b \ll \mathsf{D}_a \text{ iff } \exists_{q>0}\, \mathsf{D}_b \leqslant \mathsf{D}_{a-q}. \tag{12.56}
$$

*Proof.* From right to left, just choose D<sub>*c*</sub> = D<sub>*q*−*a*</sub>. Conversely, if *A* = *C*(*X*), it is easy to see that if there exists D<sub>*c*</sub> ∈ *L<sub>A</sub>* such that D<sub>*c*</sub> ∨ D<sub>*a*</sub> = ⊤ and D<sub>*c*</sub> ∧ D<sub>*b*</sub> = ⊥, then there exists *q* > 0 such that D<sub>*c*−*q*</sub> ∨ D<sub>*a*−*q*</sub> = ⊤. Hence D<sub>*c*</sub> ∨ D<sub>*a*−*q*</sub> = ⊤, so that

$$\mathsf{D}_b = \mathsf{D}_b \land (\mathsf{D}_c \lor \mathsf{D}_{a-q}) = \mathsf{D}_b \land \mathsf{D}_{a-q} \leqslant \mathsf{D}_{a-q}. \qquad \square$$

Note that by construction the map *f* in (12.40) is given by

$$f(\mathsf{D}\_a) = \{ \mathsf{D}\_c \in L\_A \mid \forall\_{\mathsf{D}\_b \in L\_A} \mathsf{D}\_b \ll \mathsf{D}\_c \Rightarrow \mathsf{D}\_b \leqslant \mathsf{D}\_a \},\tag{12.57}$$

and, by Lemma 12.7, satisfies

$$f(\mathsf{D}\_a) \leqslant \bigvee \{ f(\mathsf{D}\_{a-q}) \mid q > 0 \}. \tag{12.58}$$

For later use, also note that (12.57) implies

$$f(\mathsf{D}\_a) = \top \iff \mathsf{D}\_a = \top. \tag{12.59}$$

Theorem 12.8. *The topology* O(Σ(*A*)) *of the Gelfand spectrum* Σ(*A*) *of a commutative unital C\*-algebra A is isomorphic to the frame of all regular ideals of LA:*

$$\mathcal{O}(\Sigma(A)) \cong \text{RIdl}(L_A);\tag{12.60}$$

$$\{\omega \in \Sigma(A) \mid \omega(a) > 0\} \leftrightarrow \mathsf{D}_a,\tag{12.61}$$

*or, equivalently, for the opens* (*r*, *s*) ∈ O(R) *with ensuing opens â*<sup>−1</sup>(*r*, *s*) *in* O(Σ(*A*))*,*

$$\hat{a}^{-1}(r,s) \equiv \{ \omega \in \Sigma(A) \mid \omega(a) \in (r,s) \} \leftrightarrow f(\mathsf{D}_{s-a} \wedge \mathsf{D}_{a-r}) \quad (r < s).\tag{12.62}$$

*Moreover, on this isomorphism,* O(Σ(*A*)) *is a compact regular frame.*

The proof of this theorem is unfortunately beyond our reach; instead, we now give an alternative description of the frame RIdl(*L<sub>A</sub>*), which will be useful for computational purposes in topos theory. This again requires some more background in lattice theory. Let (*L*, ≤) be a meet semilattice (i.e., a poset in which any pair of elements has an infimum; in most of our applications (*L*, ≤) is actually a distributive lattice).

Definition 12.9. *A* covering relation *on L is a relation* ◁ ⊆ *L* × P(*L*) *(equivalently, a function L* → P(P(*L*))*), written x* ◁ *U when* (*x*, *U*) ∈ ◁*, such that:*

*1. If x* ∈ *U then x* ◁ *U.*

*2. If x* ◁ *U and U* ◁ *V (i.e., y* ◁ *V for all y* ∈ *U) then x* ◁ *V.*

*3. If x* ◁ *U then x* ∧ *y* ◁ *U.*

*4. If x* ◁ *U and x* ◁ *V, then x* ◁ *U* ∧ *V (where U* ∧ *V* = {*x* ∧ *y* | *x* ∈ *U*, *y* ∈ *V*}*).*

For example, if (*L*, ≤) = (O(*X*), ⊆) one may take *x* ◁ *U* iff *x* ⊆ ⋃*U*, i.e., iff *U* covers *x*. Also here we have a closure operation 𝒜 : D(*L*) → D(*L*), given by

$$\mathcal{A}U = \{x \in L \mid x \lhd U\}. \tag{12.63}$$

This operation has the following properties:

$$
\downarrow U \subseteq \mathcal{A}U;\tag{12.64}
$$

$$U \subseteq \mathcal{A}V \Rightarrow \mathcal{A}U \subseteq \mathcal{A}V;\tag{12.65}$$

$$
\mathcal{A}U \cap \mathcal{A}V \subseteq \mathcal{A}(\downarrow U \cap \downarrow V). \tag{12.66}
$$

The frame F(*L*, ◁) generated by such a structure is then defined by

$$\mathcal{F}(L,\lhd) = \{ U \in \mathcal{D}(L) \mid \mathcal{A}U = U \} = \{ U \in \mathcal{P}(L) \mid x \lhd U \Rightarrow x \in U \}; \tag{12.67}$$

the second equality follows because firstly the property 𝒜*U* = *U* guarantees that *U* ∈ D(*L*), and secondly one has 𝒜*U* = *U* iff *x* ◁ *U* implies *x* ∈ *U*. Defining

$$U \sim V \text{ iff } U \lhd V \text{ and } V \lhd U,\tag{12.68}$$

an equivalent description of the frame F(*L*, ◁) that is occasionally useful is

$$\mathcal{F}(L,\lhd) \cong \mathcal{P}(L)/\sim. \tag{12.69}$$

Indeed, the map *U* ↦ [*U*] from F(*L*, ◁) (as defined in (12.67)) to P(*L*)/∼ is a frame map with inverse [*U*] ↦ 𝒜*U*. The idea behind the isomorphism (12.69) is that the map 𝒜 picks a unique representative in the equivalence class [*U*], namely 𝒜*U*. As in (12.40) - (12.41), also here we have a canonical map

$$f: L \to \mathcal{F}(L, \lhd);\tag{12.70}$$

$$
x \mapsto \mathcal{A}(\downarrow x), \tag{12.71}
$$

which satisfies *f*(*x*) ≤ ⋁*f*(*U*) if *x* ◁ *U*. In fact, *f* is universal with this property, in that any homomorphism *g* : *L* → G of meet semilattices into a frame G such that *g*(*x*) ≤ ⋁*g*(*U*) whenever *x* ◁ *U* has a factorisation *g* = ϕ ◦ *f* for some unique frame map ϕ : F(*L*, ◁) → G. This may suggest the following result:

Proposition 12.10. *Suppose one has a frame* F *and a meet semilattice L with a map f* : *L* → F *of meet semilattices that generates* F *in the sense that for each U* ∈ F *one has U* = ⋁{*f*(*x*) | *x* ∈ *L*, *f*(*x*) ≤ *U*}*. Define a cover relation* ◁ *on L by*

$$x \lhd U \text{ iff } f(x) \leqslant \bigvee f(U). \tag{12.72}$$

*Then one has a frame isomorphism* F ≅ F(*L*, ◁)*.*

We now turn to maps between frames, from the point of view of coverings.

Definition 12.11. *Let* (*L*, ◁) *and* (*M*, ◀) *be meet semilattices with covering relation as above, and let f*<sup>∗</sup> : *L* → P(*M*) *be such that:*

*1. f*<sup>∗</sup>(*L*) = *M;*

*2. f*<sup>∗</sup>(*x*) ∧ *f*<sup>∗</sup>(*y*) ◀ *f*<sup>∗</sup>(*x* ∧ *y*)*;*

*3. x* ◁ *U* ⇒ *f*<sup>∗</sup>(*x*) ◀ *f*<sup>∗</sup>(*U*) *(where f*<sup>∗</sup>(*U*) = ⋃<sub>*u*∈*U*</sub> *f*<sup>∗</sup>(*u*)*).*

*If L and M have top elements* ⊤<sub>*L*</sub> *and* ⊤<sub>*M*</sub>*, respectively, then the first condition may be replaced by f*<sup>∗</sup>(⊤<sub>*L*</sub>) = ⊤<sub>*M*</sub>*. Define two such maps f*<sub>1</sub><sup>∗</sup>, *f*<sub>2</sub><sup>∗</sup> *to be equivalent if f*<sub>1</sub><sup>∗</sup>(*x*) ∼ *f*<sub>2</sub><sup>∗</sup>(*x*) *(i.e., f*<sub>1</sub><sup>∗</sup>(*x*) ◀ *f*<sub>2</sub><sup>∗</sup>(*x*) *and f*<sub>2</sub><sup>∗</sup>(*x*) ◀ *f*<sub>1</sub><sup>∗</sup>(*x*)*) for all x* ∈ *L. A continuous map f* : (*M*, ◀) → (*L*, ◁) *is an equivalence class of such maps f*<sup>∗</sup> : *L* → P(*M*)*.*

Our main interest in continuous maps lies in the following result:

Proposition 12.12. *Each continuous map f* : (*M*, ◀) → (*L*, ◁) *is equivalent to a frame map* F(*f*) : F(*L*, ◁) → F(*M*, ◀)*, given by*

$$\mathcal{F}(f) : U \mapsto \mathcal{A}f^*(U). \tag{12.73}$$

We may now equip *L<sub>A</sub>* with the covering relation defined by (12.72), given (12.60) and the ensuing map (12.57). Consequently, by Proposition 12.10 one has

$$\mathcal{O}(\Sigma) \cong \mathcal{F}(L_A, \lhd),\tag{12.74}$$

which yields the following expression for the constructive Gelfand spectrum:

$$\mathcal{O}(\Sigma) \cong \{ U \in \mathbf{D}(L\_A) \mid \mathbf{x} \lhd U \Rightarrow \mathbf{x} \in U \}. \tag{12.75}$$

This lattice becomes computable through a lemma that is crucial for what follows:

Lemma 12.13. *In any topos, the covering relation* ◁ *on L<sub>A</sub> defined by* (12.72) *with* (12.60) *and* (12.57)*, is given by* D<sub>*a*</sub> ◁ *U iff for all q* > 0 *there exists a (Kuratowski) finite U*<sub>0</sub> ⊆ *U such that* D<sub>*a*−*q*</sub> ≤ ⋁*U*<sub>0</sub>*. If U is directed, this means that there exists* D<sub>*b*</sub> ∈ *U such that* D<sub>*a*−*q*</sub> ≤ D<sub>*b*</sub>*.*

*Proof.* The easy part is the "⇐" direction: from (12.58) and the assumption we have *f*(D<sub>*a*</sub>) ≤ ⋁*f*(*U*) and hence D<sub>*a*</sub> ◁ *U* by definition of the covering relation.

In the opposite direction, assume D<sub>*a*</sub> ◁ *U* and take some *q* > 0. From (the proof of) Lemma 12.7, D<sub>*a*</sub> ∨ D<sub>*q*−*a*</sub> = ⊤, hence ⋁*f*(*U*) ∨ *f*(D<sub>*q*−*a*</sub>) = ⊤. Since O(Σ) is compact, there is a finite *U*<sub>0</sub> ⊂ *U* for which ⋁*f*(*U*<sub>0</sub>) ∨ *f*(D<sub>*q*−*a*</sub>) = ⊤, so that by (12.59) we have D<sub>*b*</sub> ∨ D<sub>*q*−*a*</sub> = ⊤, with D<sub>*b*</sub> = ⋁*U*<sub>0</sub>. By (12.46) we have

$$
\mathsf{D}\_{a-q} \wedge \mathsf{D}\_{q-a} = \bot,\tag{12.76}
$$

and hence

$$\mathsf{D}\_{a-q} = \mathsf{D}\_{a-q} \land \top = \mathsf{D}\_{a-q} \land (\mathsf{D}\_b \lor \mathsf{D}\_{q-a}) = \mathsf{D}\_{a-q} \land \mathsf{D}\_b \leqslant \mathsf{D}\_b = \bigvee U\_0. \qquad \square$$

If *A* is finite-dimensional, *L<sub>A</sub>* is a finite lattice. In that case, since D<sub>*a*−*q*</sub> = D<sub>*a*</sub> for small enough *q*, one simply has *x* ◁ *U* iff *x* ≤ ⋁*U*, and the condition *x* ◁ *U* ⇒ *x* ∈ *U* in (12.75) holds iff *U* is a (principal) down set, i.e. *U* = ↓*x* for some *x* ∈ *L<sub>A</sub>* (not the same *x* as the placeholder *x* in (12.75)). Hence for finite-dimensional *A* we obtain

$$\mathcal{O}\left(\Sigma(A)\right) \cong \text{Idl}(L\_A) = \{\downarrow x \mid x \in L\_A\}.\tag{12.77}$$
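The finite-dimensional statement can be checked by brute force. The sketch below takes *L* to be the power set of a three-element set (a stand-in for *L<sub>A</sub>* with *A* = C<sup>3</sup>, an assumption made for illustration), uses the finite covering relation *x* ◁ *U* iff *x* ≤ ⋁*U*, and enumerates all subsets *U* of *L* that satisfy the closure condition in (12.75). All names are hypothetical.

```python
from itertools import combinations

def powerset(items):
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

# L: subsets of {0, 1, 2}, ordered by inclusion, with union as join.
L = powerset(range(3))

def covers(x, U):
    """Finite covering relation: x is below the join (union) of U."""
    return x <= frozenset().union(*U)

def is_closed(U):
    """The condition in (12.75): every x covered by U lies in U."""
    return all(x in U for x in L if covers(x, U))

def down(x):
    """Principal down set of x in L."""
    return frozenset(y for y in L if y <= x)

closed_sets = {U for U in powerset(L) if is_closed(U)}
principal = {down(x) for x in L}

print(len(closed_sets), closed_sets == principal)
```

Note that the empty subset of *L* is not closed, since the bottom element is covered by every *U*; the least closed set is ↓⊥ = {⊥}.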

#### 12.3 Internal Gelfand spectrum and intuitionistic quantum logic

We are now going to combine the (*a priori* independent) material in the previous two sections. The point of the above description of the topology O(Σ(*A*)) of the Gelfand spectrum Σ(*A*) of a unital commutative C\*-algebra *A* is that it may be "internalized" to any topos (with natural numbers object, i.e., in which C\*-algebras may be defined internally in the first place). The key to the ensuing generalization of Gelfand duality is that in topos theory (and more generally in constructive mathematics) the *space* Σ(*A*) in set theory needs to be replaced by the corresponding *frame* O(Σ(*A*)), or preferably by its associated *locale*, which confusingly is denoted by Σ(*A*), even though it is the same thing as O(Σ(*A*)) and neither may be spatial (in being the topology of some space); see §C.11 and §E.4 for this bizarre notation. Similarly, we write *f* : *X* → *Y* for a map between locales, which is essentially the same as the frame map *f*<sup>−1</sup> : O(*Y*) → O(*X*), but seen as a map in the opposite direction (where once again nothing is assumed about possible spatiality of the frames in question).

Using this notation, the *constructive Gelfand isomorphism* (which is valid in any topos T in which commutative C\*-algebras make sense) states:

Theorem 12.14. *For each (internal) commutative unital C\*-algebra A in* T *there exists a compact regular locale* Σ(*A*) *such that one has a Gelfand isomorphism*

$$A \cong \mathcal{C}(\Sigma(A), \mathbb{C}). \tag{12.78}$$

*Furthermore, the locale* Σ(*A*) *is uniquely determined by A up to isomorphism and its corresponding frame is given by Theorem 12.8 (or, more explicitly, by* (12.75) *in conjunction with Lemma 12.13, all of which makes sense internally).*

Here ≅ denotes (internal) isomorphism of (commutative) C\*-algebras, and the notation *C*(Σ(*A*),C) stands for the object of all frame maps from O(C) to O(Σ(*A*)) (which object turns out to be a commutative C\*-algebra in any case). As usual, we denote the Gelfand transform *A* → *C*(Σ(*A*),C) by *a* ↦ *â*, where, as explained above, the locale map *â* : Σ(*A*) → C is really the reverse reading of the frame map

$$
\hat{a}^{-1} : \mathcal{O}(\mathbb{C}) \to \mathcal{O}(\Sigma(A)).\tag{12.79}
$$

Note that in Sets, the latter is given by its literal meaning, given *â* : ω ↦ ω(*a*).

We will shortly apply this formalism to our internal C\*-algebra *A* in the topos T(*A*), but since these computations are a bit involved, as a warm-up we first apply our machinery to a very simple case, namely *A* = C*<sup>n</sup>* in Sets. Recall (12.44) etc.

For *A* = C<sup>*n*</sup> we have *A*<sup>+</sup> = (R<sup>*n*</sup>)<sup>+</sup>, in which (*r*<sub>1</sub>,...,*r<sub>n</sub>*) ≈ (*s*<sub>1</sub>,...,*s<sub>n</sub>*) just in case *r<sub>i</sub>* = 0 iff *s<sub>i</sub>* = 0 for all *i* = 1,...,*n*. Hence each equivalence class under ≈ has a unique representative of the form [*k*<sub>1</sub>,...,*k<sub>n</sub>*] with *k<sub>i</sub>* = 0 or *k<sub>i</sub>* = 1; the pre-images of such an element of *L<sub>A</sub>* in *A*<sup>+</sup> under the natural projection *A*<sup>+</sup> → *A*<sup>+</sup>/≈ are the diagonal matrices whose *i*'th entry is zero if *k<sub>i</sub>* = 0 and any nonzero positive number if *k<sub>i</sub>* = 1. The partial order in *L<sub>A</sub>* is pointwise, i.e. [*k*<sub>1</sub>,...,*k<sub>n</sub>*] ≤ [*l*<sub>1</sub>,...,*l<sub>n</sub>*] iff *k<sub>i</sub>* ≤ *l<sub>i</sub>* for all *i*. Hence *L*<sub>C<sup>*n*</sup></sub> is isomorphic as a distributive lattice to the lattice P(*D<sub>n</sub>*(C)) ≡ P(C<sup>*n*</sup>) of projections in *D<sub>n</sub>*(C), i.e. the lattice of diagonal projections in *M<sub>n</sub>*(C).

Under this isomorphism, [*k*<sub>1</sub>,...,*k<sub>n</sub>*] corresponds to the matrix diag(*k*<sub>1</sub>,...,*k<sub>n</sub>*). If we equip P(C<sup>*n*</sup>) with the usual partial ordering of projections on the Hilbert space C<sup>*n*</sup>, viz. *e* ≤ *f* whenever *e*C<sup>*n*</sup> ⊆ *f*C<sup>*n*</sup> (which coincides with their ordering as elements of the positive cone of the C\*-algebra *M<sub>n</sub>*(C)), then this is even a lattice isomorphism. Hence by (12.77), the frame O(Σ(C<sup>*n*</sup>)) consists of all sets of the form ↓*e*, *e* ∈ P(C<sup>*n*</sup>), partially ordered by inclusion. This means that

$$\mathcal{O}(\Sigma(\mathbb{C}^n)) \cong \mathcal{P}(\mathbb{C}^n),\tag{12.80}$$

under the further identification of ↓*p* ⊂ P(C<sup>*n*</sup>) with *p* ∈ P(C<sup>*n*</sup>). This starts out just as an isomorphism of posets, and turns out to be one of frames (which in the case at hand happen to be Boolean). To draw the connection with the usual spectrum Ĉ<sup>*n*</sup> = {1,2,...,*n*} of C<sup>*n*</sup>, we note that the right-hand side of (12.80) is isomorphic to the discrete topology O(Ĉ<sup>*n*</sup>) of Ĉ<sup>*n*</sup> (i.e. its power set) under the frame isomorphism

$$\mathcal{P}(\mathbb{C}^n) \xrightarrow{\cong} \mathcal{O}(\hat{\mathbb{C}}^n);$$

$$\text{diag}(k\_1, \dots, k\_n) \mapsto \{i \in \{1, 2, \dots, n\} \mid k\_i = 1\}. \tag{12.81}$$
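The identification of *L*<sub>C<sup>*n*</sup></sub> with subsets of {1,...,*n*} can be traced in a few lines of Python. The sketch below samples positive vectors with small integer entries, implements the preorder "*a* ≤ *kb* for some natural *k*" and its equivalence relation, and compares them with the 0/1 representatives and the map (12.81). The bound on the entries (which makes a single multiplier *k* sufficient) and all names are assumptions for illustration.

```python
from itertools import product

N = 2          # we look at C^2
BOUND = 3      # sample entries 0..BOUND (illustrative)
positive = list(product(range(BOUND + 1), repeat=N))

def preceq(a, b):
    """a <= k*b for some natural k; with entries <= BOUND, k = BOUND suffices."""
    return all(x <= BOUND * y for x, y in zip(a, b))

def equivalent(a, b):
    return preceq(a, b) and preceq(b, a)

def representative(a):
    """The 0/1 representative [k_1, ..., k_n] of the class of a."""
    return tuple(1 if x > 0 else 0 for x in a)

def to_subset(k):
    """The map (12.81): diag(k_1, ..., k_n) -> {i | k_i = 1}."""
    return frozenset(i + 1 for i, ki in enumerate(k) if ki == 1)

classes = {representative(a) for a in positive}

same_classes = all(
    equivalent(a, b) == (representative(a) == representative(b))
    for a, b in product(positive, repeat=2))
order_matches = all(
    preceq(a, b) == (to_subset(representative(a)) <= to_subset(representative(b)))
    for a, b in product(positive, repeat=2))

print(sorted(classes), same_classes, order_matches)
```

The number of classes is 2<sup>*n*</sup>, matching the power set of the usual spectrum {1,...,*n*}.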

We now describe the Gelfand transform (12.78) - (12.79) for self-adjoint *a*, so that one has a (locale) map *A*<sub>sa</sub> → *C*(Σ(*A*),R). Let *a* = (*a*<sub>1</sub>,...,*a<sub>n</sub>*) ∈ C<sup>*n*</sup><sub>sa</sub> = R<sup>*n*</sup>. With Σ(C<sup>*n*</sup>) realized as Ĉ<sup>*n*</sup>, this just reads *â*(*i*) = *a<sub>i</sub>*, for *â* : Ĉ<sup>*n*</sup> → C. The induced frame map *â*<sup>−1</sup> : O(C) → O(Ĉ<sup>*n*</sup>) is given by *U* ↦ {*i* ∈ {1,2,...,*n*} | *a<sub>i</sub>* ∈ *U*}, and by (12.81), this is equivalent to

$$\begin{aligned} \hat{a}^{-1}: \mathcal{O}(\mathbb{R}) &\to \mathcal{P}(\mathbb{C}^n); \\ U &\mapsto 1_U(a), \end{aligned} \tag{12.82}$$

where *U* ∈ O(R), and the right-hand side denotes the spectral projection 1<sub>*U*</sub>(*a*) defined by the self-adjoint operator *a* on the Hilbert space C<sup>*n*</sup>.
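For a diagonal self-adjoint *a* the spectral projection 1<sub>*U*</sub>(*a*) is simply the diagonal 0/1 matrix that marks the entries of *a* lying in *U*. The following Python sketch, in which open sets are modelled as finite lists of open intervals and projections as 0/1 tuples (both hypothetical encodings chosen for this example), checks on one sample that *U* ↦ 1<sub>*U*</sub>(*a*) preserves finite meets and joins and agrees with the expression D<sub>*s*−*a*</sub> ∧ D<sub>*a*−*r*</sub> of (12.62) on an interval (*r*, *s*).

```python
def contains(U, t):
    """Membership in a finite union of open intervals."""
    return any(r < t < s for r, s in U)

def spectral_projection(U, a):
    """1_U(a) for diagonal a, as the 0/1 diagonal of the projection."""
    return tuple(1 if contains(U, t) else 0 for t in a)

def union(U, V):
    return U + V

def intersection(U, V):
    return [(max(r1, r2), min(s1, s2))
            for r1, s1 in U for r2, s2 in V
            if max(r1, r2) < min(s1, s2)]

def meet(e, f):
    return tuple(min(x, y) for x, y in zip(e, f))

def join(e, f):
    return tuple(max(x, y) for x, y in zip(e, f))

def D(a):
    """Support of the positive part, the model of D_a for A = C^n."""
    return tuple(1 if t > 0 else 0 for t in a)

a = (-1.0, 0.5, 2.0, 2.0)       # sample self-adjoint element of C^4
U = [(-2.0, 1.0)]
V = [(0.0, 3.0)]

p_U = spectral_projection(U, a)
p_V = spectral_projection(V, a)
p_meet = spectral_projection(intersection(U, V), a)
p_join = spectral_projection(union(U, V), a)

r, s = V[0]
via_D = meet(D(tuple(s - t for t in a)), D(tuple(t - r for t in a)))

print(p_U, p_V, p_meet, p_join, via_D)
```

The last value is computed from D<sub>*s*−*a*</sub> and D<sub>*a*−*r*</sub> alone, without reference to points of the spectrum, which is the form that survives in the constructive setting.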

After this warm-up, we now compute the Gelfand spectrum O(Σ(*A*)) in our topos T(*A*), for the special case *A* = *M<sub>n</sub>*(C) (which is still an exercise for the general case). For simplicity we write *L* for the lattice *L<sub>A</sub>* in T(*A*); similarly, Σ stands for Σ(*A*).

First, for arbitrary *A*, the lattice functor *L* can be computed "locally", in the sense that *L*<sub>0</sub>(*C*) = *L<sub>C</sub>*, see Proposition 12.17 in §12.4 below, so that by (12.44) one has

$$
\underline{L}_0(C) = C^+ / \approx. \tag{12.83}
$$

Let P(*C*) be the (Boolean) lattice of projections in *C*, and consider the functor

$$
\underline{\mathcal{P}}_0(C) = \mathcal{P}(C);\tag{12.84}
$$

$$
\underline{\mathcal{P}}_1(C \subseteq D) = (\mathcal{P}(C) \hookrightarrow \mathcal{P}(D)).\tag{12.85}
$$

As in the case *A* = C<sup>*n*</sup> just discussed, it follows that we may identify *L*<sub>0</sub>(*C*) with P(*C*) and hence we may and will identify the functor *L* with the functor P.

Second, whereas in Sets eq. (12.77) makes O(Σ) a subset of *L*, in the topos T(*A*) the frame O(Σ) is a subobject O(Σ) ↣ Ω<sup>*L*</sup>. It then follows from (12.11) that O(Σ)(*C*) is a subset of Sub(P<sub>↑*C*</sub>), the set of subfunctors of the functor P : C (*A*) → Sets restricted to ↑*C* ⊂ C (*A*). To see which subset, define

$$\text{Sub}_d(\underline{\mathcal{P}}_{\uparrow C}) = \{ \tilde{S} \in \text{Sub}(\underline{\mathcal{P}}_{\uparrow C}) \mid \forall D \supseteq C \; \exists x_D \in \mathcal{P}(D) : \tilde{S}(D) = \downarrow x_D \}. \tag{12.86}$$

Thus Sub<sub>*d*</sub>(P<sub>↑*C*</sub>) consists of subfunctors *S̃* of P<sub>↑*C*</sub> that are locally down-sets. It then follows from (12.77) and the local interpretation of the relation ◁ in T(*A*) (see Lemma 12.18 in §12.4 below) that the subobject O(Σ) ↣ Ω<sup>*L*</sup> in T(*A*) is the functor

$$\mathcal{O}(\underline{\Sigma})_0(C) = \text{Sub}_d(\underline{\mathcal{P}}_{\uparrow C});\tag{12.87}$$

$$\mathcal{O}(\underline{\Sigma})\_1(C \subseteq D) = (\mathcal{O}(\underline{\Sigma})(C) \hookrightarrow \mathcal{O}(\underline{\Sigma})(D)),\tag{12.88}$$

where O(Σ)<sub>1</sub> is inherited from Ω<sup>*L*</sup> (of which O(Σ) is a subobject), and hence is just given by restricting an element of O(Σ)(*C*) to ↑*D*. Writing

$$\text{Sub}_d(\underline{\mathcal{P}}) = \{ \tilde{S} \in \text{Sub}(\underline{\mathcal{P}}) \mid \forall D \in \mathcal{C}(A) \; \exists x_D \in \mathcal{P}(D) : \tilde{S}(D) = \downarrow x_D \}, \tag{12.89}$$

it is convenient to embed Sub<sub>*d*</sub>(P<sub>↑*C*</sub>) ⊆ Sub<sub>*d*</sub>(P) by requiring elements of the left-hand side to vanish whenever *D* does not contain *C*. We also note that if *S̃* is to be a subfunctor of P<sub>↑*C*</sub>, one must have *S̃*(*D*) ⊆ *S̃*(*E*) whenever *D* ⊆ *E*, and that ↓*x<sub>D</sub>* ⊆ ↓*x<sub>E</sub>* iff *x<sub>D</sub>* ≤ *x<sub>E</sub>* in P(*E*). Thus one may simply describe elements of O(Σ)(*C*) via maps *S* : C (*A*) → P(*A*) such that:

$$S(D) \in \mathcal{P}(D);\tag{12.90}$$

$$S(D) = 0 \text{ if } D \notin \uparrow C \text{ (i.e. } C \nsubseteq D); \tag{12.91}$$

$$S(D) \le S(E) \text{ if } C \subseteq D \subseteq E. \tag{12.92}$$

The corresponding element *S̃* of O(Σ)(*C*) is then given by

$$
\tilde{S}(D) = \downarrow S(D), \tag{12.93}
$$

seen as a subset of P(*D*). Hence it is convenient to introduce the notation

$$\mathcal{O}(\Sigma)_{\uparrow C} = \{ S : \uparrow C \to \mathcal{P}(A) \mid S(D) \in \mathcal{P}(D),\ S(D) \le S(E) \text{ if } D \subseteq E \}, \tag{12.94}$$

of which we single out the case *C* = C · 1<sub>*A*</sub>, which will be of great importance:

$$\mathcal{O}(\Sigma) = \{ S : \mathcal{C}(A) \to \mathcal{P}(A) \mid S(C) \in \mathcal{P}(C),\ S(C) \le S(D) \text{ if } C \subseteq D \}. \tag{12.95}$$

Both are posets and even frames in the pointwise partial order with respect to the usual ordering of projections (which algebraically means *e* ≤ *f* iff *e f* = *e*), i.e.,

$$S \le T \iff S(C) \le T(C) \text{ for all } C \in \mathcal{C}(A). \tag{12.96}$$
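To get a feel for (12.95) and (12.96), the following minimal Python sketch enumerates O(Σ) in a toy model that is an assumption of this illustration rather than the case treated in the text: the commutative algebra *A* = C<sup>*n*</sup> of diagonal matrices. There, unital subalgebras *C* correspond to partitions of {0, …, *n*−1}, one has *C* ⊆ *D* iff the partition of *D* refines that of *C*, and P(*C*) consists of unions of blocks. All function names are hypothetical.

```python
from itertools import product

def partitions(items):
    """Yield every set partition of the list `items` as a list of frozenset blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):            # add `first` to an existing block
            yield part[:i] + [part[i] | {first}] + part[i + 1:]
        yield part + [frozenset({first})]     # or open a new block

def contexts(n):
    """Toy C(A) for A = C^n: one context per partition of {0, ..., n-1}."""
    return [frozenset(p) for p in partitions(list(range(n)))]

def contains(c, d):
    """C ⊆ D as subalgebras iff the partition d refines the partition c."""
    return all(any(b <= a for a in c) for b in d)

def projections(c):
    """P(C): projections of the context c, encoded as unions of its blocks."""
    blocks = list(c)
    return [frozenset().union(*[b for b, bit in zip(blocks, bits) if bit])
            for bits in product([0, 1], repeat=len(blocks))]

def frame(n):
    """All maps S with S(C) in P(C) and S(C) <= S(D) whenever C ⊆ D, as in (12.95)."""
    ctxs = contexts(n)
    elems = []
    for values in product(*[projections(c) for c in ctxs]):
        s = dict(zip(ctxs, values))
        if all(s[c] <= s[d] for c in ctxs for d in ctxs if contains(c, d)):
            elems.append(s)
    return ctxs, elems

def leq(s, t):
    """Pointwise partial order (12.96)."""
    return all(s[c] <= t[c] for c in s)
```

For *n* = 2 there are only two contexts (C · 1 and *A* itself), and the enumeration returns five maps *S*: four with *S*(C · 1) = 0 and the top element.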

In terms of (12.94) - (12.95), we then have isomorphisms

$$
\mathcal{O}(\underline{\Sigma})\_0(\mathbb{C}\cdot 1) \cong \mathcal{O}(\Sigma);\tag{12.97}
$$

$$
\mathcal{O}(\underline{\Sigma})_0(C) \cong \mathcal{O}(\Sigma)_{\uparrow C}.\tag{12.98}
$$

More importantly, the frame O(Σ) in Sets is the key to the *external description* of the *internal frame* O(Σ) in T(*A*); see the end of §E.4. Since C (*A*) carries the Alexandrov topology, by (E.84) this description is given by the frame map

$$
\pi_{\Sigma}^{-1} : \mathcal{O}(\mathcal{C}(A)) \to \mathcal{O}(\Sigma), \tag{12.99}
$$

given on the basic opens ↑*D* ∈ O(C (*A*)) by

$$\begin{aligned} \pi_{\Sigma}^{-1}(\uparrow D) = \chi_{\uparrow D} : E &\mapsto 1 \ (E \supseteq D);\\ E &\mapsto 0 \ (E \not\supseteq D). \end{aligned} \tag{12.100}$$

As explained before, even in Sets, in principle O(Σ) is just a notation for a frame, without suggesting that there exists an underlying space Σ whose topology it is. In this case, however, there is such a space (as we shall show in the next section), and also (12.99) is in fact the inverse image map to a genuine map πΣ : Σ → C (*A*) between spaces (as opposed to the formal notation used for a locale map).

We now state the Heyting algebra structure of O(Σ). First, top and bottom are

$$
\top(C) = 1 \text{ for all } C; \tag{12.101}
$$

$$
\bot(C) = 0 \text{ for all } C. \tag{12.102}
$$

The logical operations on O(Σ) may be computed from the partial order as

$$(S \wedge T)(C) = S(C) \wedge T(C);\tag{12.103}$$

$$(S \lor T)(C) = S(C) \lor T(C);\tag{12.104}$$

$$(S \to T)(C) = \bigwedge_{D \supseteq C}^{\mathcal{P}(C)} S(D)^\perp \vee T(D);\tag{12.105}$$

$$(\neg S)(C) = \bigwedge_{D \supseteq C}^{\mathcal{P}(C)} S(D)^\perp;\tag{12.106}$$

$$(\neg \neg S)(C) = \bigwedge_{D \supseteq C}^{\mathcal{P}(C)} \bigvee_{E \supseteq D}^{\mathcal{P}(C)} S(E),\tag{12.107}$$

where the right-hand side of (12.105) (and similarly (12.106) - (12.107)) is short for

$$\bigwedge_{D \supseteq C}^{\mathcal{P}(C)} S(D)^\perp \vee T(D) \equiv \bigvee \{ e \in \mathcal{P}(C) \mid e \leq S(D)^\perp \vee T(D) \ \forall D \supseteq C \}. \tag{12.108}$$

Recall that a Heyting algebra is Boolean iff ¬¬*S* = *S* for each *S*. One sees from (12.107) that (at least if *n* > 1) the property ¬¬*S* = *S* only holds iff *S* is either ⊤ or ⊥, so that the Heyting algebra O(Σ) ≡ O(Σ(*A*)) is properly intuitionistic.
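The failure of double negation can be checked by hand in a very small case. The sketch below is only an illustration under an assumption not made in the text: it uses the commutative toy algebra *A* = C<sup>2</sup>, whose two contexts C · 1 and *A* are encoded as partitions of {0, 1}, with projections encoded as unions of blocks. It implements (12.105), (12.106) and (12.108) and exhibits one *S* with ¬¬*S* ≠ *S*; the helper names are hypothetical.

```python
def ctx(*blocks):
    """A context of the toy model A = C^n, encoded as a partition of {0, ..., n-1}."""
    return frozenset(frozenset(b) for b in blocks)

def contains(c, d):
    """C ⊆ D as subalgebras iff the partition d refines the partition c."""
    return all(any(b <= a for a in c) for b in d)

def largest_below(c, x):
    """Largest projection of P(c) below x, i.e. the join on the right of (12.108)."""
    return frozenset().union(*[b for b in c if b <= x])

def implies(s, t, n):
    """(S -> T)(C) as in (12.105); S(D)^⊥ is the set complement of S(D)."""
    full = frozenset(range(n))
    out = {}
    for c in s:
        bound = full
        for d in s:
            if contains(c, d):                 # only contexts D ⊇ C contribute
                bound &= (full - s[d]) | t[d]
        out[c] = largest_below(c, bound)
    return out

def neg(s, n):
    """(¬S)(C) as in (12.106), computed as S -> ⊥."""
    return implies(s, {c: frozenset() for c in s}, n)

n = 2
bottom_ctx, top_ctx = ctx({0, 1}), ctx({0}, {1})     # C·1 and A = C^2
S = {bottom_ctx: frozenset(), top_ctx: frozenset({0, 1})}
not_S = neg(S, n)            # the bottom element ⊥
not_not_S = neg(not_S, n)    # the top element ⊤, which differs from S
```

Here *S* vanishes at C · 1 but equals 1 at the finer context, so ¬*S* = ⊥ and ¬¬*S* = ⊤ ≠ *S*: already this five-element Heyting algebra is not Boolean.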

Since from both a physical and a logical point of view the Heyting algebra O(Σ(*A*)) has vast advantages over the projection lattice P(*A*) of Birkhoff and von Neumann, we propose it as a candidate for a new quantum logic. Let us explain why.

Physically, in von Neumann's approach each projection *e* ∈ P(*A*) defines an elementary proposition, whereas in Bohr's (where the classical context *C* is crucial) an elementary proposition is a *pair* (*C*, *e*), where *e* ∈ P(*C*) is a proposition à la von Neumann (who lost sight of the context *C*). If for each such pair (*C*, *e*) we define

$$S_{(C,e)}: \mathcal{C}(A) \to \mathcal{P}(A);\tag{12.109}$$

$$D \mapsto e \ (C \subseteq D);\tag{12.110}$$

$$D \mapsto \bot \quad \text{otherwise},\tag{12.111}$$

we see that each pair (*C*, *e*) injectively defines an element of O(Σ). Furthermore, each element *S* of O(Σ) is a disjunction over such elementary propositions, since

$$S = \bigvee_{C \in \mathcal{C}(A)} S_{(C, S(C))}.\tag{12.112}$$
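The decomposition into elementary propositions can be verified mechanically in a small example. The following sketch assumes, purely for illustration, the commutative toy algebra *A* = C<sup>3</sup>, whose five contexts are encoded as partitions of {0, 1, 2} and whose projections are unions of blocks; the names `ctx`, `elementary`, `join` and `decompose` are hypothetical. Each summand uses the local value of *S* at the context *C* as its projection *e*.

```python
def ctx(*blocks):
    """A context of the toy model A = C^n, encoded as a partition of {0, ..., n-1}."""
    return frozenset(frozenset(b) for b in blocks)

def contains(c, d):
    """C ⊆ D as subalgebras iff the partition d refines the partition c."""
    return all(any(b <= a for a in c) for b in d)

def elementary(c, e, ctxs):
    """S_(C,e) of (12.109)-(12.111): D -> e if C ⊆ D, and D -> 0 otherwise."""
    return {d: (e if contains(c, d) else frozenset()) for d in ctxs}

def join(family, ctxs):
    """Pointwise join of a family of maps, cf. (12.104)."""
    return {d: frozenset().union(*[s[d] for s in family]) for d in ctxs}

def decompose(s):
    """Join over all contexts C of the elementary propositions S_(C, S(C))."""
    ctxs = list(s)
    return join([elementary(c, s[c], ctxs) for c in ctxs], ctxs)

# The five contexts of the toy model A = C^3.
bottom = ctx({0, 1, 2})
m0, m1, m2 = ctx({0}, {1, 2}), ctx({1}, {0, 2}), ctx({2}, {0, 1})
top = ctx({0}, {1}, {2})

# A monotone assignment of local projections, i.e. an element of O(Σ).
S = {bottom: frozenset(), m0: frozenset({0}), m1: frozenset(),
     m2: frozenset(), top: frozenset({0, 1})}

rebuilt = decompose(S)   # agrees with S context by context
```

Monotonicity of *S* is what makes this work: at each context *D* the join collects the values *S*(*C*) for *C* ⊆ *D*, all of which lie below *S*(*D*).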

In contrast to traditional quantum logic, both logical connectives ∧ and ∨ on O(Σ) are physically meaningful, as they only involve *local* conjunctions *S*(*C*)∧*T*(*C*) and disjunctions *S*(*C*)∨*T*(*C*), for which *S*(*C*) ∈ P(*C*) and *T*(*C*) ∈ P(*C*) commute.

Logically, the absence of an implication arrow in quantum logic has always been worrying; this has now been put straight in O(Σ), where → belongs to the defining structure and behaves well logically. Truth attribution in quantum logic is equally suspicious: for any state ω on *A* one declares a proposition *e* ∈ P(*A*) *true* iff ω(*e*) = 1, and *false* iff ω(*e*) = 0, with no verdict otherwise (except probabilistically).

We, however, define a natural *Kripke semantics* (cf. §D.3) on *P* = C (*A*) by

$$V_{\omega}: \mathcal{O}(\Sigma) \to \mathsf{Upper}(\mathcal{C}(A)) = \mathcal{O}(\mathcal{C}(A));\tag{12.113}$$

$$V_{\omega}(S) = \{ C \in \mathcal{C}(A) \mid \omega(S(C)) = 1 \},\tag{12.114}$$

where C (*A*) carries the Alexandrov topology as usual. Note that *V*ω(*S*) indeed defines an upper set in C (*A*), for if *C* ⊆ *D* then *S*(*C*) ≤ *S*(*D*), so that ω(*S*(*C*)) ≤ ω(*S*(*D*)) by positivity of states, and hence ω(*S*(*D*)) = 1 whenever ω(*S*(*C*)) = 1 (given that ω(*S*(*D*)) ≤ 1, which is true since 0 ≤ ω(*e*) ≤ 1 for any projection *e*).

As explained in §D.3, a proposition *S* ∈ O(Σ) is *true* in a state ω if *V*<sub>ω</sub>(*S*) = C (*A*), i.e. the top element of the frame O(C (*A*)); we also declare it *false* if *V*<sub>ω</sub>(*S*) = ∅, i.e. the bottom element of O(C (*A*)). Then ¬*S* is true iff *S* is false, and *S* ∨ *T* is true iff either *S* or *T* is true (since *V*<sub>ω</sub>(*S*) = C (*A*) iff *S*(C · 1) = 1, which forces *S*(*C*) = 1 for all *C*). Consequently, (12.114) simply lists the contexts *C* in which *S*(*C*) is true.
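As a small illustration of (12.114), the sketch below computes *V*<sub>ω</sub>(*S*) in a toy model that is assumed here only for concreteness: the commutative algebra *A* = C<sup>2</sup>, with its two contexts encoded as partitions of {0, 1}, projections as unions of blocks, and a state ω given by a probability vector `w`, so that ω(*e*) is the sum of the weights in *e*. Exact rationals avoid rounding issues in the test ω(*S*(*C*)) = 1. All names are hypothetical.

```python
from fractions import Fraction

def ctx(*blocks):
    """A context of the toy model A = C^n, encoded as a partition of {0, ..., n-1}."""
    return frozenset(frozenset(b) for b in blocks)

def contains(c, d):
    """C ⊆ D as subalgebras iff the partition d refines the partition c."""
    return all(any(b <= a for a in c) for b in d)

def valuation(s, w):
    """V_ω(S) of (12.114), for the state ω(e) = sum of w[i] over i in e."""
    return {c for c in s if sum(w[i] for i in s[c]) == 1}

def is_upper(v, ctxs):
    """Check that v is an upper set of the poset of contexts."""
    return all(d in v for c in v for d in ctxs if contains(c, d))

bottom, top = ctx({0, 1}), ctx({0}, {1})          # C·1 and A = C^2
S = {bottom: frozenset(), top: frozenset({0})}    # an element of O(Σ)
pure = [Fraction(1), Fraction(0)]                 # state concentrated at index 0
mixed = [Fraction(1, 2), Fraction(1, 2)]          # uniform state

V_pure = valuation(S, pure)     # only the finer context A = C^2
V_mixed = valuation(S, mixed)   # empty, so S is false in this state
```

In the pure state the proposition is neither true nor false in the sense above: *V*<sub>ω</sub>(*S*) is a nonempty proper upper set, listing exactly the contexts in which the local projection *S*(*C*) has probability one.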

#### 12.4 Internal Gelfand spectrum for arbitrary C\*-algebras

In this section we compute the internal Gelfand spectrum Σ(*A*) ≡ Σ in T(*A*) for an arbitrary unital C\*-algebra *A*. Recall Definition D.6 (in §D.1) of a free lattice L<sub>*S*</sub> on a set *S*, and its refinement in quotienting by a congruence on L<sub>*S*</sub> explained after that definition. According to Definition E.21, lattices can be defined in any topos. The following "locality lemma" shows that the construction of a free lattice on some object makes sense in functor toposes, and so does its refinement just mentioned, at least as long as the congruence in question is defined through equalities.

Lemma 12.15. *Let* T = [C,Sets] *be any functor topos (where* C *is some category).*

*1. There exists a free distributive lattice* L<sub>*S*</sub> ∈ T *on any object S* ∈ T*, which can be computed locally: the object part of* L<sub>*S*</sub> *is given by*

$$(\underline{\mathcal{L}}_{\underline{S}})_0(C) = \mathcal{L}_{\underline{S}_0(C)},\tag{12.115}$$

*where* L<sub>*S*<sub>0</sub>(*C*)</sub> *is defined in* Sets*, and the arrow part is defined as follows. If f* : *C* → *D, then* (L<sub>*S*</sub>)<sub>1</sub>(*f*) *is the unique arrow making the following diagram commute:*

$$\begin{array}{ccc} \underline{S}_0(C) & \xrightarrow{\ \underline{S}_1(f)\ } & \underline{S}_0(D) \\ \downarrow & & \downarrow \\ \mathcal{L}_{\underline{S}_0(C)} & \xrightarrow{\ (\underline{\mathcal{L}}_{\underline{S}})_1(f)\ } & \mathcal{L}_{\underline{S}_0(D)} \end{array} \tag{12.116}$$

*2. The same is true if* L<sub>*S*</sub> *is subject to relations defined by* equalities *among elements of* L<sub>*S*</sub> *(as long as these equalities generate a congruence).*

*Proof.* The proof is an elaborate verification, which may be summarized as follows.


We will apply this lemma to T = T(*A*), as in (12.1), with C = C (*A*). This hinges on a lemma of independent interest, which we first state for Sets, i.e., for "ordinary" commutative unital C\*-algebras *A*, to be subsequently internalized to our topos T(*A*).

Lemma 12.16. *The lattice LA in* (12.44) *is (constructively) isomorphic to the lattice L <sup>A</sup> freely generated by the symbols* D*a, a* ∈ *A*sa *and the relations* (12.45) *-* (12.50)*.*

*Proof.* The point is that the map *a* ↦ D<sub>*a*</sub> from *A*<sub>sa</sub> to *L <sup>A</sup>* is surjective; this follows from the relations (12.45) - (12.50) through their consequences (12.51) - (12.55). The pertinent isomorphism *L <sup>A</sup>* ≅ *L<sub>A</sub>* is then given by mapping D<sub>*a*</sub> ↔ [*a*<sup>+</sup>] on generators (note that in the original discussion of *L<sub>A</sub>* following (12.44) this map was the *definition* of D<sub>*a*</sub>; this time, these play an independent role as generators of the lattice *L <sup>A</sup>*, and in the present proof they are *related* to the elements [*a*<sup>+</sup>] ∈ *L<sub>A</sub>*). □

Now let *A* be a (not necessarily commutative) unital C\*-algebra (in Sets), with ensuing internal commutative C\*-algebra <u>*A*</u> in the functor topos T(*A*), cf. Theorem 12.3. Our goal is to apply the constructive definition of the Gelfand spectrum Σ(*A*), or rather of its topology O(Σ(*A*)) (seen as a frame, so that Σ(*A*) is seen as a locale) in §12.2 to <u>*A*</u>. The first step concerns the lattice *L<sub>A</sub>*, which in T(*A*) is denoted by <u>*L*</u><sub><u>*A*</u></sub>. Here and in what follows, we try to avoid notational confusion by writing D<sub>*a*</sub> for the formal variable indexed by *a* (which is a variable of type <u>*A*</u> in T(*A*)), whilst writing *D<sub>c</sub>* for the actual element [*c*<sup>+</sup>] of *L<sub>C</sub>* if we apply (12.44) etc. to *C* ∈ C (*A*).

Proposition 12.17. *For each C* ∈ C (*A*) *one has*

$$L\_{\underline{A}}(\mathbb{C}) = L\_{\mathbb{C}},\tag{12.117}$$

*where L<sub>C</sub> is defined in* Sets *through* (12.44) *(with A replaced by C), where it may be computed through Lemma 12.16. Furthermore, if C* ⊆ *D, then the map L<sub>A</sub>*(*C*) → *L<sub>A</sub>*(*D*) *given by the functoriality of L<sub>A</sub>, i.e., L<sub>C</sub>* → *L<sub>D</sub>, maps each generator D<sub>c</sub> in L<sub>C</sub> (where c* ∈ *C*<sub>sa</sub>*) to the same generator in L<sub>D</sub>. This is well defined, because c* ∈ *D*<sub>sa</sub>*, and this inclusion preserves the relations* (12.45) *-* (12.50)*. We write this as L<sub>C</sub>* ↪ *L<sub>D</sub>.*

*Proof.* Internalizing Lemma 12.16 to our functor topos T(*A*), it follows that the internal lattice *L<sub>A</sub>* in T(*A*) is isomorphic to a distributive lattice freely generated by generators and relations given by equalities. Hence Lemma 12.15 applies to it. □

The next step is to move from *LA* to the corresponding frame of regular ideals, cf. Theorem 12.8. Abbreviating O(Σ(*A*)) ≡ O(Σ), we first rewrite (12.60) as

$$\mathcal{O}(\Sigma) \cong \{ U \in \text{Idl}(\mathcal{L}\_A) \mid \forall\_{q>0} \mathsf{D}\_{a-q} \in U \Rightarrow \mathsf{D}\_a \in U \}. \tag{12.118}$$

To apply this to our functor topos T(*A*), we apply Kripke–Joyal semantics for the internal language of the topos T(*A*) (which is reviewed in §E.5) to the formula D<sub>*a*</sub> ◁ *U*. This is a formula ϕ with two free variables, namely D<sub>*a*</sub> of type *L<sub>A</sub>*, and *U* of type

$$\mathcal{P}(\underline{L}_{\underline{A}}) \equiv \underline{\Omega}^{\underline{L}_{\underline{A}}}.\tag{12.119}$$

Hence in the forcing statement *C* ⊩ ϕ(α) in T(*A*), we have to insert

$$\alpha \in (\underline{L}_{\underline{A}} \times \underline{\Omega}^{\underline{L}_{\underline{A}}})(C) \cong L_C \times \mathrm{Sub}(\underline{L}_{\underline{A}|\uparrow C}),$$

where *LA*|↑*<sup>C</sup>* is the restriction of the functor

$$\underline{L}\_{\underline{A}} \colon \mathcal{C}(A) \to \mathbf{Sets} \tag{12.120}$$

to ↑*C* ⊂ C (*A*). Here we have used (12.117), as well as the isomorphism (12.11). Consequently, we have

$$\alpha = (D_c, \underline{U}), \tag{12.121}$$

where *D<sub>c</sub>* ∈ *L<sub>C</sub>* for some *c* ∈ *C*<sub>sa</sub>, and *U* : ↑*C* → Sets is a subfunctor of *L<sub>A</sub>*|<sub>↑*C*</sub>. In particular, *U*(*D*) ⊆ *L<sub>D</sub>* is defined whenever *D* ⊇ *C*, and the subfunctor condition on *U* simply boils down to *U*(*D*) ⊆ *U*(*E*) whenever *C* ⊆ *D* ⊆ *E*.

Lemma 12.18. *In the topos* T(*A*)*, the cover* ◁ *of Lemma 12.13 may be computed locally, in the sense that for any C* ∈ C (*A*)*, D<sub>c</sub>* ∈ *L<sub>C</sub> and U* ∈ Sub(*L<sub>A</sub>*|<sub>↑*C*</sub>)*, one has*

$$
C \Vdash (\mathsf{D}_a \lhd U)(D_c, \underline{U}) \ \text{ iff } \ D_c \lhd_C \underline{U}(C),
$$

*in that for all q* > 0 *there exists a finite U*<sub>0</sub> ⊆ *U*(*C*) *such that D*<sub>*c*−*q*</sub> ≤ ⋁*U*<sub>0</sub>*.*

*Proof.* We assume that ⋁*U*<sub>0</sub> ∈ *U*, so that we may replace ⋁*U*<sub>0</sub> by D<sub>*b*</sub> = ⋁*U*<sub>0</sub>; the general case is analogous. We then have to inductively analyze the formula D<sub>*a*</sub> ◁ *U*, which, under the stated assumption, in view of Lemma 12.13 may be taken to mean

$$\forall_{q>0} \exists_{\mathsf{D}_b \in L_A} (\mathsf{D}_b \in U \land \mathsf{D}_{a-q} \leqslant \mathsf{D}_b). \tag{12.122}$$

We now infer from the rules for Kripke–Joyal semantics in a functor topos that

$$C \Vdash (\mathsf{D}_a \in U)(D_c, \underline{U}) \tag{12.123}$$

iff for all *D* ⊇ *C* one has *D<sub>c</sub>* ∈ *U*(*D*); since *U*(*C*) ⊆ *U*(*D*), this happens to be the case iff *D<sub>c</sub>* ∈ *U*(*C*). Furthermore,

$$C \Vdash (\mathsf{D}_b \leqslant \mathsf{D}_a)(D_{c'}, D_c) \tag{12.124}$$

iff *D<sub>c′</sub>* ≤ *D<sub>c</sub>* in *L<sub>C</sub>*. Also,

$$C \Vdash (\exists_{\mathsf{D}_b \in L_A} \mathsf{D}_b \in U \land \mathsf{D}_{a-q} \leqslant \mathsf{D}_b)(D_c, \underline{U}) \tag{12.125}$$

iff there is *D<sub>c′</sub>* ∈ *U*(*C*) such that *D*<sub>*c*−*q*</sub> ≤ *D<sub>c′</sub>*. Finally,

$$C \Vdash (\forall_{q>0} \exists_{\mathsf{D}_b \in L_A} \mathsf{D}_b \in U \land \mathsf{D}_{a-q} \leqslant \mathsf{D}_b)(D_c, \underline{U}) \tag{12.126}$$

iff for all *D* ⊇ *C* and all *q* > 0 there is *D<sub>d</sub>* ∈ *U*(*D*) such that *D*<sub>*c*−*q*</sub> ≤ *D<sub>d</sub>*, where *D<sub>c</sub>* ∈ *L<sub>C</sub>* is seen as an element of *L<sub>D</sub>* through the injection *L<sub>C</sub>* ↪ *L<sub>D</sub>* of Proposition 12.17, and *U* ∈ Sub(*L<sub>A</sub>*|<sub>↑*C*</sub>) is seen as an element of Sub(*L<sub>A</sub>*|<sub>↑*D*</sub>) by restriction. This, however, is true at all *D* ⊇ *C* iff it is true at *C*, because *U*(*C*) ⊆ *U*(*D*) and hence one can take *D<sub>d</sub>* = *D<sub>c′</sub>* for the *D<sub>c′</sub>* ∈ *L<sub>C</sub>* that makes the condition true at *C*. □

Lemma 12.19. *The spectrum* O(Σ) *of A in* T(*A*) *may be computed as follows:*

*1. At C* ∈ C (*A*)*, the set* O(Σ)(*C*) *consists of those subfunctors U* ∈ Sub(*LA*|↑*C*) *such that for all D* ⊇ *C and all Dd* ∈ *LD one has:*

$$D\_d \lhd\_D \underline{U}(D) \Rightarrow D\_d \in \underline{U}(D).$$

*2. At* C · 1*, the set* O(Σ)(C · 1) *consists of those subfunctors U* ∈ Sub(*L<sub>A</sub>*) *such that for all C* ∈ C (*A*) *and all D<sub>c</sub>* ∈ *L<sub>C</sub> one has:*

$$D_c \lhd_C \underline{U}(C) \Rightarrow D_c \in \underline{U}(C).$$

*3. The condition that U* = {*U*(*C*) ⊆ *L<sub>C</sub>*}<sub>*C*∈C (*A*)</sub> *be a subfunctor of L<sub>A</sub> comes down to the requirement that:*

$$C \subseteq D \Rightarrow \underline{U}(C) \subseteq \underline{U}(D) \dots$$


$$
\pi^*_\Sigma : \mathcal{O}(\mathcal{C}(A)) \to \mathcal{O}(\underline{\Sigma})(\mathbb{C} \cdot 1),
\tag{12.127}
$$

*given on the basic opens* ↑*D* ∈ O(C (*A*)) *by*

$$\begin{aligned} \pi_{\Sigma}^*(\uparrow D) = \chi_{\uparrow D} : E &\mapsto \top \ (E \supseteq D);\\ E &\mapsto \bot \ (E \not\supseteq D), \end{aligned} \tag{12.128}$$

*where the top and bottom* ⊤, ⊥ *at E are given by* {*L<sub>E</sub>*} *and* ∅*, respectively.*

*Proof.* By (12.75), O(Σ) is the subobject of Ω<sup>*L<sub>A</sub>*</sup> defined by the formula ϕ given by

$$\forall_{\mathsf{D}_a \in L_A}\, \mathsf{D}_a \lhd U \Rightarrow \mathsf{D}_a \in U,\tag{12.129}$$

whose interpretation in T(*A*) is an arrow from Ω<sup>*L<sub>A</sub>*</sup> to Ω. In view of (12.11), we may identify an element *U* ∈ O(Σ)(*C*) with a subfunctor of *L<sub>A</sub>*|<sub>↑*C*</sub>, and by (12.129) and Kripke–Joyal semantics in functor topoi, we have *U* ∈ O(Σ)(*C*) iff *C* ⊩ ϕ(*U*), with ϕ given by (12.129). Unfolding this using Kripke–Joyal semantics and using Lemma 12.18 (including part 1 of its proof), we find that *U* ∈ O(Σ)(*C*) iff

$$\forall_{D \supseteq C} \forall_{D_d \in L_D} \forall_{E \supseteq D}\, D_d \lhd_E \underline{U}(E) \Rightarrow D_d \in \underline{U}(E),\tag{12.130}$$

where *D<sub>d</sub>* is regarded as an element of *L<sub>E</sub>*. This condition, however, is equivalent to the apparently weaker condition

$$\forall_{D \supseteq C} \forall_{D_d \in L_D}\, D_d \lhd_D \underline{U}(D) \Rightarrow D_d \in \underline{U}(D);\tag{12.131}$$

indeed, condition (12.130) clearly implies (12.131), but the latter applied at *D* = *E* actually implies the first, since *D<sub>d</sub>* ∈ *L<sub>D</sub>* also lies in *L<sub>E</sub>*.

Clauses 2 to 4 should now be obvious. Clause 5 follows by the explicit prescription for the external description of frames (which has been recalled in the previous section, after its initial description at the end of §E.4). Note that each O(Σ)(*C*) is a frame in Sets, inheriting the frame structure of the ambient frame Sub(*L<sub>A</sub>*|<sub>↑*C*</sub>). □

We now present the computation of O(Σ) ≡ O(Σ(*A*)) for general unital C\*-algebras *A*. To explain the final formula, topologize the disjoint union

$$\Sigma^A = \bigsqcup\_{\mathcal{C}\in\mathcal{C}(A)} \Sigma(\mathcal{C}),\tag{12.132}$$

where Σ(*C*) is the Gelfand spectrum of *C* ∈ C (*A*), as follows, abbreviating

$$
\mathcal{U}_C \equiv \mathcal{U} \cap \Sigma(C).\tag{12.133}
$$

One has U ∈ O(Σ<sup>*A*</sup>) iff the following two conditions are satisfied for all *C* ∈ C (*A*):

1. U<sub>*C*</sub> ∈ O(Σ(*C*)).
2. For all *D* ⊇ *C*, if λ ∈ U<sub>*C*</sub> and λ′ ∈ Σ(*D*) are such that λ′|<sub>*C*</sub> = λ, then λ′ ∈ U<sub>*D*</sub>.

In fact, O(Σ*A*) is simply the weakest topology making the canonical projection

$$
\pi \colon \Sigma^A \to \mathcal{C}(A); \tag{12.134}
$$

$$
\pi(\sigma) = C \ (\sigma \in \Sigma(C) \subset \Sigma^A),
\tag{12.135}
$$

continuous with respect to the Alexandrov topology on C (*A*). For *U* ∈ O(C (*A*)),

$$\Sigma\_U^A = \bigsqcup\_{C \in U} \Sigma(C) \tag{12.136}$$

is a subset of Σ*A*, with relative topology inherited from Σ*A*. In particular, for the basic opens *U* = ↑*C* of the Alexandrov topology on C (*A*) we have

$$\Sigma^A\_{\uparrow C} = \bigsqcup\_{D \supseteq C} \Sigma(D). \tag{12.137}$$
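The two conditions defining the topology of Σ<sup>*A*</sup>, stated after (12.133), can be explored by brute force in a finite example. The sketch below is an illustration under assumptions not made in the text: it takes the commutative toy algebra *A* = C<sup>*n*</sup>, where a context *C* is a partition of {0, …, *n*−1}, the Gelfand spectrum Σ(*C*) is the (discrete) set of blocks of *C*, and the restriction to *C* of a character of *D* ⊇ *C* is the block of *C* containing the corresponding block of *D*. Condition 1 is then automatic, so only condition 2 is tested. All names are hypothetical.

```python
from itertools import product

def partitions(items):
    """Yield every set partition of the list `items` as a list of frozenset blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):            # add `first` to an existing block
            yield part[:i] + [part[i] | {first}] + part[i + 1:]
        yield part + [frozenset({first})]     # or open a new block

def contexts(n):
    """Toy C(A) for A = C^n: one context per partition of {0, ..., n-1}."""
    return [frozenset(p) for p in partitions(list(range(n)))]

def contains(c, d):
    """C ⊆ D as subalgebras iff the partition d refines the partition c."""
    return all(any(b <= a for a in c) for b in d)

def points(ctxs):
    """The disjoint union (12.132): pairs (C, λ) with λ a block, i.e. a point of Σ(C)."""
    return [(c, lam) for c in ctxs for lam in c]

def is_open(u, ctxs):
    """Condition 2: if (C, λ) is in u, D ⊇ C and λ' restricts to λ, then (D, λ') is in u."""
    for (c, lam) in u:
        for d in ctxs:
            if contains(c, d):
                for lam2 in d:
                    if lam2 <= lam and (d, lam2) not in u:
                        return False
    return True

def open_sets(n):
    """All open subsets of the toy Σ^A, found by testing every subset of points."""
    ctxs = contexts(n)
    pts = points(ctxs)
    opens = []
    for bits in product([0, 1], repeat=len(pts)):
        u = {p for p, bit in zip(pts, bits) if bit}
        if is_open(u, ctxs):
            opens.append(frozenset(u))
    return opens
```

For *n* = 2 the space has three points and five open sets, and the only open set containing the single point of Σ(C · 1) is the whole space.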

Theorem 12.20. *Let A be a unital C\*-algebra. The internal Gelfand spectrum* O(Σ(*A*)) *of our internal commutative C\*-algebra A in the topos* T(*A*) *is the functor*

$$\mathcal{O}(\underline{\Sigma}(\underline{A}))_0: C \mapsto \mathcal{O}(\Sigma^A_{\uparrow C}),\tag{12.138}$$

*i.e., the frame (in* Sets*) of the open sets of* Σ<sup>*A*</sup><sub>↑*C*</sub> *in the topology defined after* (12.132)*; if C* ⊆ *D, the arrow-part of the functor in question is given by*

$$\mathcal{O}(\underline{\Sigma}(\underline{A}))_1 \colon \mathcal{O}(\Sigma^A_{\uparrow C}) \to \mathcal{O}(\Sigma^A_{\uparrow D});\tag{12.139}$$

$$
\mathcal{U} \mapsto \mathcal{U} \cap \Sigma^A_{\uparrow D}. \tag{12.140}
$$

*Similarly, in the description of* T(*A*) *as the category of sheaves* Sh(C (*A*))*, cf.* (E.84)*, the Gelfand spectrum is given by the sheaf (where U* ⊆ *V in* (12.142)*):*

$$\mathcal{O}(\underline{\Sigma}(\underline{A}))_0: U \mapsto \mathcal{O}(\Sigma_U^A) \ (U \in \mathcal{O}(\mathcal{C}(A)));\tag{12.141}$$

$$\mathcal{O}(\underline{\Sigma}(\underline{A}))_1 : \mathcal{U} \mapsto \mathcal{U} \cap \Sigma_U^A \ (\mathcal{U} \in \mathcal{O}(\Sigma_V^A)).\tag{12.142}$$

*Proof.* The proof is based on Lemma 12.19, which implies that the internal frame RIdl(*LA*) in T(*A*) is given by the functor

$$\underline{\operatorname{RIdl}}(\underline{L}_{\underline{A}}): C \mapsto \{ \underline{F} \in \operatorname{Sub}(\underline{L}_{\underline{A}|\uparrow C}) \mid \underline{F}(D) \in \operatorname{RIdl}(L_D) \text{ for all } D \supseteq C \}. \tag{12.143}$$

Here, since *D* is a commutative unital C\*-algebra in Sets, according to (12.60) the set RIdl(*LD*) may be identified with the topology O(Σ(*D*)), where Σ(*D*) is the Gelfand spectrum of *D* in the usual sense. We will make this identification in the following step, which is the last step of the proof of Theorem 12.20.

Lemma 12.21. *The transformation* θ : RIdl(*LA*) → O(Σ(*A*)) *with components*

$$\begin{aligned} \theta_C &: \{ \underline{F} \in \text{Sub}(\underline{L}_{\underline{A} \mid \uparrow C}) \mid \underline{F}(D) \in \mathcal{O}(\Sigma(D)) \text{ for all } D \supseteq C \} \to \mathcal{O}(\Sigma^{A}_{\uparrow C});\\ \underline{F} &\mapsto \bigsqcup_{D \supseteq C} \underline{F}(D), \end{aligned} \tag{12.144}$$

*is a natural isomorphism of functors—i.e., an isomorphism of objects in* T(*A*)*.*

Since RIdl(*L<sub>A</sub>*) and O(Σ) are internal frames in T(*A*), it suffices to prove that each θ<sub>*C*</sub> is an isomorphism of frames in Sets. Unfortunately, even this proof is a very lengthy (though straightforward) affair, for which we refer to the literature. □

Corollary 12.22. *The external description (in* Sets*) of the internal locale* Σ(*A*) *(in* T(*A*)*) is given by the canonical projection* (12.134)*.*

Note that both Σ*<sup>A</sup>* and C (*A*) are topological spaces, so that (12.134) is a *bona fide* continuous map between spaces. This is worth stressing, since in general, an external description of an internal locale in a sheaf topos, though defined in Sets, is a map between locales (or, equivalently, between frames) that are not necessarily topological spaces. But in the case (12.134) at hand they are, so at least this time there is no confusion between O(*X*) as both formal notation for a frame (not necessarily coming from a topology) and notation for the topology of a space *X*; see §C.11.

Note that (12.95) is a special case of Theorem 12.20 or Corollary 12.22, for

$$A = M\_n(\mathbb{C}).\tag{12.145}$$

To see this, we identify U = ⊔<sub>*C*∈C (*A*)</sub> U<sub>*C*</sub> as an element of O(Σ<sup>*A*</sup>) with

$$S: \mathcal{C}(A) \to \mathcal{P}(A)$$

on the right-hand side of (12.95), where *S*(*C*) ∈ P(*C*) is the image of U<sub>*C*</sub> ∈ O(Σ(*C*)) under the isomorphism O(Σ(*C*)) → P(*C*) between the (discrete) topology of the (finite) Gelfand spectrum of *C* and the (Boolean) projection lattice of *C* derived earlier, see (12.80). Similarly, for *U* ∈ O(C (*A*)), the frame O(Σ<sup>*A*</sup><sub>*U*</sub>) may be identified with maps

$$S: U \to \mathcal{P}(A)$$

satisfying the conditions in (12.95). Of course, the special case (12.145) leading to (12.95) is very appealing, and was well worth treating in its own right!

Theorem 12.20 and Corollary 12.22 also give an explicit description of the general internal Gelfand isomorphism (12.78), whose real part in T(*A*) reads

$$\underline{A}\_{\rm sa} \cong C(\underline{\Sigma}, \underline{\mathbb{R}}) \equiv \text{Frm}(\mathcal{O}(\underline{\mathbb{R}}), \mathcal{O}(\underline{\Sigma})), \tag{12.146}$$

where the right-hand side, which denotes the object of frame homomorphisms from O(R) to O(Σ) within T(*A*), is the *definition* of the middle term (which is just a notation). To understand the situation in T(*A*), one has to distinguish between:


The connection between 1. and 2. is given by λ-conversion, i.e., the bijective correspondence between *C* → *B<sup>A</sup>* and *A* × *C* → *B*, cf. (E.153). Taking *C* = 1 (i.e. the terminal object in T(*A*)), we see that an *element* of the set Hom(*A*, *B*) corresponds to an *arrow* 1 → *B<sup>A</sup>*. Eq. (12.8) yields

$$\text{Frm}(\mathcal{O}(\underline{\mathbb{R}}), \mathcal{O}(\underline{\Sigma}))(\mathcal{C}) = \text{Nat}\_{\text{Frm}}(\mathcal{O}(\underline{\mathbb{R}})\_{\uparrow\mathcal{C}}, \mathcal{O}(\underline{\Sigma})\_{\uparrow\mathcal{C}}),\tag{12.147}$$

the set of all natural transformations between the functors O(R) and O(Σ), both restricted to ↑*C* ⊂ C (*A*), that are frame maps. This set can be computed from the external description of frames and frame maps in §E.4. Recall (12.4) etc. The frame O(R)<sub>↑*C*</sub> has external description

$$
\pi_{\mathbb{R}}^{-1} : \mathcal{O}(\uparrow C) \to \mathcal{O}(\uparrow C \times \mathbb{R}),
\tag{12.148}
$$

where π<sub>R</sub> : ↑*C* × R → ↑*C* is projection on the first component. The special case *C* = C · 1 yields the external description of O(R) itself, namely

$$
\pi_{\mathbb{R}}^{-1} : \mathcal{O}(\mathcal{C}(A)) \to \mathcal{O}(\mathcal{C}(A) \times \mathbb{R}),\tag{12.149}
$$

where this time (with abuse of notation) the projection is π<sub>R</sub> : C (*A*) × R → C (*A*). By Corollary 12.22, the Gelfand frame O(Σ)<sub>↑*C*</sub> has external description

$$
\pi_{\Sigma}^{-1} : \mathcal{O}(\uparrow C) \to \mathcal{O}(\Sigma)_{\uparrow C}, \tag{12.150}
$$

given by (12.128), with the understanding that *D* ⊇ *C* (the special case *C* = C · 1 then recovers the external description (12.99) of O(Σ) itself). It follows that there is a bijective correspondence between two classes of frame maps:

$$
\underline{\varphi}_C^{-1} : \mathcal{O}(\underline{\mathbb{R}})_{\uparrow C} \to \mathcal{O}(\underline{\Sigma})_{\uparrow C} \text{ (in } \mathsf{T}(A)\text{)}; \tag{12.151}
$$

$$\varphi_C^{-1} : \mathcal{O}(\uparrow C \times \mathbb{R}) \to \mathcal{O}(\Sigma)_{\uparrow C} \text{ (in } \mathsf{Sets}\text{)},\tag{12.152}$$

where ϕ<sub>*C*</sub> must satisfy the condition that for any *D* ⊇ *C*,

$$
\varphi_C^{-1}(\uparrow D \times \mathbb{R}) = \chi_{\uparrow D}. \tag{12.153}
$$

Indeed, such a map ϕ<sub>*C*</sub><sup>−1</sup> defines an element <u>ϕ</u><sub>*C*</sub><sup>−1</sup> of Nat(O(R)<sub>↑*C*</sub>, O(Σ)<sub>↑*C*</sub>) in the obvious way: for *D* ∈ ↑*C*, the components

$$\underline{\varphi}_C^{-1}(D) : \mathcal{O}(\underline{\mathbb{R}})(D) \to \mathcal{O}(\underline{\Sigma})(D) \tag{12.154}$$

of the natural transformation <u>ϕ</u><sub>*C*</sub><sup>−1</sup>, i.e.

$$
\underline{\varphi}_C^{-1}(D) : \mathcal{O}(\uparrow D \times \mathbb{R}) \to \mathcal{O}(\Sigma)_{\uparrow D}, \tag{12.155}
$$

are simply given by the restriction of ϕ<sub>*C*</sub><sup>−1</sup> to O(↑*D* × R) ⊂ O(↑*C* × R); cf. (E.147). This is consistent, because (12.153) implies that for any *U* ∈ O(R) and *C* ⊆ *D* ⊆ *E*,

$$
\varphi_C^{-1}(\uparrow E \times U)(F) \le \varphi_C^{-1}(\uparrow D \times \mathbb{R})(F),\tag{12.156}
$$

which by (12.153) vanishes whenever *F* ⊉ *D*. Consequently,

$$
\varphi_C^{-1}(\uparrow E \times U)(F) = 0 \ \text{ if } \ F \not\supseteq D,\tag{12.157}
$$

so that <u>ϕ</u><sub>*C*</sub><sup>−1</sup>(*D*) actually takes values in O(Σ)<sub>↑*D*</sub> (rather than in O(Σ)<sub>↑*C*</sub>, as might be expected). Denoting the set of frame maps (12.152) that satisfy (12.153) by Frm′(O(↑*C* × R), O(Σ)<sub>↑*C*</sub>), we obtain a functor

$$\text{Frm}'(\mathcal{O}(\uparrow(-) \times \mathbb{R}), \mathcal{O}(\Sigma)_{\uparrow-}) : \mathcal{C}(A) \to \mathsf{Sets},\tag{12.158}$$

with the stipulation that for *C* ⊆ *D* the induced map

$$\operatorname{Frm}'(\mathcal{O}(\uparrow C \times \mathbb{R}), \mathcal{O}(\Sigma)_{\uparrow C}) \to \operatorname{Frm}'(\mathcal{O}(\uparrow D \times \mathbb{R}), \mathcal{O}(\Sigma)_{\uparrow D})$$

is given by restricting an element of the left-hand side to O(↑*D* × R) ⊂ O(↑*C* × R); this is consistent by the same argument (12.157).

The Gelfand isomorphism (12.78) is therefore a natural transformation

$$\underline{\mathbf{A}} \stackrel{\cong}{\longrightarrow} \operatorname{Frm}^{\prime}(\mathcal{O}(\uparrow - \times \mathbb{R}), \mathcal{O}(\Sigma)\_{\uparrow -}),\tag{12.159}$$

which means that one has a compatible (i.e. natural) family of isomorphisms

$$C \stackrel{\cong}{\longrightarrow} \text{Frm}'(\mathcal{O}(\uparrow C \times \mathbb{R}), \mathcal{O}(\Sigma)_{\uparrow C});$$

$$a \mapsto \hat{a}^{-1} : \mathcal{O}(\uparrow C \times \mathbb{R}) \to \mathcal{O}(\Sigma)_{\uparrow C}.\tag{12.160}$$

On basic opens ↑*D* × *U* ∈ O(↑*C* × R), with *D* ⊇ *C*, we obtain

$$\begin{aligned} \hat{a}^{-1}(\uparrow D \times U) &: E \mapsto 1_U(a) \text{ if } E \supseteq D;\\ E &\mapsto 0 \text{ if } E \not\supseteq D. \end{aligned} \tag{12.161}$$

Here 1<sub>*U*</sub>(*a*) is the spectral projection of *a* in *U*, cf. (12.82); as it lies in P(*C*) and *C* ⊆ *D* ⊆ *E*, the projection 1<sub>*U*</sub>(*a*) certainly lies in P(*E*), as required. Furthermore, we need to extend *â*<sup>−1</sup> to general opens in ↑*C* × R by the frame map property, and note that (12.153) for ϕ<sub>*C*</sub><sup>−1</sup> = *â*<sup>−1</sup> is satisfied.

This analysis also holds in the topos Sh(C (*A*)) of sheaves in C (*A*) (as always, equipped with the Alexandrov topology, cf. (E.84). It then follows from (12.159) and (12.141) that as a sheaf,

$$\mathcal{C}(\underline{\Sigma}, \underline{\mathbb{C}}): U \mapsto \mathcal{C}(\Sigma\_U^A, \mathbb{C}), \tag{12.162}$$

where Σ*<sup>A</sup> <sup>U</sup>* is given by (12.136); if *<sup>U</sup>* <sup>⊆</sup>*V*, the map *<sup>C</sup>*(Σ*<sup>A</sup> <sup>V</sup>* ,C) <sup>→</sup>*C*(Σ*<sup>A</sup> <sup>U</sup>* ,C) is given by the pullback of the inclusion Σ*<sup>A</sup> <sup>U</sup>* <sup>→</sup> <sup>Σ</sup>*<sup>A</sup> <sup>V</sup>* (that is, by restriction). It then follows from (12.162) that the isomorphism (12.146) is given by its components

$$\underline{A}(U) \cong \mathcal{C}(\Sigma\_U^A, \mathbb{C}). \tag{12.163}$$

In particular, the component of the natural isomorphism in (12.146) at *U* = ↑*C* is

$$\mathcal{C} \cong \mathcal{C}(\Sigma^A\_{\uparrow C}, \mathbb{C}). \tag{12.164}$$

A glance at the topology of $\Sigma^A$ shows that the so-called *Hausdorffication*, which for a general compact space may be defined either directly, or C\*-algebraically by *X<sup>H</sup>* = Σ(*C*(*X*)), and coincides with the left adjoint of the forgetful functor from the category of compact Hausdorff spaces (and continuous maps) to the category of compact spaces (and continuous maps), is given by $(\Sigma^A\_{\uparrow C})^H \cong \Sigma(C)$, so that

$$C(\Sigma\_{\uparrow C}^{A}, \mathbb{C}) \cong C(\Sigma(C), \mathbb{C}),\tag{12.165}$$

where the isomorphism is given by restricting $f \in C(\Sigma^A\_{\uparrow C}, \mathbb{C})$ to $\Sigma(C) \subset \Sigma^A\_{\uparrow C}$.

Corollary 12.23. *The internal Gelfand isomorphism*

$$\underline{A} \stackrel{\cong}{\longrightarrow} \mathcal{C}(\underline{\Sigma}, \underline{\mathbb{C}}), \tag{12.166}$$

*which is a natural isomorphism between functors* C (*A*) → Sets*, is given at each C* ∈ C (*A*) *by the usual Gelfand isomorphism for the commutative C\*-algebra C:*

$$\underline{A}\_{0}(\mathcal{C}) = \mathcal{C} \stackrel{\cong}{\longrightarrow} \mathcal{C}(\Sigma(\mathcal{C}), \mathbb{C}) \cong \mathcal{C}(\underline{\Sigma}, \underline{\mathbb{C}})\_{0}(\mathcal{C}).\tag{12.167}$$

At the end of the day, the Gelfand isomorphism (12.146) therefore turns out to simply assemble all isomorphisms (12.167) for the commutative C\*-subalgebras *C* of *A* into a single sheaf-theoretic construction. Incidentally, taking *C* = C · 1 in (12.164) shows that $(\Sigma^A)^H$ is a point, which is also obvious from the fact that any open set containing the point $\Sigma(\mathbb{C} \cdot 1)$ of $\Sigma^A$ must be all of $\Sigma^A$.

#### 12.5 "Daseinisation" and Kochen–Specker Theorem

The internal Gelfand transform (12.166) constructed in the previous section acts on each commutative subalgebra *C* ∈ C (*A*). What about *A* itself? There is a more subtle transform, inspired by the remarkable "Daseinisation" construction of Döring and Isham (whose name has unfortunately been inspired by the controversial German philosopher Heidegger), which turns self-adjoint elements *a* of *A* into continuous functions δ(*a*) on the topos-theoretical phase space $\Sigma^A$, whose range is the so-called *interval domain* IR (which is a fuzzy version of R). Hence we will define a map

$$\delta: A\_{\rm sa} \to C(\Sigma^A, \mathbb{IR}), \tag{12.168}$$

which, alas, is defined only if *A* is a von Neumann algebra; we shall therefore assume this throughout this section. Similarly, the notation C (*A*) will now stand for the poset of abelian *von Neumann* subalgebras of *A* (as opposed to abelian *C\** subalgebras of *A*, as in the remainder of this book).

"Daseinisation" requires two slightly unusual concepts, the first of which is the said *interval domain* IR. To motivate its definition, consider Brouwer's approximation of real numbers by nested intervals with endpoints in Q. For example, the real number π can be described by specifying the sequence

$$[3, 4], [3.1, 3.2], [3.14, 3.15], [3.141, 3.142], \dots$$

This description of the reals is formalized by IR, defined as the poset whose elements are *compact* intervals [*a*,*b*] in R (including singletons [*a*,*a*] = {*a*}), ordered by *reverse* inclusion (for a smaller interval means that we have more information about the real number that the ever smaller intervals converge to). This poset is a so-called *dcpo* (for *directed complete partial order*); directed suprema are simply intersections. As such, it carries the *Scott topology*, whose open sets are upper subsets *U* of IR with the additional property that for every directed set *D* with $\bigvee D \in U$ the intersection *D* ∩ *U* is nonempty. This means that each open interval (*p*,*q*) in R (with *p* = −∞ and *q* = +∞ allowed) corresponds to a Scott open

$$U\_{(p,q)} = \{ [a,b] \mid p < a \le b < q \}. \tag{12.169}$$
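
The order on IR and the basic opens (12.169) can be made concrete in a few lines. The following is a minimal sketch, assuming intervals are represented as pairs of exact rationals; the function names `below`, `directed_sup` and `in_scott_open` are ours and purely illustrative. It uses the approximations of π listed above.

```python
from fractions import Fraction as F

def below(x, y):
    """x <= y in IR (reverse inclusion): the interval y is contained in x."""
    return x[0] <= y[0] and y[1] <= x[1]

def directed_sup(chain):
    """Supremum of a nested (hence directed) family: its intersection."""
    return (max(a for a, _ in chain), min(b for _, b in chain))

def in_scott_open(p, q, x):
    """Membership of x = [a, b] in U_(p,q) = {[a, b] | p < a <= b < q}."""
    a, b = x
    return p < a <= b < q

# Brouwer-style approximations of pi, copied from the sequence in the text.
approximations = [
    (F(3), F(4)),
    (F(31, 10), F(32, 10)),
    (F(314, 100), F(315, 100)),
    (F(3141, 1000), F(3142, 1000)),
]

# Each step carries more information, i.e. the sequence increases in IR.
is_increasing = all(below(x, y)
                    for x, y in zip(approximations, approximations[1:]))
best = directed_sup(approximations)

# U_(3.1, 3.2) is an upper set: once the sequence enters it, it stays inside.
membership = [in_scott_open(F(31, 10), F(32, 10), x) for x in approximations]
```

Here `membership` records at which stage the approximations enter the Scott open determined by (3.1, 3.2); being an upper set, the open contains every later (smaller) interval as well.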

Indeed, these opens form a basis of the Scott topology OScott(IR) ≡ O(IR) of IR. This topology is, of course, a frame, so far defined in Sets. However, this frame is easily internalized to any (pre)sheaf topos, similar to the Dedekind reals (12.3) - (E.149); in particular, in T(*A*) we have

$$\mathcal{O}(\underline{\mathbb{IR}})\_0 : C \mapsto \mathcal{O}((\uparrow C) \times \mathbb{IR}),\tag{12.170}$$

with external description as a locale (see §E.4) given by the canonical projection

$$
\pi\_1 : \mathcal{C}(A) \times \mathbb{IR} \to \mathcal{C}(A). \tag{12.171}
$$

The second ingredient of "Daseinisation" is the *spectral order* on *A*sa. The partial order ≤ defined in §C.7 (in which *a* ≤ *b* iff ω(*a*) ≤ ω(*b*) for all states ω on *A*) has good linearity properties in that it makes *A*<sup>+</sup> a convex cone in the real vector space *A*sa (cf. Definition C.50), but it is terrible from a lattice point of view (unless *A* is abelian): for example, for *A* = *B*(*H*), suprema *a*∨*b* and infima *a*∧*b* exist iff either *a* ≤ *b* or *b* ≤ *a* (and indeed *A*sa is a lattice with respect to ≤ iff *A* is abelian). However, there is a different order on *A*sa that turns it into a *conditionally* (or *boundedly*) *complete lattice*, i.e., a poset *X* with the property that if some subset *S* ⊆ *X* has an upper bound (i.e., there is *x* ∈ *X* such that *s* ≤ *x* for each *s* ∈ *S*), then it has a *lowest* upper bound (i.e., $\bigvee S$ exists), and similarly for (greatest) lower bounds.

Definition 12.24. *For a*,*b* ∈ *A*sa *we say that a* ≤*<sup>s</sup> b (i.e., a is less than or equal to b in the* spectral order*) iff* $a^n \le b^n$ *for each* $n \in \mathbb{N}$*.*
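
To see that the spectral order is genuinely stronger than ≤ in the noncommutative case, one may test the condition of Definition 12.24 on small matrices. The sketch below is only an illustration under our own assumptions: it works with real symmetric 2×2 matrices with integer entries, the two pairs of matrices are chosen by us, and only finitely many powers are checked, so the test can refute *a* ≤*<sup>s</sup> b* but cannot by itself establish it.

```python
def mul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power(a, n):
    """n-th power of a 2x2 matrix (n >= 0)."""
    result = [[1, 0], [0, 1]]
    for _ in range(n):
        result = mul(result, a)
    return result

def leq(a, b):
    """Usual order a <= b for real symmetric 2x2 matrices: b - a is positive
    semidefinite iff its trace and determinant are both nonnegative."""
    d = [[b[i][j] - a[i][j] for j in range(2)] for i in range(2)]
    trace = d[0][0] + d[1][1]
    det = d[0][0] * d[1][1] - d[0][1] * d[1][0]
    return trace >= 0 and det >= 0

def spectral_leq_up_to(a, b, n_max):
    """Checks a^n <= b^n for n = 1, ..., n_max (a finite truncation only)."""
    return all(leq(power(a, n), power(b, n)) for n in range(1, n_max + 1))

# A noncommuting pair of positive matrices with a <= b, since b - a = diag(1, 0).
a = [[1, 1], [1, 1]]
b = [[2, 1], [1, 1]]

# A commuting (diagonal) pair of positive matrices with c <= d.
c = [[1, 0], [0, 2]]
d = [[3, 0], [0, 2]]
```

For the first pair the usual order holds but the condition already fails at *n* = 2, whereas for the commuting pair every power that is checked respects the order, in line with the remark that ≤*<sup>s</sup>* and ≤ agree on commuting elements.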

It can be shown that *a* ≤*<sup>s</sup> b* iff $e^{(b)}(\lambda) \le e^{(a)}(\lambda)$ for each λ ∈ R (note the change of order), where $e^{(a)}(\lambda)$ is the spectral projection $1\_{(-\infty,\lambda] \cap \sigma(a)}(a)$, etc. This, in turn, implies that

$$a \le\_s b \text{ iff } \mu\_{\omega}(a \le \lambda) \ge \mu\_{\omega}(b \le \lambda),\tag{12.172}$$

for each (normal) state ω on *A* and each λ ∈ R, where

$$\mu\_{\omega}(a \le \lambda) = \omega(1\_{(-\infty,\lambda] \cap \sigma(a)}(a))\tag{12.173}$$

is the Born probability for the outcome *a* ≤ λ in state ω (and similarly for *b*). Furthermore, if *a* and *b* commute, or if *a* and *b* are both projections, then *a* ≤*<sup>s</sup> b* iff *a* ≤ *b*, i.e., ≤*<sup>s</sup>* coincides with the usual partial order ≤ iff *A* is abelian, and ≤*<sup>s</sup>* restricts to ≤ on the projection lattice P(*A*) of *A*. For each *a* ∈ *A*sa and *C* ∈ C (*A*), we define

$$\delta\_C^i(a) = \bigvee\{b \in C\_{\text{sa}} \mid b \le\_s a\};\tag{12.174}$$

$$\delta\_C^o(a) = \bigwedge \{ b \in C\_{\text{sa}} \mid a \le\_s b \}, \tag{12.175}$$

called the *inner* and *outer Daseinisation* of *a* with respect to *C*, respectively; those objecting to Heidegger might prefer to simply call these the inner and outer *localizations* of *a* with respect to *C*. For projections, these expressions simplify to

$$\delta\_C^i(e) = \bigvee \{ f \in \mathcal{P}(C) \mid f \leq\_s e \};\tag{12.176}$$

$$\delta\_C^o(e) = \bigwedge \{ f \in \mathcal{P}(C) \mid e \le\_s f \}, \tag{12.177}$$

and in fact one has a very nice categorical description, in that $\delta^i\_C : \mathcal{P}(A) \to \mathcal{P}(C)$ and $\delta^o\_C : \mathcal{P}(A) \to \mathcal{P}(C)$ are the right and left adjoint, respectively, of the inclusion functor $\mathcal{P}(C) \to \mathcal{P}(A)$ in the category of complete orthomodular lattices.

We are now in a position to define the map (12.168): for *a* ∈ *A*sa we put

$$\delta(a) : (C, \omega) \mapsto [\omega(\delta\_C^i(a)), \omega(\delta\_C^o(a))],\tag{12.178}$$

where (as the notation indicates) the point $(C,\omega) \in \Sigma(C) \subset \Sigma^A$ is just $\omega \in \Sigma(C)$.
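
For projections in a matrix algebra the formulas (12.176) - (12.178) can be evaluated by hand. The following sketch makes several assumptions of our own: *A* is the algebra of 3×3 matrices, the context *C* consists of the diagonal matrices (so that the projections in *C* are the diagonal projections onto index sets, and the points of Σ(*C*) are the maps reading off the *i*-th diagonal entry), and the two projections `E` and `D` are illustrative choices.

```python
from fractions import Fraction as F

# Projection onto span{e_0, (e_1 + e_2)/sqrt(2)} in C^3; it is not diagonal.
E = [[F(1), F(0), F(0)],
     [F(0), F(1, 2), F(1, 2)],
     [F(0), F(1, 2), F(1, 2)]]
n = len(E)

def inner_support(P):
    """Indices i with P e_i = e_i. The diagonal projection onto this index
    set is the largest diagonal projection below P, cf. (12.176)."""
    return {i for i in range(n)
            if all(P[j][i] == (1 if j == i else 0) for j in range(n))}

def outer_support(P):
    """Indices i whose row of P is nonzero. The diagonal projection onto this
    index set is the smallest diagonal projection above P, cf. (12.177)."""
    return {i for i in range(n) if any(P[i][j] != 0 for j in range(n))}

def daseinised_interval(P, i):
    """Value of delta(P) at the point (C, omega_i), cf. (12.178)."""
    return (int(i in inner_support(P)), int(i in outer_support(P)))

intervals = [daseinised_interval(E, i) for i in range(n)]

# A projection that already lies in C yields degenerate intervals [x, x].
D = [[F(1), F(0), F(0)],
     [F(0), F(0), F(0)],
     [F(0), F(0), F(1)]]
sharp = [daseinised_interval(D, i) for i in range(n)]
```

In this sketch the non-diagonal projection produces the genuinely fuzzy value [0,1] at two of the three points of Σ(*C*), while a projection belonging to the context only produces singletons.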

It is easily checked that the right-hand side of (12.178) makes sense, since positivity of states and (12.174) - (12.175) obviously imply $\omega(\delta^i\_C(a)) \le \omega(\delta^o\_C(a))$. Also, δ(*a*) is continuous, so that δ is well defined. If we define a closely related map

$$
\hat{\delta}(a) : \Sigma^A \to \mathcal{C}(A) \times \mathbb{IR};\tag{12.179}
$$

$$
\hat{\delta}(a)(C,\omega) = (C,\delta(a)(C,\omega)),\tag{12.180}
$$

then $\hat{\delta}(a)$ is the external description of an internal locale map

$$
\underline{\delta}(a) : \underline{\Sigma}(\underline{A}) \to \underline{\mathbb{IR}}.\tag{12.181}
$$

In view of this, we may regard (12.168) as a hybrid (i.e. "category mistake") map

$$
\underline{\delta}: A\_{\rm sa} \to C(\underline{\Sigma}(\underline{A}), \underline{\mathbb{IR}}); \tag{12.182}
$$

see the text below (12.146), with R replaced by IR, for the meaning of the right-hand side.

The relationship between δ and the Gelfand transform (12.166) is as follows. For *a* ∈ *A*sa, let $W^\*(a)$ be the unital commutative von Neumann algebra generated by $a = a^\*$ and $1\_A$ within *A*. Using (12.164), we then have a Gelfandish isomorphism

$$W^\*(a)\_{\rm sa} \stackrel{\simeq}{\longrightarrow} C(\Sigma^{A}\_{\uparrow W^\*(a)}, \mathbb{R});\tag{12.183}$$

$$c \mapsto \hat{c}.\tag{12.184}$$

In particular, since *a* ∈ *W*∗(*a*), we obtain a continuous function

$$\hat{a}: \Sigma^{A}\_{\uparrow W^\*(a)} \to \mathbb{R}. \tag{12.185}$$

Furthermore, we have an inclusion

$$
\iota : \mathbb{R} \hookrightarrow \mathbb{IR}; \tag{12.186}
$$

$$x \mapsto [x, x], \tag{12.187}$$

which is continuous, and hence induces a map $C(\Sigma^A,\mathbb{R}) \to C(\Sigma^A,\mathbb{IR})$, as well as maps $C(\Sigma^A\_{\uparrow W^\*(a)},\mathbb{R}) \to C(\Sigma^A\_{\uparrow W^\*(a)},\mathbb{IR})$. Then the following diagram commutes:

$$\begin{array}{ccc} \Sigma^{A}\_{\uparrow W^\*(a)} & \xrightarrow{\ \delta(a)\ } & \mathbb{IR} \\ & {\scriptstyle \hat{a}} \searrow & \big\uparrow {\scriptstyle \iota} \\ & & \mathbb{R} \end{array} \tag{12.188}$$

In words, the restriction of the "Daseinisation" $\delta(a) : \Sigma^A \to \mathbb{IR}$ of *a* to the open subset $\Sigma^A\_{\uparrow W^\*(a)} \subset \Sigma^A$ takes values in $\mathbb{R} \subset \mathbb{IR}$, and as such coincides with the Gelfand transform $\hat{a}$ of *a*, seen as a map (12.185). Hence, as might be expected in quantum mechanics, any fuzziness of δ(*a*) is only noticeable outside its own context $W^\*(a)$.

The "Daseinisation" construction enables one to interpret propositions *a* ∈ (*p*,*q*) as open subsets of the "phase space" $\Sigma^A$, as in classical physics, where $a : X \to \mathbb{R}$ would be a continuous function on a phase space *X*, and one would say that

$$[[a \in (p, q)]]\_{\text{CM}} = a^{-1}(p, q) \in \mathcal{O}(X). \tag{12.189}$$

In quantum mechanics, one would interpret *a* ∈ (*p*,*q*) as the spectral projection

$$[[a \in (p, q)]]\_{\text{QM}} = e^{(a)}\_{(p, q)} \equiv 1\_{(p, q) \cap \sigma(a)}(a),\tag{12.190}$$

or, equivalently, with the corresponding closed subset of the ambient Hilbert space. In our quantum toposophy setting, however, we may adapt (12.189) as

$$[[a \in (p,q)]]\_{\mathrm{QT}} = \delta(a)^{-1}(U\_{(p,q)}) \in \mathcal{O}(\Sigma^A). \tag{12.191}$$

Similarly, one may interpret *a* ∈ (*p*,*q*) as an internal open subset of the internal Gelfand spectrum Σ(*A*), as follows. For any locale *Y* in a topos T, an internal open in O(*Y*) is defined as an arrow 1 → O(*Y*), where as usual 1 is the terminal object in T. In the case at hand we have *Y* = Σ(*A*), and use the composition

$$\underline{1} \stackrel{\underline{(p,q)}}{\longrightarrow} \mathcal{O}(\underline{\mathbb{IR}}) \stackrel{\underline{\delta}(a)^{-1}}{\longrightarrow} \mathcal{O}(\underline{\Sigma}(\underline{A})),\tag{12.192}$$

where the natural transformation (*p*,*q*) has components

$$\underline{\mathbf{(p,q)}}\_{C}(\*) = \uparrow \mathbf{C} \times U\_{(p,q)},\tag{12.193}$$

cf. (12.170), and $\underline{\delta}(a)^{-1} : \mathcal{O}(\underline{\mathbb{IR}}) \to \mathcal{O}(\underline{\Sigma}(\underline{A}))$ is the frame version of the locale map (12.181), whose component at *C*, i.e.,

$$\underline{\delta}(a)\_C^{-1} : \mathcal{O}((\uparrow C) \times \mathbb{IR}) \to \mathcal{O}(\Sigma^A\_{\uparrow C}),\tag{12.194}$$

is given on basic opens in (↑*C*)×IR, with *D* ⊇ *C* and *p* < *q*, by

$$
\underline{\delta}(a)\_C^{-1}(\uparrow D \times U\_{(p,q)}) = \delta(a)^{-1}(U\_{(p,q)}) \cap \Sigma^A\_{\uparrow D}.\tag{12.195}
$$

We therefore obtain the quantum-toposophical interpretation of *a* ∈ (*p*,*q*) as:

$$[[\underline{a\in(p,q)}]]\_{\mathrm{QT}}:\underline{1}\to\mathcal{O}(\underline{\Sigma}(\underline{A}));\tag{12.196}$$

$$[[\underline{a \in (p,q)}]]\_{\mathrm{QT}} = \underline{\delta}(a)^{-1} \circ \underline{(p,q)}.\tag{12.197}$$

We are now going to combine this expression with a construction relating states ω ∈ *S*(*A*) to arrows from O(Σ(*A*)) to the truth object Ω in T(*A*). This construction generalizes the fundamental bijective correspondence between states on commutative (unital) C\*-algebras *A* and probability measures on its Gelfand spectrum Σ(*A*) (cf. Theorem B.24) to the non-commutative case.

To this end, we first need to replace probability measures on *spaces* by probability measures on *locales*. This, in turn, requires the *lower real numbers* R*l*, which may be identified with proper subsets *xl* ⊂ Q with the following two properties:

1. If *p* ∈ *xl* and *q* < *p*, then *q* ∈ *xl* (i.e., *xl* is a lower set);
2. If *p* ∈ *xl*, then there exists *q* ∈ *xl* with *q* > *p*.


In Sets, the lower reals may be identified with R (in Hilbert's definition) by identifying *xl* with its supremum *x* = sup *xl*, but in arbitrary toposes (that admit internal natural and hence rational numbers) they drift apart. Similarly, one defines the *upper real numbers* R*<sup>u</sup>* as proper upper subsets *xu* ⊂ Q such that *p* ∈ *xu* implies that there exists *q* ∈ *xu* with *p* > *q*; once again, in Sets, R*<sup>u</sup>* may be identified with Hilbert's R by taking *x* = inf *xu*. The *Dedekind real numbers* R*d*, then, are pairs (*xl*, *xu*) where *xl* ∈ R*<sup>l</sup>* and *xu* ∈ R*<sup>u</sup>* are such that $x\_l \cap x\_u = \emptyset$ and for each *p*,*q* ∈ Q with *p* < *q*, either *p* ∈ *xl* or *q* ∈ *xu*. In Sets these may be identified with sup *xl* = inf *xu* = *x*, so that R*<sup>d</sup>* ≅ R, but in many toposes R*l*, R*u*, and R*<sup>d</sup>* are all different. For example, we have already seen that in sheaf toposes Sh(*X*), the Dedekind reals are given by the sheaf (E.150), but the lower reals turn out to be defined by

$$(\mathbb{R}\_l)\_0: U \mapsto L(U, \mathbb{R}), \tag{12.198}$$

where *U* ∈ O(*X*) and *L*(*U*,R) is the set of all lower semicontinuous functions from *U* to R that are locally bounded from above (and similarly for R*u*, *mutatis mutandis*). In particular, in T(*A*) we have the functor

$$(\underline{\mathbb{R}}\_l)\_0 : C \mapsto L(\uparrow C, \mathbb{R}), \tag{12.199}$$

which is quite different from (12.3) (and similarly for R*u*).

Definition 12.25. *A* probability measure on a locale *X is a monotone map*

$$
\mu: \mathcal{O}(X) \to [0, 1]\_l,\tag{12.200}
$$

*where* [0,1]*<sup>l</sup> is the collection of lower reals between 0 and 1 (defined by replacing* Q *in the definition of* R*<sup>l</sup> by the set of all rationals* 0 ≤ *q* ≤ 1*), that satisfies*

$$
\mu\left(\top\right) = 1;\tag{12.201}
$$

$$
\mu(U) + \mu(V) = \mu(U \wedge V) + \mu(U \vee V); \tag{12.202}
$$

$$
\mu\left(\bigvee\_{\lambda}U\_{\lambda}\right) = \bigvee\_{\lambda}\mu(U\_{\lambda}),\tag{12.203}
$$

*for any directed family* $(U\_\lambda)$ *in* O(*X*)*.*
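
In Sets, where [0,1]*<sup>l</sup>* may be identified with [0,1], the axioms of Definition 12.25 can be checked mechanically on a finite example. The sketch below assumes a discrete three-point space (so that the frame of opens is the full power set) and an arbitrary choice of weights; both are illustrative assumptions of ours. In a finite frame every directed family contains its own supremum, so (12.203) follows from monotonicity and is not tested separately.

```python
from fractions import Fraction as F
from itertools import combinations

points = ("x", "y", "z")
weights = {"x": F(1, 2), "y": F(1, 3), "z": F(1, 6)}  # illustrative choice

# The frame of opens of the discrete three-point space: all subsets.
opens = [frozenset(s) for r in range(len(points) + 1)
         for s in combinations(points, r)]

def mu(U):
    """Candidate measure: the total weight of the points in the open U."""
    return sum((weights[p] for p in U), F(0))

top = frozenset(points)

# Normalization (12.201).
normalized = mu(top) == 1

# Modularity (12.202), with meet = intersection and join = union.
modular = all(mu(U) + mu(V) == mu(U & V) + mu(U | V)
              for U in opens for V in opens)

# Monotonicity with respect to inclusion of opens.
monotone = all(mu(U) <= mu(V) for U in opens for V in opens if U <= V)
```

Exact rational arithmetic is used so that the equalities in (12.201) and (12.202) can be tested without rounding issues.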

Compared with (probability) measures on σ-algebras, we see that (probability) measures on locales are merely defined on *open* sets (as opposed to *measurable* sets, which include opens), but this weakening is compensated for by the much stronger (i.e. uncountable) additivity axiom (12.203). Indeed, in Sets, if *X* is a compact Hausdorff space, one even has a bijective correspondence between *regular* probability measures μ on *X* as a space and probability measures μ on *X* as a locale.

This definition makes sense in constructive mathematics, and hence it may be internalized to T(*A*). Doing so, probability measures on the internal Gelfand spectrum Σ(*A*) turn out to correspond to the following notion (cf. Definition 2.26).

Definition 12.26. *A* quasi-state *on a unital C\*-algebra A is a map* ω : *A* → C *that is positive and normalized (*$\omega(1\_A) = 1$*), satisfies* ω(*b*+*ic*) = ω(*b*) +*i*ω(*c*) *for* $b^\* = b$ *and* $c^\* = c$*, and is linear on each commutative unital C\*-algebra in A.*

Theorem 12.27. *There is a bijective correspondence between quasi-states* ω *on A and probability measures* $\mu\_\omega$ *on the internal Gelfand spectrum* Σ(*A*)*.*

The proof uses the fact that given the (Alexandrov) topology on C (*A*), a function ↑*C* → [0,1] is lower semicontinuous iff it is order-preserving (i.e., monotone); since [0,1] is bounded, the condition of local boundedness is trivially satisfied and hence *L*(↑*C*,[0,1]) consists of all order-preserving functions from ↑*C* ⊂ C (*A*) to [0,1].
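
The equivalence between lower semicontinuity and monotonicity in the Alexandrov topology can be confirmed by brute force on a small poset. The following sketch assumes a four-element diamond-shaped poset standing in for ↑*C* and functions with values in {0, 1/2, 1}; both are illustrative choices of ours, as are the function names.

```python
from itertools import product

elements = ("a", "b", "c", "d")
# Diamond-shaped poset: a is the bottom, d the top, b and c are incomparable.
leq = {(x, x) for x in elements} | {
    ("a", "b"), ("a", "c"), ("a", "d"), ("b", "d"), ("c", "d")}

def is_upper(subset):
    """The Alexandrov-open sets are exactly the upper sets."""
    return all(y in subset
               for x in subset for y in elements if (x, y) in leq)

def is_monotone(f):
    """Order-preserving: x <= y implies f(x) <= f(y)."""
    return all(f[x] <= f[y] for (x, y) in leq)

def is_lower_semicontinuous(f, thresholds):
    """f is lower semicontinuous iff each superlevel set {f > t} is open."""
    return all(is_upper({x for x in elements if f[x] > t})
               for t in thresholds)

values = (0.0, 0.5, 1.0)
# These thresholds realize every superlevel set of a function with the
# values above, so testing them is equivalent to testing all real t.
thresholds = (-1.0, 0.0, 0.5, 1.0)

functions = [dict(zip(elements, vs))
             for vs in product(values, repeat=len(elements))]
agree = all(is_monotone(f) == is_lower_semicontinuous(f, thresholds)
            for f in functions)
monotone_count = sum(is_monotone(f) for f in functions)
```

The check runs over all 81 functions, most of which are not monotone, so the agreement of the two predicates is not vacuous.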

*Proof.* Any probability measure on Σ(*A*) is a natural transformation

$$
\underline{\mu}: \underline{\Sigma}(\underline{\mathsf{A}}) \to \underline{[0,1]}\_l,\tag{12.204}
$$

whose component at *C* ∈ C (*A*), according to (12.138) and (12.199), is a map

$$
\underline{\mu}\_{\mathcal{C}} \colon \mathcal{O}(\Sigma^{A}\_{\uparrow C}) \to L(\uparrow C, [0, 1]), \tag{12.205}
$$

satisfying properties dictated by Definition 12.25. In particular, if *C* is maximal abelian in *A*, then by the comment preceding the proof, μ*<sup>C</sup>* is simply a function O(Σ(*C*)) → [0,1] that satisfies (12.201) - (12.203) and hence is a (regular) probability measure μ*<sup>C</sup>* on Σ(*C*). Thus by Riesz–Markov one obtains a state ω*<sup>C</sup>* on each maximal abelian *C*. From the topology on $\Sigma^A$ and (12.137) we see that if *D* is not maximal, μ*<sup>D</sup>* is determined by μ*<sup>C</sup>* for any *C* ⊃ *D*, so that we also obtain a probability measure μ*<sup>D</sup>* on Σ(*D*), or, equivalently, a state ω*<sup>D</sup>*, by restriction of ω*<sup>C</sup>* to *D*. One might fear that μ*<sup>D</sup>* and ω*<sup>D</sup>* could depend on the chosen embedding *D* ⊂ *C*, but naturality of μ implies that if *D* ⊂ *C* as well as *D* ⊂ *C*′, where both *C* and *C*′ are maximal, then the ensuing measures μ*<sup>D</sup>* are the same. This implies the same property for the corresponding states ω*<sup>D</sup>*, which in turn shows that the collection of all μ*<sup>D</sup>* and μ*<sup>C</sup>* thus obtained organizes itself into a single quasi-state ω on *A*.

The converse follows by running this argument backwards. □

Combining (12.196) with Theorem 12.27, we obtain a state-proposition pairing that is no longer probabilistic, as in ordinary quantum mechanics, but defines a proposition in the internal language of T(*A*) and as such may or may not be true at each stage *C* ∈ C (*A*). The final ingredient for this is an arrow

$$
\underline{1}: \underline{\Sigma}(\underline{A}) \to \underline{[0,1]}\_l,\tag{12.206}
$$

defined by its components $\underline{1}\_C : \mathcal{O}(\Sigma^A\_{\uparrow C}) \to L(\uparrow C,[0,1])$ that map each open subset of $\Sigma^A\_{\uparrow C}$ to the constant function on $\uparrow C$ taking the value $1 \in [0,1]$. The internal language of T(*A*) (cf. §E.5) turns this into a formula $\underline{\mu}\_\omega = \underline{1}$ with the following interpretation:

$$[[\underline{\mu}\_\omega = \underline{1}]] : \underline{\Sigma}(\underline{A}) \to \underline{\Omega} . \tag{12.207}$$

We combine this with (12.196) so as to obtain an internal state-proposition pairing

$$[[\underline{\mu}\_{\omega}(\underline{a\in(p,q)}) = \underline{1}]]\_{\mathrm{QT}} : \underline{1} \to \underline{\Omega},\tag{12.208}$$

where we have abbreviated

$$[[\underline{\mu}\_{\omega}(\underline{a\in(p,q)}) = \underline{1}]]\_{\mathrm{QT}} = [[\underline{\mu}\_{\omega} = \underline{1}]] \circ [[\underline{a\in(p,q)}]]\_{\mathrm{QT}}.\tag{12.209}$$

The truth of the proposition (12.208) at stage *C* may be determined from Kripke–Joyal semantics; a straightforward computation for *A* = *B*(*H*) shows that

$$C \Vdash \underline{\mu}\_{\omega}(\underline{a \in (p, q)}) = \underline{1} \tag{12.210}$$

iff there exists a projection *e* ∈ P(*C*) with $e \le e^{(a)}\_{(p,q)}$ and ω(*e*) = 1. Assuming ω is a vector state $\omega(a) = \langle \psi, a\psi \rangle$ for some unit vector ψ ∈ *H*, this means that (12.210) holds iff $\psi \in eH \subseteq e^{(a)}\_{(p,q)} H$ for some *e* ∈ P(*C*), i.e., if the proposition *a* ∈ (*p*,*q*) has (Born) probability one in state ψ *and* there is a yes-no measurement in context *C* verifying this probability. In comparison, in classical mechanics a pure state *x* ∈ *X* makes *a* ∈ (*p*,*q*) true iff *a*(*x*) ∈ (*p*,*q*), where *a* ∈ *C*(*X*,R) as before.
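
The difference between having Born probability one and being true at a stage *C* can be seen in a three-dimensional example. The sketch below rests on assumptions of our own: *H* = C<sup>3</sup>, the context *C* consists of the diagonal matrices, the spectral projection of *a* in (*p*,*q*) is taken to be the matrix `E`, and states are given by real, not necessarily normalized vectors.

```python
from fractions import Fraction as F

# Stand-in for the spectral projection of a in (p, q): the projection
# onto span{e_0, (e_1 + e_2)/sqrt(2)} in C^3.
E = [[F(1), F(0), F(0)],
     [F(0), F(1, 2), F(1, 2)],
     [F(0), F(1, 2), F(1, 2)]]
n = len(E)

def born_probability(P, psi):
    """<psi, P psi> / <psi, psi> for a real vector psi."""
    P_psi = [sum(P[i][j] * psi[j] for j in range(n)) for i in range(n)]
    return sum(psi[i] * P_psi[i] for i in range(n)) / sum(x * x for x in psi)

def true_at_diagonal_context(P, psi):
    """Is there a diagonal projection f with f <= P and omega_psi(f) = 1?
    The largest diagonal projection below P is supported on the index set
    {i : P e_i = e_i}, so it suffices that psi vanishes outside that set."""
    inner = {i for i in range(n)
             if all(P[j][i] == (1 if j == i else 0) for j in range(n))}
    return all(psi[i] == 0 for i in range(n) if i not in inner)

psi_1 = [F(1), F(0), F(0)]  # the basis vector e_0
psi_2 = [F(0), F(1), F(1)]  # proportional to (e_1 + e_2)/sqrt(2)
```

Both vectors lie in the range of `E` and hence give Born probability one, but only the first is supported on a projection of the diagonal context, so in this sketch the condition below (12.210) holds at *C* for the first state and fails for the second.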

We close this chapter with a topos-theoretical (or, one might say, topological) reinterpretation of the Kochen–Specker Theorem, which to some extent explains why the previous construction had to use the fuzzy interval domain IR rather than the sharp reals R. To this end, we first generalize the notion of a quasi-linear noncontextual hidden variable (cf. Definitions 6.1 and 6.3) to any (unital) C\*-algebra:

Definition 12.28. *1. A* valuation *on a unital C\*-algebra A is a unital map*

$$V: A\_{\rm sa} \to \mathbb{R} \tag{12.211}$$

*that is dispersion-free (i.e. multiplicative) and linear on commuting operators.*

*2. A* point *in a frame* O(*X*) *in some topos* T *is defined as a frame homomorphism*

$$p: \mathcal{O}(X) \to \underline{\Omega},\tag{12.212}$$

*where* Ω *is the truth object in* T*.*

If *A* is commutative, the Gelfand spectrum Σ(*A*) consists of the valuations on *A*. The second part generalizes the notion of a point of a frame in set theory (cf. §C.11).

Theorem 12.29. *For any unital C\*-algebra A, there are canonical bijective correspondences between:*

1. *Valuations on A;*
2. *Points of the internal Gelfand spectrum* Σ(*A*) *in* T(*A*)*;*
3. *Continuous cross-sections* $\sigma : \mathcal{C}(A) \to \Sigma^A$ *of the bundle* $\pi : \Sigma^A \to \mathcal{C}(A)$*.*


*Proof.* We first give the external description of points of a locale *Y* in a sheaf topos Sh(*X*) (cf. §E.4). The subobject classifier in Sh(*X*) is the sheaf Ω : *U* ↦ O(*U*), in terms of which a point of *Y* is a frame map O(*Y*) → Ω. Externally, the pointfree space defined by the frame Ω is given by the identity map id*<sup>X</sup>* : *X* → *X*, so that a point of *Y* externally corresponds to a continuous cross-section σ : *X* → *Y* of the bundle π : *Y* → *X* (i.e., π ◦ σ = id*<sup>X</sup>*). In principle, π and σ are by definition frame maps in the opposite direction, but in the case at hand, namely *X* = C (*A*) and $Y = \Sigma^A$, the map $\sigma : \mathcal{C}(A) \to \Sigma^A$ may be interpreted as a continuous cross-section of the projection (12.134) in the usual sense. Being a cross-section simply means that σ(*C*) ∈ Σ(*C*). As to continuity, by definition of the Alexandrov topology, σ is continuous iff the following condition is satisfied:

For all $\mathcal{U} \in \mathcal{O}(\Sigma^A)$ and all $C \subseteq D$, if $\sigma(C) \in \mathcal{U}$, then $\sigma(D) \in \mathcal{U}$.

Hence, given the definition of $\mathcal{O}(\Sigma^A)$, the following condition is sufficient for continuity: if *C* ⊆ *D*, then $\sigma(D)\_{|C} = \sigma(C)$. However, this condition is also necessary. To explain this, let $\rho\_{DC} : \Sigma(D) \to \Sigma(C)$ again be the restriction map. This map is continuous and open. Suppose $\rho\_{DC}(\sigma(D)) \neq \sigma(C)$. Since Σ(*D*) is Hausdorff, there is an open neighbourhood $\mathcal{U}\_D$ of $\rho\_{DC}^{-1}(\sigma(C))$ not containing σ(*D*). Let $\mathcal{U}\_C = \rho\_{DC}(\mathcal{U}\_D)$ and take any $\mathcal{U} \in \mathcal{O}(\Sigma^A)$ such that $\mathcal{U} \cap \mathcal{O}(\Sigma(C)) = \mathcal{U}\_C$ and $\mathcal{U} \cap \mathcal{O}(\Sigma(D)) = \mathcal{U}\_D$. This is possible, since $\mathcal{U}\_C$ and $\mathcal{U}\_D$ satisfy both conditions in the definition of $\mathcal{O}(\Sigma^A)$. By construction, $\sigma(C) \in \mathcal{U}$ but $\sigma(D) \notin \mathcal{U}$, so that σ is not continuous. Hence σ is a continuous cross-section of π iff

$$
\sigma(D)\_{|C} = \sigma(C) \text{ for all } C \subseteq D. \tag{12.213}
$$

Now define a map *V* : *A*sa → C by $V(a) = \sigma(C^\*(a))(a)$, where $C^\*(a)$ is the commutative unital C\*-algebra generated by *a*. If $b^\* = b$ and [*a*,*b*] = 0, then *V*(*a*+*b*) = *V*(*a*)+*V*(*b*) by (12.213), applied to $C^\*(a) \subset C^\*(a,b)$ as well as to $C^\*(b) \subset C^\*(a,b)$. Furthermore, since σ(*C*) ∈ Σ(*C*), the map *V* is dispersion-free.

Conversely, a valuation *V* defines a cross-section σ by complex linear extension of σ(*C*)(*a*) = *V*(*a*), where *a* ∈ *C*sa. By the criterion (12.213) this cross-section is continuous, since the value *V*(*a*) is independent of the choice of *C* containing *a*. □

Corollary 12.30. *The bundle* $\pi : \Sigma^A \to \mathcal{C}(A)$ *(cf. Corollary 12.22) admits no continuous cross-sections as soon as A has no valuations (e.g. if A* = *Mn*(C)*, n* > 2*).*
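
The absence of valuations can be verified by exhaustive search in a standard example that is not discussed in the text, namely the Peres–Mermin square of nine operators on C<sup>4</sup>; we bring it in here purely as an illustration for *n* = 4. The sketch assumes that a valuation assigns ±1 to each of these operators (their squares equal the identity) and is multiplicative on commuting operators, so that its values must reproduce the sign of the product along each row and column.

```python
from itertools import product

def kron(a, b):
    """Kronecker product of square matrices given as nested lists."""
    m = len(b)
    size = len(a) * m
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(size)]
            for i in range(size)]

def matmul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size))
             for j in range(size)] for i in range(size)]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

# Peres-Mermin square: self-adjoint operators on C^4 with spectrum {+1, -1}.
square = [
    [kron(X, I2), kron(I2, X), kron(X, X)],
    [kron(I2, Z), kron(Z, I2), kron(Z, Z)],
    [kron(X, Z), kron(Z, X), kron(Y, Y)],
]

def context_sign(ops):
    """For three mutually commuting operators whose product is +1 or -1
    times the identity, return that sign; both properties are checked."""
    a, b, c = ops
    assert matmul(a, b) == matmul(b, a)
    assert matmul(a, c) == matmul(c, a)
    assert matmul(b, c) == matmul(c, b)
    p = matmul(matmul(a, b), c)
    sign = round(complex(p[0][0]).real)
    assert all(p[i][j] == (sign if i == j else 0)
               for i in range(4) for j in range(4))
    return sign

row_signs = [context_sign(row) for row in square]
col_signs = [context_sign([square[i][j] for i in range(3)]) for j in range(3)]

def count_valuations():
    """Number of assignments of +1/-1 to the nine operators that respect
    the product relation in every row and every column."""
    count = 0
    for values in product((1, -1), repeat=9):
        v = [values[0:3], values[3:6], values[6:9]]
        rows_ok = all(v[i][0] * v[i][1] * v[i][2] == row_signs[i]
                      for i in range(3))
        cols_ok = all(v[0][j] * v[1][j] * v[2][j] == col_signs[j]
                      for j in range(3))
        count += rows_ok and cols_ok
    return count
```

The search finds no admissible assignment, because the row relations force the product of all nine values to be +1 while the column relations force it to be −1. Each row and column generates a commutative subalgebra, so the six constraints are exactly compatibility conditions of the form (12.213).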

The contrast between the pointlessness of the internal spectrum Σ and the spatiality of the external spectrum $\Sigma^A$ is striking, but easily explained: a point of $\Sigma^A$ (in the usual sense, but also in the frame-theoretic sense if $\Sigma^A$ is sober) necessarily lies in some $\Sigma(C) \subset \Sigma^A$, and hence is defined (and dispersion-free) only in the context *C*. For example, for *A* = *Mn*(C), a point *V* ∈ Σ(*C*) corresponds to a map

$$V^\* : \mathcal{O}(\Sigma^A) \to \{0, 1\}, \ S \mapsto V(S(\mathcal{C})), \tag{12.214}$$

where O(Σ*A*) is given by (12.95). Thus *V*<sup>∗</sup> is only sensitive to the value of *S* at *C*.

#### Notes

Previous advocates of intuitionistic logic for quantum mechanics include Popper (1968) and Coecke (2002). The earliest use of topos theory in quantum mechanics was probably by Adelman & Corbett (1995), but the founding papers of the topos approach to quantum mechanics as further developed in this chapter are Isham & Butterfield (1998), Butterfield & Isham (1999, 2002), and Hamilton, Isham & Butterfield (2000). This series of papers was predated by Isham (1997) and was followed by Döring & Isham (2008abcd, 2010); see also Flori (2013) for an introduction. Wolters (2013ab) gives a detailed comparison between the "contravariant" Butterfield–Döring–Isham approach and the "covariant" approach in this chapter.

The original motivation behind our approach to "quantum toposophy" was the *Principle of General Tovariance* (Heunen, Landsman, & Spitters, 2008), which was a pun on Einstein's *Principle of General Covariance* underlying General Relativity (Norton, 1993, 1995). Einstein based his theory of gravity and space-time on the mathematical postulate that all equations of physics be invariant under arbitrary coordinate transformations, and similarly we proposed that all physical theories should be invariant under so-called *geometric morphisms* between toposes and hence should be formulated in terms of what (confusingly) is called *geometric logic* (cf. Mac Lane & Moerdijk, 1992; Johnstone, 2002). Since in fact some of our constructions turned out to be non-geometric in this sense, we subsequently dropped this principle and stopped even referring to the above paper. However, as Raynaud (2014) and, more generally, Henry (2015) show, our theory can actually be made geometric (in the topos-theoretical sense) provided one puts the entire theory of (internal) C\*-algebras on a localic (i.e., pointfree) basis, as in Henry (2014ab). Other recent developments of the program (which are not discussed here) may be found in e.g. van den Berg & Heunen (2012, 2014), Spitters, Vickers, & Wolters (2014), Heunen (2014ab), and Heunen & Lindenhovius (2015).

#### §12.1. C\*-algebras in a topos

C\*-algebras in a topos, including a constructive version of Gelfand duality for commutative unital C\*-algebras that is valid in arbitrary Grothendieck toposes, were first studied by Banaschewski & Mulvey (2000ab, 2006). The topos T(*A*) and the internal commutative C\*-algebra *A* were introduced by Heunen, Landsman, & Spitters (2009). All these papers rely crucially on the theory of internal locales in toposes, which owes much to Johnstone (1982) and Joyal & Tierney (1984). See also Johnstone (1983) and Vickers (2007). It is possible to realize T(*A*) as the topos of sheaves on the locale Idl(C (*A*)), which is the ideal completion of the "mere" poset C (*A*), but we will not use this description (Raynaud, 2014).

#### §12.2. The Gelfand spectrum in constructive mathematics

This section is based on Coquand (2005) and Coquand & Spitters (2005, 2009), where also the missing details may be found. All necessary background on lattice theory is provided by Johnstone (1982), except the ingredients for the proof that the constructive Gelfand spectrum is compact and regular, which is due to Cederquist & Coquand (2000). Proposition 12.10 may be found in Aczel (2006).

#### §12.3. Internal Gelfand spectrum and intuitionistic quantum logic

This section is based on Caspers, Heunen, Landsman, & Spitters (2009), except for the final part on Kripke semantics, which is taken from Heunen, Landsman, & Spitters (2012). An interesting philosophical analysis of the intuitionistic logic emerging from this program may be found in Hermens (2016), to whom the interpretation of elements of the frame $\mathcal{O}(\Sigma^A)$ as disjunctions is due.

#### §12.4. Internal Gelfand spectrum for arbitrary C\*-algebras

This section is based on Caspers (2008), Caspers, Heunen, Landsman, & Spitters (2009), and Heunen, Landsman, & Spitters (2009). Complete proofs of Lemma 12.15 and Lemma 12.16 may be found in Caspers (2008), §5.2. For different proofs of these lemmas see Heunen, Landsman, & Spitters (2009) and Coquand (2005), respectively. A proof of Lemma 12.21 may be found in Wolters (2013b), Theorem 2.17, also available as http://arxiv.org/pdf/1010.2031v2.pdf.

#### §12.5. "Daseinisation" and Kochen–Specker Theorem

The spectral order was introduced by Olson (1971) and was rediscovered by De Groote (2011). For a devastating critique of Heidegger's philosophy see Philipse (1999). The first construction of a "Daseinisation" map was given by Doring & ¨ Isham (2008b). The version presented here is an improvement, due to Wolters (2013ab), of a previous adaptation of the Doring–Isham appraoch to the topos ¨ T(*A*) in Heunen, Landsman, & Spitters (2009). Similarly, Theorem 12.29, first published in Heunen, Landsman, Spitters, & Wolters (2012), is an improvement due to Wolters (2013a) of an earlier result in this direction in Heunen, Landsman, & Spitters (2009).

The work of Isham & Butterfield (1998), which, as already mentioned, started the entire quantum toposophy program, was actually motivated by an topos-theoretica reformulation of the Kochen–Specker Theorem. Isham and Butterfield started from the following observation. Let C (*B*(*H*)) be the poset of commutative *von Neumann* subalgebras of *B*(*H*), partially ordered by set-theoretic inclusion, seen as a category in the usual way. Consider the presheaf topos [C (*H*)op,Set] of *contravariant* functors *F* : C (*H*) → Set, where Set is the category of sets. The *spectral presheaf* is the contravariant functor Σ defined on objects by Σ0(*C*) = Σ(*C*), and by the natural map on arrows, that is, Σ1(*C* ⊂ *D*) maps ω ∈ Σ(*D*) (which is a map *D* → C) to its restriction to *<sup>C</sup>*, i.e., to <sup>ω</sup>|*<sup>C</sup>* <sup>∈</sup> <sup>Σ</sup>(*C*). A *point* of some object *<sup>F</sup>* in [<sup>C</sup> (*B*(*H*))op,Set] is defined as a natural transformation 1 → *F*, where 1 is the terminal object, i.e., the presheaf that maps everything into the singleton set ∗.

The Kochen–Specker Theorem à la Butterfield & Isham, then, states that if dim(*H*) > 2 as usual, *the spectral presheaf has no points*.

## Appendix A Finite-dimensional Hilbert spaces

Although we assume the reader to be familiar with linear algebra, some of the points below may not be emphasized at that level and hence need to be recalled.

Unless explicitly stated otherwise, all vector spaces (and hence also all algebras) are defined over the *complex numbers* C. Moreover, from §A.2 until the end of this appendix, *V* will be *finite-dimensional*; the infinite-dimensional case will be treated in the next appendix on functional analysis and general Hilbert spaces.

#### A.1 Basic definitions

Definition A.1. *Let V be a vector space (not necessarily finite-dimensional). A* norm *on V is a function* ‖·‖ : *V* → R<sup>+</sup> *such that:*

	- *a.* ‖*v*+*w*‖ ≤ ‖*v*‖+‖*w*‖ *(*triangle inequality*);*
	- *b.* ‖λ*v*‖ = |λ|‖*v*‖ *(*homogeneity*);*
	- *c.* ‖*v*‖ = 0 *iff v* = 0 *(*positive definiteness*).*

Many analytical arguments in functional analysis are based on the fundamental *Cauchy–Schwarz inequality*, which is satisfied by any (pre-) inner product:

$$\left| \langle v, w \rangle \right|^2 \leq \langle v, v \rangle \langle w, w \rangle. \tag{A.1}$$

Proposition A.2. *An inner product on V defines a norm on V by means of*

$$\|v\| = \sqrt{\langle v, v \rangle}. \tag{A.2}$$

*The Cauchy–Schwarz inequality* (A.1) *then reads*

$$|\langle v, w \rangle| \le \|v\| \, \|w\|,\tag{A.3}$$

*with equality iff v and w are linearly dependent.*

The question arises when a norm comes from an inner product via (A.2).

Theorem A.3. *A norm* ‖·‖ *comes from an inner product through* (A.2) *iff*

$$\|v + w\|^2 + \|v - w\|^2 = 2(\|v\|^2 + \|w\|^2). \tag{A.4}$$

*In that case, one has the* polarization identity

$$
\langle v, w \rangle = \frac{1}{4} \left( \|v + w\|^2 - \|v - w\|^2 + i \|v - i w\|^2 - i \|v + i w\|^2 \right). \tag{A.5}
$$

*Proof.* Given a norm satisfying (A.4), define ⟨*v*,*w*⟩ by (A.5). Easy computations show that (A.2) holds, that ⟨*w*,*v*⟩ is the complex conjugate of ⟨*v*,*w*⟩, and, with a bit more effort, that ⟨*v*,*w*<sub>1</sub>+*w*<sub>2</sub>⟩ = ⟨*v*,*w*<sub>1</sub>⟩+⟨*v*,*w*<sub>2</sub>⟩. Now suppose we know that

$$
\langle w, s v \rangle = s \langle w, v \rangle \tag{A.6}
$$

for certain *s* ∈ R. Then this property clearly also holds for *s*<sup>−1</sup> instead of *s*. Furthermore, having (A.6) for *s* as well as *t* ∈ R implies the same property also for *s*+*t* and *st*. Starting with *s* = *t* = 1, this generates (A.6) for each *s* ∈ Q. Now if *s*<sub>*n*</sub> → *s* for *s*<sub>*n*</sub> ∈ Q and *s* ∈ R, then by continuity and homogeneity of the norm, ⟨*w*,*s*<sub>*n*</sub>*v*⟩ → ⟨*w*,*sv*⟩. Consequently, (A.6) holds for each *s* ∈ R. Finally, from (A.5) we also find ⟨*w*,*iv*⟩ = *i*⟨*w*,*v*⟩, and hence (A.6) holds for each *s* ∈ C. □
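The identities (A.1), (A.4) and (A.5) are easy to check numerically. The following is a minimal sketch in plain Python for the standard inner product on C<sup>3</sup>, taken conjugate-linear in the first entry; the helper names (`inner`, `norm`, `add`, `polarization`) and the sample vectors are illustrative choices, not notation from the text.

```python
import math

def inner(v, w):
    """Standard inner product on C^n, conjugate-linear in the first slot."""
    return sum(x.conjugate() * y for x, y in zip(v, w))

def norm(v):
    """Norm induced by the inner product, as in (A.2)."""
    return math.sqrt(inner(v, v).real)

def add(v, w, c=1):
    """Return v + c*w componentwise."""
    return [x + c * y for x, y in zip(v, w)]

def polarization(v, w):
    """Right-hand side of the polarization identity (A.5)."""
    return 0.25 * (norm(add(v, w)) ** 2 - norm(add(v, w, -1)) ** 2
                   + 1j * norm(add(v, w, -1j)) ** 2
                   - 1j * norm(add(v, w, 1j)) ** 2)

# Arbitrary sample vectors in C^3 (hypothetical data).
v = [1 + 2j, -1j, 0.5 + 0j]
w = [2 - 1j, 3 + 0j, 1j]

# Parallelogram law (A.4): both sides should agree up to rounding.
parallelogram_lhs = norm(add(v, w)) ** 2 + norm(add(v, w, -1)) ** 2
parallelogram_rhs = 2 * (norm(v) ** 2 + norm(w) ** 2)

# Polarization (A.5) should reproduce the inner product itself.
recovered = polarization(v, w)
```

Comparing `recovered` with `inner(v, w)` also confirms the sign convention in (A.5): with the opposite convention (linearity in the first entry) the two imaginary terms would change sign.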

There is an analogous result for continuous hermitian forms, with practically the same proof (where continuity is once again needed to pass from Q to R). Let *V* be a vector space with inner product, and let *B* : *V* ×*V* → C be a hermitian form. The associated *quadratic form Q* : *V* → R, defined by

$$Q(v) = B(v, v), \tag{A.7}$$

then satisfies

$$Q(zv) = |z|^2 Q(v) \text{ (}z \in \mathbb{C}\text{)};\tag{A.8}$$

$$Q(v+w) + Q(v-w) = 2(Q(v) + Q(w)). \tag{A.9}$$

Proposition A.4. *Let V be a vector space with inner product. A map Q* : *V* → R *that is continuous in the associated norm* (A.2) *is derived from a hermitian form B* : *V* ×*V* → C *through* (A.7) *iff Q satisfies* (A.8) *-* (A.9)*, in which case*

$$B(v, w) = \frac{1}{4}(Q(v + w) - Q(v - w) + iQ(v - iw) - iQ(v + iw)).\tag{A.10}$$

#### A.2 Functionals and the adjoint

*In the remainder of this appendix, V is a finite-dimensional complex vector space with inner product*. Since this is automatically a (finite-dimensional) *Hilbert space* (as defined in the next appendix), we rename it as *H*. The archetypal example is *H* = C<sup>*n*</sup>, with elements *z* = (*z*<sub>1</sub>,...,*z*<sub>*n*</sub>), *z*<sub>*i*</sub> ∈ C, and standard inner product

$$
\langle z, w \rangle = \sum\_{i=1}^{n} \overline{z}\_i w\_i. \tag{A.11}
$$

In that case, we hardly make a difference between a linear map *a* : *H* → *H* and the corresponding matrix (*a*<sub>*ij*</sub>), where (*az*)<sub>*i*</sub> = ∑<sub>*j*</sub> *a*<sub>*ij*</sub>*z*<sub>*j*</sub>, or, equivalently,

$$a\_{ij} = \langle \upsilon\_i, a\upsilon\_j \rangle,\tag{A.12}$$

where (υ<sub>1</sub> = (1,0,...,0),...,υ<sub>*n*</sub> = (0,...,0,1)) is the standard basis of C<sup>*n*</sup>. More generally, we will only consider *orthonormal bases* of Hilbert spaces *H*, i.e., bases (υ<sub>*i*</sub>) for which ⟨υ<sub>*i*</sub>,υ<sub>*j*</sub>⟩ = δ<sub>*ij*</sub>. In fact, in the present (finite-dimensional) case, any orthonormal set of *n* = dim(*H*) vectors is automatically a basis. *Throughout this book, the word "basis" will be synonymous with* orthonormal *basis.*

Let *H*<sup>∗</sup> be the vector space of linear maps *f* : *H* → C, also called (linear) *functionals* (on *H*). Since the inner product is positive definite, it is also non-degenerate:

Proposition A.5. *The map* ψ ↦ *f*<sub>ψ</sub>*, where*

$$f\_{\psi}(\varphi) = \langle \psi, \varphi \rangle,\tag{A.13}$$

*is an anti-linear isomorphism H* → *H*<sup>∗</sup> *(i.e., one has* λψ ↦ λ̄ *f*<sub>ψ</sub> *for any* λ ∈ C*).*

*Proof.* Injectivity is obvious. For surjectivity, note that coker(*f*) (i.e., the orthogonal complement of the kernel ker(*f*) of *f*) is one-dimensional (assuming *f* is nonzero), and take a unit vector ψ̃ ∈ coker(*f*). Then ψ = *c*ψ̃, with *c* the complex conjugate of *f*(ψ̃), does the job: by linearity of *f*, we have *f*(ϕ)ψ̃ − *f*(ψ̃)ϕ ∈ ker(*f*) for any ϕ ∈ *H* (and even any ψ̃ ∈ *H*), so that ⟨ψ̃, *f*(ϕ)ψ̃ − *f*(ψ̃)ϕ⟩ = 0. Since ⟨ψ̃,ψ̃⟩ = ‖ψ̃‖<sup>2</sup> = 1, this yields *f*(ϕ) = *f*(ψ̃)⟨ψ̃,ϕ⟩ = ⟨ψ,ϕ⟩, i.e., *f* = *f*<sub>ψ</sub>. □

A linear map *a* : *H* → *H* is also called an *operator*; we denote the algebra of all operators on *H* by *B*(*H*). For example, we have *B*(C<sup>*n*</sup>) ≅ *M*<sub>*n*</sub>(C). Two arbitrary vectors ψ,ϕ ∈ *H* define an operator |ψ⟩⟨ϕ| through Dirac's "bra-ket" notation

$$|\psi\rangle\langle\varphi|\chi = \langle\varphi,\chi\rangle\psi.\tag{A.14}$$

The *adjoint a*∗ of an operator *a* is defined by the property

$$
\langle a^\* \Psi, \Phi \rangle = \langle \Psi, a\Phi \rangle,\ (\Psi, \Phi \in H). \tag{A.15}
$$

Indeed, for given χ (and *a*), define a functional *f*<sub>*a*,χ</sub> : *H* → C by *f*<sub>*a*,χ</sub>(ϕ) = ⟨χ,*a*ϕ⟩. Then, as we just saw, *f*<sub>*a*,χ</sub> = *f*<sub>ψ</sub> for some unique ψ ∈ *H*; define *a*<sup>∗</sup> by *a*<sup>∗</sup>χ = ψ. This map is linear by construction.

Clearly, one has

$$a^{\*\*} = a.\tag{A.16}$$

For *H* = C<sup>*n*</sup>, the matrix corresponding to the adjoint *a*<sup>∗</sup> is given by the well-known formula (*a*<sup>∗</sup>)<sub>*ij*</sub> = *ā*<sub>*ji*</sub> (the complex conjugate of *a*<sub>*ji*</sub>). A more abstract example of an adjoint is given by

$$|\Psi\rangle\langle\Phi|^\* = |\Phi\rangle\langle\Psi|.\tag{A.17}$$
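For *H* = C<sup>2</sup> the defining property (A.15), the involution (A.16) and the rule (A.17) can be checked directly on matrices. The sketch below uses plain Python lists; the function names and the sample data are hypothetical and serve only as an illustration.

```python
def inner(v, w):
    """Standard inner product on C^n, conjugate-linear in the first slot."""
    return sum(x.conjugate() * y for x, y in zip(v, w))

def apply(a, v):
    """Matrix-vector product (a v)_i = sum_j a_ij v_j."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def adjoint(a):
    """Conjugate transpose of a square matrix: (a*)_ij = conj(a_ji)."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def ketbra(psi, phi):
    """Matrix of |psi><phi|, with entries psi_i * conj(phi_j), cf. (A.14)."""
    return [[p * q.conjugate() for q in phi] for p in psi]

# Sample operator and vectors (hypothetical data).
a = [[1 + 1j, 2 + 0j], [0j, 3 - 2j]]
psi = [1 + 0j, 1j]
phi = [2 - 1j, 1 + 0j]
chi = [0.5j, -1 + 0j]

# Defining property (A.15): <a* psi, phi> = <psi, a phi>.
lhs = inner(apply(adjoint(a), psi), phi)
rhs = inner(psi, apply(a, phi))

# (A.14): |psi><phi| chi = <phi, chi> psi.
via_matrix = apply(ketbra(psi, phi), chi)
via_formula = [inner(phi, chi) * p for p in psi]
```

In this representation (A.17) amounts to the statement that `adjoint(ketbra(psi, phi))` and `ketbra(phi, psi)` have the same entries.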

The *(operator) norm* of *a* : *H* → *H* is defined by

$$\|a\| = \sup\{\|a\psi\|, \psi \in H\_1\},\tag{A.18}$$

where the *unit sphere H*<sub>1</sub> ⊂ *H* is defined by

$$H\_1 = \{ \psi \in H, \|\psi\| = 1 \}. \tag{A.19}$$

Proposition A.6. *One has* ‖*a*‖ < ∞ *for any linear map a* : *H* → *H.*

*Proof.* Recall that dim(*H*) = *n* < ∞! Map *H* to C<sup>*n*</sup> by the choice of some basis (υ<sub>*i*</sub>). Thus ψ ∈ *H* is mapped to ψ̃ = (ψ<sub>1</sub>,...,ψ<sub>*n*</sub>) ∈ C<sup>*n*</sup>, with ψ<sub>*i*</sub> = ⟨υ<sub>*i*</sub>,ψ⟩, and we have ‖ψ̃‖<sub>2</sub> = ‖ψ‖, where ‖*z*‖<sub>2</sub><sup>2</sup> = ∑<sub>*i*</sub> |*z*<sub>*i*</sub>|<sup>2</sup> is the usual norm on C<sup>*n*</sup>, which is given by (A.2) with (A.11). This also transfers the operator *a* : *H* → *H* to a linear map *ã* : C<sup>*n*</sup> → C<sup>*n*</sup> defined by the matrix (A.12). Then ‖*a*‖ = ‖*ã*‖ = sup{‖*ãz*‖<sub>2</sub>, *z* ∈ C<sup>*n*</sup><sub>1</sub>}, where C<sup>*n*</sup><sub>1</sub> = {*z* ∈ C<sup>*n*</sup>, ‖*z*‖<sub>2</sub> = 1}. Now *ã* is continuous because it is linear, and hence it maps C<sup>*n*</sup><sub>1</sub> (which is compact by Heine–Borel) to some compact set *ã*(C<sup>*n*</sup><sub>1</sub>) in C<sup>*n*</sup>. It is easy to see that the norm ‖·‖<sub>2</sub> : C<sup>*n*</sup> → R<sup>+</sup> is continuous, and according to Weierstrass the norm therefore assumes a finite maximum (as well as a minimum) on any compact set *K*. Taking *K* = *ã*(C<sup>*n*</sup><sub>1</sub>) proves the claim. □

Proposition A.7. *Let a*,*b* : *H* → *H be linear maps, and let* ψ ∈ *H. Then:*

$$\|a\Psi\| \le \|a\| \|\Psi\|;\tag{A.20}$$

$$||ab|| \le ||a|| ||b||;\tag{A.21}$$

$$\|a^\*\| = \|a\|;\tag{A.22}$$

$$\|a^\*a\| = \|a\|^2. \tag{A.23}$$

*Proof.* The first two inequalities are immediate from (A.18). Next, if ‖ψ‖ = 1, by (A.3), (A.15), and (A.20) we have

$$\|a^\*\psi\|^2 = \langle a^\*\psi, a^\*\psi\rangle = \langle \psi, aa^\*\psi\rangle \le \|\psi\| \|aa^\*\psi\| \le \|a\| \|a^\*\psi\|,\tag{A.24}$$

so ‖*a*<sup>∗</sup>ψ‖ ≤ ‖*a*‖, and hence from (A.18), ‖*a*<sup>∗</sup>‖ ≤ ‖*a*‖. But (A.16) gives the opposite inequality, whence (A.22). Finally, (A.21) and (A.22) yield ‖*a*<sup>∗</sup>*a*‖ ≤ ‖*a*<sup>∗</sup>‖‖*a*‖ = ‖*a*‖<sup>2</sup>. From (A.3) and (A.20), on the other hand, we obtain

$$\|a\psi\|^2 = \langle a\psi, a\psi \rangle = \langle \psi, a^\* a \psi \rangle \le \|a^\* a\|,\tag{A.25}$$

so ‖*a*‖<sup>2</sup> ≤ ‖*a*<sup>∗</sup>*a*‖ by (A.18), and hence (A.23) is proved. □

#### A.3 Projections

The most important examples (and also, as we will see shortly, building blocks) of self-adjoint operators are *projections e* : *H* → *H*, defined by the property

$$e^2 = e^\* = e.\tag{A.26}$$

Proposition A.8. *There is a bijective correspondence e* ↔ *L between:*

	- *projections e on H;*
	- *linear subspaces L of H,*

*given by*

$$L = eH;\tag{A.27}$$

$$e = \sum\_{i} |\upsilon\_{i}\rangle\langle\upsilon\_{i}|,\tag{A.28}$$

*where eH* = {*e*ψ,ψ ∈ *H*} *is the image of e, and* (υ<sub>*i*</sub>) *is an arbitrary basis of L.*

The proof is routine, including the fact that (A.28) is independent of the basis. Whenever convenient, we write (A.28) as *e*<sub>*L*</sub>. For example, the "sub"space *L* = *H* corresponds to *e*<sub>*H*</sub> = 1<sub>*H*</sub>, whereas *L* = {0} corresponds to *e*<sub>{0}</sub> = 0.

Define the *orthogonal complement L*<sup>⊥</sup> of *any* subset *L* ⊂ *H* by

$$L^\perp = \{ \Psi \in H \mid \langle \Psi, \Phi \rangle = 0 \,\forall \, \Phi \in L \}. \tag{A.29}$$

In particular, if *L* is a *linear* subspace of *H*, one easily checks that

$$e\_{L^{\perp}} = 1 - e\_{L}. \tag{A.30}$$

Corollary A.9. *For each linear subspace L* ⊂ *H one has*

$$H = L \oplus L^{\perp},\tag{A.31}$$

*in the sense that L*∩*L*<sup>⊥</sup> = {0}*, and each vector* ψ ∈ *H has a* unique *decomposition*

$$
\Psi = \Psi^{\parallel} + \Psi^{\perp},
\tag{A.32}
$$

*where* ψ<sup>∥</sup> ∈ *L and* ψ<sup>⊥</sup> ∈ *L*<sup>⊥</sup>*.*

*Proof.* Existence of the decomposition is given by

$$
\Psi^\parallel = e\_L \Psi;\tag{A.33}
$$

$$
\Psi^{\perp} = (1 - e\_L)\Psi. \tag{A.34}
$$

Uniqueness follows by assuming ψ = χ<sup>∥</sup> + χ<sup>⊥</sup> with χ<sup>∥</sup> ∈ *L* and χ<sup>⊥</sup> ∈ *L*<sup>⊥</sup>: one then has ψ<sup>∥</sup> − χ<sup>∥</sup> = χ<sup>⊥</sup> − ψ<sup>⊥</sup>, but since the left-hand side is in *L* and the right-hand side is in *L*<sup>⊥</sup>, both sides lie in *L*∩*L*<sup>⊥</sup> = {0} and hence vanish. □
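Proposition A.8 and Corollary A.9 can be illustrated concretely. The sketch below builds the projection (A.28) onto a two-dimensional subspace *L* ⊂ C<sup>3</sup> from an orthonormal basis of *L* and then splits a vector as in (A.32) - (A.34); the subspace, the vector and all function names are assumptions made for the sake of the example.

```python
import math

def inner(v, w):
    """Standard inner product on C^n, conjugate-linear in the first slot."""
    return sum(x.conjugate() * y for x, y in zip(v, w))

def apply(a, v):
    """Matrix-vector product."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in a]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def projection(basis):
    """e = sum_i |u_i><u_i| for an orthonormal basis (u_i) of L, as in (A.28)."""
    n = len(basis[0])
    return [[sum(u[i] * u[j].conjugate() for u in basis) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to rounding errors."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Orthonormal basis of a two-dimensional subspace L of C^3 (hypothetical choice).
s = 1 / math.sqrt(2)
u1 = [s + 0j, s * 1j, 0j]
u2 = [0j, 0j, 1 + 0j]
e = projection([u1, u2])

# Decomposition psi = psi_par + psi_perp, cf. (A.33) and (A.34).
psi = [1 + 0j, 2 + 0j, 3j]
psi_par = apply(e, psi)
psi_perp = [x - y for x, y in zip(psi, psi_par)]
```

Up to rounding, `e` satisfies (A.26), and `psi_perp` is orthogonal to both `u1` and `u2`, i.e., it lies in *L*<sup>⊥</sup>.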

#### A.4 Spectral theory

An *eigenvector* of an operator *a* is a nonzero element ψ ∈ *H* such that

$$
a\Psi = \lambda\Psi\tag{A.35}$$

for some λ ∈ C, called an *eigenvalue* of *a*. We also define the *eigenspace H*<sub>λ</sub> by

$$H\_{\lambda} = \{ \psi \in H \mid a\psi = \lambda\psi \},\tag{A.36}$$

with associated projection *e*<sub>λ</sub> (in that *H*<sub>λ</sub> = *e*<sub>λ</sub>*H*, cf. Proposition A.8). In case that dim(*H*<sub>λ</sub>) = 1 the eigenvalue λ is called *non-degenerate* (or *simple*). Otherwise it is said to be *degenerate*, with *multiplicity m*<sub>λ</sub> = dim(*H*<sub>λ</sub>). In linear algebra, the set of all eigenvalues of *a* is called the *spectrum* of *a*, denoted by σ(*a*) (for infinite-dimensional *H*, this turns out to be the wrong definition of the spectrum, see §B.14).

We now give two formulations of the *spectral theorem for self-adjoint operators*.

Theorem A.10. *Let a be a self-adjoint operator on H. Then* σ(*a*) ⊂ R*, eigenspaces for different eigenvalues* λ ≠ μ *are orthogonal (i.e., e*<sub>λ</sub>*e*<sub>μ</sub> = δ<sub>λμ</sub>*e*<sub>λ</sub>*), and*

$$a = \sum\_{\lambda \in \sigma(a)} \lambda \cdot e\_{\lambda};\tag{A.37}$$

$$1\_H = \sum\_{\lambda \in \sigma(a)} e\_{\lambda}.\tag{A.38}$$

*Equivalently, we may reformulate the above* spectral resolution *of a in terms of the existence of a basis* (υ<sub>*i*</sub>) *of H consisting of eigenvectors of a. In that case, we have*

$$a = \sum\_{i=1}^{\dim(H)} \lambda\_i |\upsilon\_i\rangle\langle\upsilon\_i|;\tag{A.39}$$

$$1\_H = \sum\_{i=1}^{\dim(H)} |\upsilon\_i\rangle\langle\upsilon\_i|,\tag{A.40}$$

*where* λ<sub>*i*</sub> *is the eigenvalue corresponding to the eigenvector* υ<sub>*i*</sub> *(i.e., a*υ<sub>*i*</sub> = λ<sub>*i*</sub>υ<sub>*i*</sub>*).*

Note that the eigenvalues λ occurring in (A.37) are all different, whereas the λ<sub>*i*</sub> in (A.39) need not be: the number of times an eigenvalue λ<sub>*i*</sub> ∈ σ(*a*) occurs is given by its multiplicity. This also implies that the spectral resolution (A.37) - (A.38) is canonical (i.e. free of any choices), whereas (A.39) - (A.40) depends on arbitrary choices of bases in all subspaces *H*<sub>λ</sub> with dimension greater than one. Nonetheless, it is easier to prove (A.39) - (A.40), which obviously imply (A.37) - (A.38): just collect all λ<sub>*i*</sub> that are equal to λ and realize that, as in (A.28), one has

$$e\_{\lambda} = \sum\_{i \mid \lambda\_i = \lambda} |\upsilon\_{i}\rangle\langle\upsilon\_{i}|.\tag{A.41}$$

More generally, for some (at the moment) arbitrary (but later: measurable) subset Δ ⊂ R it turns out to be convenient to introduce the *spectral projection e*<sub>Δ</sub> on *H* and the associated *spectral subspace H*<sub>Δ</sub> ⊆ *H*: if Δ∩σ(*a*) = ∅ we put *e*<sub>Δ</sub> = 0 and *H*<sub>Δ</sub> = {0}, and otherwise,

$$e\_{\Delta} = \sum\_{\lambda \in \Delta \cap \sigma(a)} e\_{\lambda};\tag{A.42}$$

$$H\_{\Delta} = e\_{\Delta} H.\tag{A.43}$$

We now prepare for the proof of Theorem A.10. First, note from (A.15) that

$$2i \cdot \text{Im}(\langle \psi, a\psi \rangle) = \langle \psi, a\psi \rangle - \overline{\langle \psi, a\psi \rangle} = \langle \psi, a\psi \rangle - \langle \psi, a^\*\psi \rangle. \tag{A.44}$$

If *a*<sup>∗</sup> = *a*, from (A.35) and (A.44) one obtains Im(λ) = 0 and hence σ(*a*) ⊂ R.

Lemma A.11. *A self-adjoint operator a has an eigenvalue* λ *for which* |λ| = ‖*a*‖*.*

*Proof.* As in the previous proof, the norm ‖·‖ assumes a maximum on the compact set *aH*<sub>1</sub>, where *H*<sub>1</sub> = {ψ ∈ *H*, ‖ψ‖ = 1}. Suppose this happens at *a*ψ<sub>1</sub>, where by construction ‖ψ<sub>1</sub>‖ = 1. By definition of the norm, this maximum must be ‖*a*‖, so that ‖*a*‖ = ‖*a*ψ<sub>1</sub>‖. Hence, using *a*<sup>∗</sup> = *a*, (A.3), and (A.23), we may estimate

$$\|a\|^2 = \|a\psi\_1\|^2 = \langle a\psi\_1, a\psi\_1\rangle = \langle \psi\_1, a^2\psi\_1\rangle \le \|a^2\psi\_1\| \le \|a^2\| = \|a\|^2. \tag{A.45}$$

Hence we need equality at the ≤ signs in (A.45), which according to the remark below (A.3) can only be the case if *a*<sup>2</sup>ψ<sub>1</sub> = ‖*a*‖<sup>2</sup>ψ<sub>1</sub>. Define χ<sub>1</sub> = *a*ψ<sub>1</sub> − ‖*a*‖ψ<sub>1</sub>. There are two possibilities: if χ<sub>1</sub> = 0, then *a*ψ<sub>1</sub> = ‖*a*‖ψ<sub>1</sub>, and if χ<sub>1</sub> ≠ 0, then

$$a\chi\_1 = a^2 \psi\_1 - \|a\| a\psi\_1 = \|a\|^2 \psi\_1 - \|a\| a\psi\_1 = -\|a\|\chi\_1.\tag{A.46}$$

Hence either *a*ψ<sub>1</sub> = ‖*a*‖ψ<sub>1</sub> or *a*χ<sub>1</sub> = −‖*a*‖χ<sub>1</sub>, which proves the claim. □

We are now in a position to prove Theorem A.10.

*Proof.* By Lemma A.11, we already found one eigenvector υ<sub>1</sub> of *a*, viz. either υ<sub>1</sub> = ψ<sub>1</sub> or υ<sub>1</sub> = χ<sub>1</sub>. Furthermore, it is easy to show that if a self-adjoint operator *a* leaves a linear subspace *L* ⊂ *H* stable (in that *a*ϕ ∈ *L* whenever ϕ ∈ *L*), then it also leaves *L*<sup>⊥</sup> stable, and remains self-adjoint as an operator *a* : *L*<sup>⊥</sup> → *L*<sup>⊥</sup>. First use this with *L*<sub>1</sub> = C·υ<sub>1</sub>. Lemma A.11, now applied to *a* : *L*<sub>1</sub><sup>⊥</sup> → *L*<sub>1</sub><sup>⊥</sup>, gives a second eigenvector υ<sub>2</sub>. Now take *L*<sub>2</sub> to be the linear span of υ<sub>1</sub> and υ<sub>2</sub>, and restrict *a* to *L*<sub>2</sub><sup>⊥</sup>, etc. Since *H* is finite-dimensional, this procedure ends after dim(*H*) steps.

This leaves us with a basis (υ<sub>*i*</sub>) of *H* that by construction entirely consists of eigenvectors. The mutual orthogonality of these eigenvectors (and hence of the spectral projections *e*<sub>λ</sub>) follows from a simple calculation. □
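As a small worked instance of Theorem A.10, consider the self-adjoint 2×2 matrix with rows (2, *i*) and (−*i*, 2), whose eigenvalues are 1 and 3. The sketch below takes the (hand-computed) unit eigenvectors as given and verifies (A.37) - (A.38) together with the orthogonality of the spectral projections; the matrix and all names are illustrative assumptions.

```python
import math

def ketbra(psi, phi):
    """Matrix of |psi><phi|, with entries psi_i * conj(phi_j)."""
    return [[p * q.conjugate() for q in phi] for p in psi]

def scale(c, a):
    """Scalar multiple of a matrix."""
    return [[c * x for x in row] for row in a]

def madd(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to rounding errors."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# Self-adjoint example operator on C^2.
a = [[2 + 0j, 1j], [-1j, 2 + 0j]]

# Eigenvalue -> unit eigenvector, computed by hand for this particular matrix.
s = 1 / math.sqrt(2)
eigenvectors = {3: [s * 1j, s + 0j], 1: [-s * 1j, s + 0j]}

# Spectral projections e_lambda = |v><v| (both eigenvalues are non-degenerate).
e = {lam: ketbra(v, v) for lam, v in eigenvectors.items()}

resolution = madd(scale(3, e[3]), scale(1, e[1]))   # right-hand side of (A.37)
unit = madd(e[3], e[1])                             # right-hand side of (A.38)
```

Since both eigenvalues are simple here, (A.37) and (A.39) coincide; for a degenerate eigenvalue one would sum the rank-one projections as in (A.41).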

Corollary A.12. *The norm of a self-adjoint operator a is given by*

$$\|a\| = \sup\{ |\lambda|, \lambda \in \sigma(a) \}. \tag{A.47}$$

*Proof.* This rapidly follows from Theorem A.10 by expanding ψ in (A.18) with respect to the basis of *H* given in (A.39) - (A.40). □

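Corollary A.13. *A self-adjoint operator a is a projection iff:*

	- σ(*a*) = {0}*, in which case a* = 0*; or*
	- σ(*a*) = {1}*, in which case a* = 1<sub>*H*</sub>*; or*
	- σ(*a*) = {0,1}*, in which case a is a proper projection (i.e., a* ≠ 0 *and a* ≠ 1<sub>*H*</sub>*).*

*In particular, if e is a nonzero projection, then*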

$$\|e\| = 1.\tag{A.48}$$

*Proof.* Only the third case is nontrivial. If *a* = *e* is a proper projection, then by Corollary A.9 its eigenvectors can only lie in *L* = *eH* (with eigenvalue λ = 1) or in *L*<sup>⊥</sup> = (1−*e*)*H* (with eigenvalue λ = 0). The converse implication follows from Theorem A.10, notably from (A.37). Eq. (A.48) then follows from (A.47). □

A less elementary but more powerful approach to the spectral theorem is as follows. For the notion of a C\*-algebra see Definition C.1 in Appendix C.

Definition A.14. *Let a* ∈ *B*(*H*)*. Then C*∗(*a*) *is the C\*-algebra generated by a and* 1<sub>*H*</sub> *(i.e., the algebra of all polynomials in a).*

Theorem A.15. *If a is self-adjoint, then C*∗(*a*) *is commutative, and:*

*1. There is an isomorphism of (commutative) C\*-algebras*

$$C(\sigma(a)) \cong C^\*(a),\tag{A.49}$$

*written f* ↦ *f*(*a*)*, which is unique if it is subject to the following conditions:*


$$C^\*(a) = C^\*(e\_\lambda, \lambda \in \sigma(a)) = \text{span}(e\_\lambda, \lambda \in \sigma(a)),\tag{A.50}$$

*where the middle term is the C\*-algebra generated by the projections e*<sub>λ</sub>*.*

*3. Under the isomorphism* (A.49)*,*

$$e\_{\lambda} = \delta\_{\lambda}(a),\tag{A.51}$$

*where the delta-function* δ<sub>λ</sub> *on* σ(*a*) *is defined by* δ<sub>λ</sub> : λ′ ↦ δ<sub>λλ′</sub>*.*

*Proof.* For any complex (finite) polynomial *p*(*x*) = ∑<sub>*n*</sub> *c*<sub>*n*</sub>*x*<sup>*n*</sup> on R, define an operator

$$p(a) = \sum\_{n} c\_{n} a^{n}.\tag{A.52}$$

Simple computations then show that, for arbitrary polynomials *p*, *q*, and *t* ∈ C,

$$(tp+q)(a) = tp(a) + q(a);\tag{A.53}$$

$$(pq)(a) = p(a)q(a);\tag{A.54}$$

$$p(a)^{\*} = \overline{p}(a). \tag{A.55}$$

Hence the space *P*∗(*a*) of all such polynomials in *a* forms a ∗-subalgebra of *B*(*H*). As a linear subspace of the finite-dimensional vector space *B*(*H*), *P*∗(*a*) must itself be finite-dimensional, hence it is a C\*-algebra. Moreover, *P*∗(*a*) clearly contains 1<sub>*H*</sub> (take *p*(*x*) = 1) as well as *a* (take *p*(*x*) = *x*), and since *P*∗(*a*) ⊆ *C*∗(*a*) by definition of the latter, we must have *P*∗(*a*) = *C*∗(*a*). Since *pq* = *qp* and hence *p*(*a*)*q*(*a*) = *q*(*a*)*p*(*a*) by the above computations, it follows that *P*∗(*a*) and hence *C*∗(*a*) is commutative. This proves the first claim.

To establish the isomorphism (A.49), we are going to define a map

$$C(\sigma(a)) \ni f \mapsto f(a) \in C^\*(a). \tag{A.56}$$

We initially do this for polynomials *f* = *p*, so that *f*(*a*) = *p*(*a*) is defined by (A.52). Since *C*∗(*a*) = *P*∗(*a*) consists of polynomials in *a*, the map (A.56) is evidently surjective. It is also injective, for suppose *p*(*a*) = *q*(*a*). Applying this to an eigenvector υ<sub>λ</sub> ∈ *H*<sub>λ</sub> yields *p*(λ) = *q*(λ), for each λ ∈ σ(*a*), and hence *p* = *q* as functions on σ(*a*). Hence *f* ↦ *f*(*a*) is, at least, a bijection of sets. Moreover, the properties (A.53) - (A.55) turn it into an isomorphism of C\*-algebras, evidently with the properties stated after (A.49). Finally, for any given function *f* : σ(*a*) → C there exists some polynomial *p* that coincides with *f* on the finite set σ(*a*) ⊂ R, so that we may define *f*(*a*) in (A.56) by *p*(*a*), as in (A.52); by the above proof of injectivity, the ensuing operator *f*(*a*) is independent of the choice of *p*.

We prove the last two claims, using the orthogonality property *e*<sub>λ</sub>*e*<sub>μ</sub> = δ<sub>λμ</sub>*e*<sub>λ</sub> of *spectral* projections and the defining properties *e*<sub>λ</sub><sup>2</sup> = *e*<sub>λ</sub><sup>∗</sup> = *e*<sub>λ</sub> of *general* projections, see (A.26). From eq. (A.37) in Theorem A.10 we obtain (for polynomials *f*):

$$f(a) = \sum\_{\lambda \in \sigma(a)} f(\lambda) \cdot e\_{\lambda}. \tag{A.57}$$

If we now define *C*∗(*a*)′ as the linear span of the spectral projections *e*<sub>λ</sub> and 1<sub>*H*</sub> (which is a unital commutative C\*-algebra by the properties of the *e*<sub>λ</sub> just mentioned), then (A.57) shows that *C*∗(*a*) ⊆ *C*∗(*a*)′. Conversely, (A.57) gives (A.51), which shows that *C*∗(*a*)′ ⊆ *C*∗(*a*), and hence *C*∗(*a*) = *C*∗(*a*)′. □

A second approach to the final claims of Theorem A.15 is more ambitious, as it includes a derivation of Theorem A.10 (instead of assuming it, as we just did). We now use (A.51) to *define* the spectral projections *e*<sub>λ</sub>; from (A.54) - (A.55) we have
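The functional calculus *f* ↦ *f*(*a*) lends itself to a direct computation. In the sketch below, δ<sub>λ</sub> is realised on the finite set σ(*a*) by the Lagrange polynomial ∏<sub>μ≠λ</sub>(*x*−μ)/(λ−μ), so that *e*<sub>λ</sub> = δ<sub>λ</sub>(*a*) as in (A.51), and *f*(*a*) is then assembled via (A.57). The example matrix, its spectrum {1, 3} (assumed known, with 3 of multiplicity two) and the function names are hypothetical.

```python
import math

def identity(n):
    """The n x n unit matrix."""
    return [[1 + 0j if i == j else 0j for j in range(n)] for i in range(n)]

def scale(c, a):
    """Scalar multiple of a matrix."""
    return [[c * x for x in row] for row in a]

def madd(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices up to rounding errors."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def spectral_projection(a, spectrum, lam):
    """e_lam = delta_lam(a), with delta_lam given by a Lagrange polynomial."""
    n = len(a)
    e = identity(n)
    for mu in spectrum:
        if mu != lam:
            # Factor (a - mu) / (lam - mu): equals 1 on H_lam and 0 on H_mu.
            factor = scale(1 / (lam - mu), madd(a, scale(-mu, identity(n))))
            e = matmul(e, factor)
    return e

def apply_function(f, a, spectrum):
    """f(a) = sum over lam of f(lam) * e_lam, as in (A.57)."""
    n = len(a)
    result = [[0j] * n for _ in range(n)]
    for lam in spectrum:
        result = madd(result, scale(f(lam), spectral_projection(a, spectrum, lam)))
    return result

# Self-adjoint example on C^3 with spectrum {1, 3}.
a = [[2 + 0j, 1j, 0j], [-1j, 2 + 0j, 0j], [0j, 0j, 3 + 0j]]
spectrum = [1.0, 3.0]

square = apply_function(lambda x: x * x, a, spectrum)   # should equal a * a
root = apply_function(math.sqrt, a, spectrum)           # a square root of a
```

Note that the projections are obtained here without ever computing an eigenvector, which mirrors the "second approach" to the spectral theorem: the spectrum alone determines *e*<sub>λ</sub> as a polynomial in *a*.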

$$\begin{aligned} e\_\lambda^2 &= \delta\_\lambda(a)^2 = \delta\_\lambda^2(a) = \delta\_\lambda(a) = e\_\lambda; \\ e\_\lambda^\* &= \delta\_\lambda(a)^\* = \overline{\delta\_\lambda}(a) = \delta\_\lambda(a) = e\_\lambda, \end{aligned}$$

showing that *e*<sub>λ</sub> is indeed a projection. Also note the following identities in *C*(σ(*a*)):

$$\mathrm{id}\_{\sigma(a)} = \sum\_{\lambda \in \sigma(a)} \lambda \cdot \delta\_{\lambda};\tag{A.58}$$

$$1\_{\sigma(a)} = \sum\_{\lambda \in \sigma(a)} \delta\_{\lambda}. \tag{A.59}$$

Transferring these from *C*(σ(*a*)) to *C*∗(*a*) via the isomorphism (A.49) then yields (A.37) - (A.38). To analyse the projections *e*<sub>λ</sub> defined by (A.51), we first compute

$$e\_{\lambda}e\_{\mu} = \delta\_{\lambda}(a)\delta\_{\mu}(a) = (\delta\_{\lambda}\delta\_{\mu})(a) = \delta\_{\lambda\mu}\delta\_{\lambda}(a) = \delta\_{\lambda\mu}e\_{\lambda},\tag{A.60}$$

which shows that the *e*<sub>λ</sub> are mutually orthogonal. Second, we compute

$$a e\_{\lambda} \psi = a \delta\_{\lambda}(a) \psi = \mathrm{id}\_{\sigma(a)}(a) \delta\_{\lambda}(a) \psi = (\mathrm{id}\_{\sigma(a)} \cdot \delta\_{\lambda})(a) \psi = \lambda \cdot \delta\_{\lambda}(a) \psi = \lambda e\_{\lambda} \psi,$$

which shows that *e*<sub>λ</sub>*H* ⊆ *H*<sub>λ</sub>. Third, (A.60) and (A.59) give ⊕<sub>λ∈σ(*a*)</sub> *e*<sub>λ</sub>*H* = *H*, which together with the second step gives *e*<sub>λ</sub>*H* = *H*<sub>λ</sub>. Hence the *e*<sub>λ</sub> are indeed the spectral projections of *a*. Since we have already proved (A.37) - (A.38), we conclude that Theorem A.10 follows from the first part of Theorem A.15. By the argument in the main proof above, this first part then also yields the second part.

The generalization of Theorem A.15 to a family <u>*a*</u> = (*a*<sub>1</sub>,...,*a*<sub>*n*</sub>) of commuting self-adjoint operators is as follows.

Definition A.16. *Let* <u>*a*</u> = (*a*<sub>1</sub>,...,*a*<sub>*n*</sub>) *be commuting self-adjoint operators.*


Clearly, we have

$$
\sigma(\underline{a}) \subseteq \sigma(a\_1) \times \dots \times \sigma(a\_n) \subset \mathbb{R}^n. \tag{A.61}
$$

Furthermore, since dim(*H*) < ∞, once again *C*∗(<u>*a*</u>) is just the algebra of complex polynomials in all operators *a*<sub>*i*</sub>.

Theorem A.17. *Let* <u>*a*</u> = (*a*<sub>1</sub>,...,*a*<sub>*n*</sub>) *be commuting self-adjoint operators on H. Then the C\*-algebra C*∗(<u>*a*</u>) *generated by these operators is commutative, and:*

*1. There is a unique isomorphism of C\*-algebras*

$$C(\sigma(\underline{a})) \cong C^\*(\underline{a}),\tag{A.62}$$

*written f* ↦ *f*(<u>*a*</u>)*, subject to the following conditions:*


$$C^\*(\underline{a}) = C^\*(e^{(a\_i)}\_{\lambda\_i}, i = 1, \dots, n, \lambda\_i \in \sigma(a\_i)). \tag{A.63}$$

*3. If for each* <u>λ</u> ∈ σ(<u>*a*</u>) *we define the operator*

$$e\_{\underline{\lambda}} = e\_{\lambda\_1}^{(a\_1)} \cdots e\_{\lambda\_n}^{(a\_n)},\tag{A.64}$$

*then e*<sub><u>λ</u></sub> *is a projection, in terms of which the joint spectrum may be rewritten as*

$$\sigma(\underline{a}) = \{ \underline{\lambda} \in \sigma(a\_1) \times \dots \times \sigma(a\_n) \mid e\_{\underline{\lambda}} \neq 0 \}. \tag{A.65}$$

*4. Finally, we have*

$$C^\*(\underline{a}) = C^\*(e\_{\underline{\lambda}}, \underline{\lambda} \in \sigma(\underline{a})) = \text{span}(e\_{\underline{\lambda}}, \underline{\lambda} \in \sigma(\underline{a})).\tag{A.66}$$

We will not prove this in any detail, as the reasoning is quite analogous to the proof of Theorem A.15; for example, in (A.56) one just has to replace *a* by <u>*a*</u>. The only nontrivial point is that since all *a*<sub>*i*</sub> commute, so do all their spectral projections *e*<sub>λ<sub>*i*</sub></sub><sup>(*a*<sub>*i*</sub>)</sup>; this follows from (A.51), which makes these operators elements of the *commutative* C\*-algebra *C*∗(<u>*a*</u>) (which by definition contains each *C*∗(*a*<sub>*i*</sub>) and, in fact, is just the smallest C\*-algebra in *B*(*H*) with this property). Using (A.38) for each *a*<sub>*i*</sub> and multiplying the *n* versions of the unit 1<sub>*H*</sub> thus obtained with each other, yields

$$H = \bigoplus\_{\underline{\lambda} \in \sigma(\underline{a})} H\_{\underline{\lambda}}.\tag{A.67}$$

Since *p*(<u>*a*</u>)υ<sub><u>λ</u></sub> = *p*(<u>λ</u>)υ<sub><u>λ</u></sub> for each joint eigenvector υ<sub><u>λ</u></sub> ∈ *H*<sub><u>λ</u></sub>, eq. (A.67) gives injectivity of the map (A.56) (*mutatis mutandis*) by the same argument as for *n* = 1.

This leads to a multi-spectral theorem for the commuting family <u>*a*</u>, which is most conveniently stated in the following form. First, for any polynomial

$$p(x\_1, \dots, x\_n) = \sum\_{k\_1, \dots, k\_n} c\_{k\_1 \dots k\_n} x\_1^{k\_1} \cdots x\_n^{k\_n},\tag{A.68}$$

in *n* real variables, we generalize (A.52) to

$$p(\underline{a}) = \sum\_{k\_1,\ldots,k\_n} c\_{k\_1 \dots k\_n} a\_1^{k\_1} \cdots a\_n^{k\_n}.\tag{A.69}$$

Theorem A.18. *Let* <u>*a*</u> = (*a*<sub>1</sub>,...,*a*<sub>*n*</sub>) *be commuting self-adjoint operators on H. Then for any polynomial p in n real variables, with associated operator* (A.69)*,*

$$p(\underline{a}) = \sum\_{\underline{\lambda} \in \sigma(\underline{a})} p(\underline{\lambda}) \cdot e\_{\underline{\lambda}},\tag{A.70}$$

*where the spectral projections e*<sub><u>λ</u></sub> *are given by* (A.64)*.*

The special case *p*(*x*<sub>1</sub>,...,*x*<sub>*n*</sub>) = 1 then recovers (A.67). As for *n* = 1, eq. (A.70) may be generalized to arbitrary continuous functions *f*(*x*<sub>1</sub>,...,*x*<sub>*n*</sub>), either by replacing *f* by a polynomial that coincides with *f* on the joint spectrum σ(<u>*a*</u>), or by approximating *f* by polynomials on some compact set *K* containing σ(<u>*a*</u>).

Proposition A.19. *Let* <u>*a*</u> = (*a*<sub>1</sub>,...,*a*<sub>*n*</sub>) *be a family of commuting self-adjoint operators on H. Then there is a self-adjoint operator a* ∈ *B*(*H*) *such that C*∗(*a*) = *C*∗(<u>*a*</u>)*.*

*Proof.* Take *a* = ∑<sub><u>λ</u>∈σ(<u>*a*</u>)</sub> *c*<sub><u>λ</u></sub>*e*<sub><u>λ</u></sub>, with all *c*<sub><u>λ</u></sub> ∈ R different from each other. Then

$$C^\*(a) = C^\*(e\_{\underline{\lambda}}, \underline{\lambda} \in \sigma(\underline{a})),\tag{A.71}$$

by (A.50), and hence the claim follows from (A.66). □
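The joint spectrum and the single generator of Proposition A.19 can be made explicit for operators that are already diagonal in a common basis. The sketch below makes exactly that simplifying assumption (each operator is stored as the list of its diagonal entries) and evaluates (A.64) - (A.65); the two sample operators and the choice of coefficients *c*<sub><u>λ</u></sub> are hypothetical.

```python
from itertools import product

def spectral_projection(diagonal, lam):
    """Diagonal of the projection onto the eigenspace of a diagonal operator."""
    return [1 if x == lam else 0 for x in diagonal]

def joint_spectrum(diagonals):
    """All tuples (lam_1, ..., lam_n) for which the product (A.64) is nonzero."""
    result = set()
    for lams in product(*(sorted(set(d)) for d in diagonals)):
        e = [1] * len(diagonals[0])
        for d, lam in zip(diagonals, lams):
            # Diagonal projections multiply entrywise.
            e = [x * y for x, y in zip(e, spectral_projection(d, lam))]
        if any(e):
            result.add(lams)
    return result

# Two commuting self-adjoint operators on C^3, given by their diagonals.
a1 = [1, 1, 2]
a2 = [0, 5, 5]
sigma = joint_spectrum([a1, a2])

# Single generator as in the proof of Proposition A.19:
# a = sum of c_lam * e_lam with pairwise different real coefficients c_lam.
coefficients = {lam: k + 1 for k, lam in enumerate(sorted(sigma))}
a = [coefficients[(x, y)] for x, y in zip(a1, a2)]
```

In this example σ(<u>*a*</u>) has three elements, whereas σ(*a*<sub>1</sub>) × σ(*a*<sub>2</sub>) has four: the pair (2, 0) is missing, so the inclusion (A.61) is strict. The resulting generator has three different eigenvalues, so polynomials in it yield every diagonal matrix, including `a1` and `a2`.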

Corollary A.20. *Every (unital) commutative C\*-algebra C in B*(*H*) *is generated by a single self-adjoint operator a (and the unit* 1<sub>*H*</sub>*), i.e., C* = *C*∗(*a*)*.*

*Proof.* Just take a basis (*c*<sub>*k*</sub>) of *C* as a vector space and decompose *c*<sub>*k*</sub> = *a*<sub>*k*</sub> + *ia*′<sub>*k*</sub> with *a*<sub>*k*</sub> and *a*′<sub>*k*</sub> self-adjoint (namely, *a*<sub>*k*</sub> = ½(*c*<sub>*k*</sub> + *c*<sub>*k*</sub><sup>∗</sup>) and *a*′<sub>*k*</sub> = −½*i*(*c*<sub>*k*</sub> − *c*<sub>*k*</sub><sup>∗</sup>)). If *C* is to be commutative, each *c*<sub>*k*</sub> must be normal, i.e., *c*<sub>*k*</sub><sup>∗</sup>*c*<sub>*k*</sub> = *c*<sub>*k*</sub>*c*<sub>*k*</sub><sup>∗</sup>, which is equivalent to commutativity of *a*<sub>*k*</sub> and *a*′<sub>*k*</sub>, and all *c*<sub>*k*</sub> must commute, i.e., all *a*<sub>*k*</sub> and *a*′<sub>*k*</sub> must commute for different *k*. Hence *C* = *C*∗(*a*<sub>*k*</sub>,*a*′<sub>*k*</sub>), which is of the form *C*∗(<u>*a*</u>) for an appropriate family <u>*a*</u>, and so by Proposition A.19 it takes the form *C*∗(*a*). □

We say that a unital commutative C\*-algebra *C* ⊂ *B*(*H*) is *maximal* if it is not contained in some bigger unital commutative C\*-algebra in *B*(*H*). Also, we call a self-adjoint operator *a maximal* iff σ(*a*) has cardinality dim(*H*), or, in other words, if each eigenvalue of *a* is nondegenerate. In finite dimension it is easy to classify maximal unital commutative C\*-algebras in *B*(*H*) *up to unitary equivalence*.

Here we say (as usual) that a linear map *u* : *H* → *H*′ is *unitary* when it is invertible and satisfies ⟨*u*ϕ,*u*ψ⟩ = ⟨ϕ,ψ⟩ for each ϕ,ψ ∈ *H* (note that the inverse *u*<sup>−1</sup> is automatically linear). Two ∗-algebras *C* ⊂ *B*(*H*) and *C*′ ⊂ *B*(*H*′) are called *unitarily equivalent*, then, if there is a unitary map *u* : *H* → *H*′ such that *C*′ = *uCu*<sup>−1</sup>.

Theorem A.21. *A unital commutative C\*-algebra C* ⊂ *B*(*H*) *is maximal iff it is unitarily equivalent to the algebra D*<sub>*n*</sub>(C) *of all diagonal matrices on H* = C<sup>*n*</sup>*.*

*Proof.* First, *D*<sub>*n*</sub>(C) is indeed maximal abelian in *M*<sub>*n*</sub>(C); any extension of *D*<sub>*n*</sub>(C) would have to contain some additional matrix *b* ∈ *M*<sub>*n*</sub>(C) that commutes with all *a* ∈ *D*<sub>*n*</sub>(C), but by elementary linear algebra this very property implies *b* ∈ *D*<sub>*n*</sub>(C).

By Corollary A.20, we have *C* = *C*∗(*a*), where *a*<sup>∗</sup> = *a*. Then *C* is maximal iff *a* is maximal. For if not, some eigenvalue λ ∈ σ(*a*) would have multiplicity *m*<sub>λ</sub> > 1, and hence the corresponding spectral projection *e*<sub>λ</sub> could be decomposed as *e*<sub>λ</sub> = *e*<sub>λ</sub><sup>(1)</sup> + *e*<sub>λ</sub><sup>(2)</sup>, where both terms are orthogonal and hence commute. We could then extend *C*∗(*a*), as in (A.50), to *C*∗(*e*<sub>λ′</sub>, *e*<sub>λ</sub><sup>(1)</sup>, *e*<sub>λ</sub><sup>(2)</sup>, λ′ ∈ σ(*a*), λ′ ≠ λ), which remains commutative, and we have a contradiction with the alleged maximality of *C*∗(*a*).

Thus *a* is maximal, in which case we list the spectrum as σ(*a*) = {λ<sub>1</sub>,...,λ<sub>*n*</sub>}, with corresponding eigenvectors {υ<sub>λ<sub>1</sub></sub>,...,υ<sub>λ<sub>*n*</sub></sub>}. This gives rise to a unitary map *u* : *H* → C<sup>*n*</sup> defined by *u*υ<sub>λ<sub>*i*</sub></sub> = υ<sub>*i*</sub>, where (υ<sub>1</sub>,...,υ<sub>*n*</sub>) is the standard basis of C<sup>*n*</sup>, and clearly *uau*<sup>−1</sup> = diag(λ<sub>1</sub>,...,λ<sub>*n*</sub>). If (as is the case) all entries λ<sub>*i*</sub> ∈ R are different, any (*z*<sub>1</sub>,...,*z*<sub>*n*</sub>) ∈ C<sup>*n*</sup> may be written as *z*<sub>*i*</sub> = *p*(λ<sub>*i*</sub>), *i* = 1,...,*n*, where *p* is some complex polynomial *p*(*x*) = ∑<sub>*k*</sub> *c*<sub>*k*</sub>*x*<sup>*k*</sup>, *x* ∈ R, *c*<sub>*k*</sub> ∈ C. Hence *uC*∗(*a*)*u*<sup>−1</sup> = *D*<sub>*n*</sub>(C). □

#### A.5 Positive operators and the trace

Operators *a* : *H* → *H* satisfying one (and hence all) of the conditions in the next proposition are called *positive*, written *a* ≥ 0 or 0 ≤ *a*. More generally, we write *a* ≤ *b* iff *b*−*a* ≥ 0. Positive operators play a very important role in quantum mechanics.

Proposition A.22. *The following conditions on an operator a are equivalent:*

1. ⟨ψ,*a*ψ⟩ ≥ 0 *for arbitrary* ψ ∈ *H.*
2. *a*∗ = *a and* σ(*a*) ⊂ R<sup>+</sup>*.*
3. *a* = *c*<sup>2</sup> *for some self-adjoint operator c.*
4. *a* = *b*∗*b for some operator b.*

*Proof.* 1 → 2: Putting ⟨ψ,*a*ψ⟩ ≥ 0 in (A.44) gives ⟨ψ,*a*ψ⟩ = ⟨ψ,*a*∗ψ⟩ for all ψ. But for any operator *b* and vectors χ,ϕ ∈ *H*, as in (A.10) we have the identity

$$\begin{split} 4\langle \chi, b\phi \rangle &= \langle \chi + \phi, b(\chi + \phi) \rangle - \langle \chi - \phi, b(\chi - \phi) \rangle \\ &+ i\langle \chi - i\phi, b(\chi - i\phi) \rangle - i\langle \chi + i\phi, b(\chi + i\phi) \rangle. \end{split} \tag{A.72}$$

So *b* = 0 iff ⟨ψ,*b*ψ⟩ = 0 for all ψ ∈ *H*, and hence (taking *b* = *a* − *a*∗) condition 1 implies *a*∗ = *a*. We therefore know that σ(*a*) ⊂ R, and since an eigenvalue λ < 0 would contradict condition 1, the second condition follows.

2 → 3: define *c* = √*a*, where (since λ<sub>*i*</sub> ≥ 0) the square root is well defined by

$$
\sqrt{a} = \sum\_{i=1}^{\dim(H)} \sqrt{\lambda\_i} |\upsilon\_i\rangle\langle\upsilon\_i|. \tag{A.73}
$$

3 → 4 is trivial (take *b* = *c*), as is 4 → 1, since ⟨ψ,*a*ψ⟩ = ‖*b*ψ‖<sup>2</sup>. □
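The implication 4 → 1 can be checked numerically on a small example. The sketch below uses plain Python lists as matrices; the helper names (`adjoint`, `matmul`, `apply`, `inner`) and the sample matrix *b* are illustrative assumptions, and the inner product is taken antilinear in its first argument.

```python
def adjoint(m):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply(m, psi):
    """Apply the matrix m to the vector psi."""
    return [sum(m_ij * psi_j for m_ij, psi_j in zip(row, psi)) for row in m]


def inner(phi, psi):
    """Inner product <phi, psi>, antilinear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(phi, psi))


b = [[1, 2j], [0, 1 - 1j]]          # arbitrary operator, not self-adjoint
a = matmul(adjoint(b), b)           # a = b* b
psi = [1 + 1j, 2 - 1j]              # arbitrary test vector

expectation = inner(psi, apply(a, psi))                   # <psi, a psi>
norm_squared = inner(apply(b, psi), apply(b, psi)).real   # ||b psi||^2
```

Both numbers agree and are nonnegative, as the last step of the proof asserts; the matrix `a` also comes out self-adjoint, in line with condition 2.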

Combining this with Proposition A.5, we obtain the following result.

Proposition A.23. *The relationship* ⟨ϕ,ψ⟩′ = ⟨ϕ,*a*ψ⟩ *gives a bijective correspondence between (hermitian/positive) sesquilinear forms* ⟨·,·⟩′ *on H and (hermitian/positive) operators a on H.*

*Proof.* One direction is trivial. For the other, fix χ ∈ *H* and define a functional *f*(ψ) = ⟨χ,ψ⟩′. By Proposition A.5, *f* = *f*<sub>ϕ</sub> for some unique ϕ ∈ *H*. Define an operator *b* : *H* → *H* by *b*χ = ϕ and put *a* = *b*∗, so that ⟨χ,ψ⟩′ = ⟨*b*χ,ψ⟩ = ⟨χ,*a*ψ⟩. □

Proposition A.24. *Any self-adjoint operator a* ∈ *B*(*H*) *has a decomposition*

$$a = a\_+ - a\_-,\tag{A.74}$$

*where a*<sup>±</sup> ≥ 0*. These are unique if they also satisfy a*+*a*<sup>−</sup> = *a*−*a*<sup>+</sup> = 0*.*

*Proof.* Using Theorem A.10, we may take

$$a\_{\pm} = \pm \sum\_{\lambda \in \sigma(a) \cap \mathbb{R}^{\pm}} \lambda \cdot e\_{\lambda} \,. \tag{A.75}$$

Equivalently, we may use Theorem A.15 to rewrite (A.75) as

$$a\_{\pm} = (|\mathrm{id}\_{\sigma(a)}| \cdot 1\_{\mathbb{R}^{\pm}})(a) \equiv f\_{\pm}(a),\tag{A.76}$$

where |id<sub>σ(*a*)</sub>| is the function λ ↦ |λ|, R<sup>+</sup> = [0,∞) and R<sup>−</sup> = (−∞,0). To prove uniqueness, we note that since σ(*a*) ⊂ R is finite, there is a polynomial *p* with *p*(0) = 0 such that *f*<sub>+</sub> = *p* on σ(*a*), and hence *a*<sub>+</sub> = *p*(*a*). If *a* = *a*′<sub>+</sub> − *a*′<sub>−</sub> with *a*′<sub>±</sub> ≥ 0 and *a*′<sub>+</sub>*a*′<sub>−</sub> = *a*′<sub>−</sub>*a*′<sub>+</sub> = 0, then for any polynomial *p* with *p*(0) = 0 we have *p*(*a*) = *p*(*a*′<sub>+</sub>) + *p*(−*a*′<sub>−</sub>). For the one just taken, this gives *p*(*a*) = *a*′<sub>+</sub> by positivity of the *a*′<sub>±</sub>, and hence *a*′<sub>+</sub> = *a*<sub>+</sub>, etc. □
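As a small worked illustration (our own example, not taken from the text), take *H* = C<sup>2</sup> and the self-adjoint matrix *a* below, whose spectrum is {1, −1}. Its spectral projections and the decomposition (A.74) - (A.75) read

$$a = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad e\_{\pm 1} = \frac{1}{2}\begin{pmatrix} 1 & \pm 1 \\ \pm 1 & 1 \end{pmatrix}, \qquad a\_+ = e\_{1}, \qquad a\_- = e\_{-1},$$

so that *a*<sub>+</sub> − *a*<sub>−</sub> = *a* and *a*<sub>+</sub>*a*<sub>−</sub> = 0. The polynomial *p*(*x*) = *x*(1 + *x*)/2 vanishes at 0 and at −1 and equals 1 at 1, so it agrees with *f*<sub>+</sub> on σ(*a*); since *a*<sup>2</sup> = 1<sub>*H*</sub>, one indeed finds

$$p(a) = \tfrac{1}{2}(a^2 + a) = \tfrac{1}{2}(1\_H + a) = a\_+.$$

Without the condition *a*<sub>+</sub>*a*<sub>−</sub> = 0 uniqueness fails: *a* = (*a*<sub>+</sub> + 1<sub>*H*</sub>) − (*a*<sub>−</sub> + 1<sub>*H*</sub>) is another decomposition of *a* into a difference of positive operators.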

We now introduce a construction of great significance to quantum mechanics.

Lemma A.25. *If* (υ<sub>*i*</sub>) *and* (υ′<sub>*i*</sub>) *are bases of H, then for any operator a* : *H* → *H,*

$$\sum\_{i} \langle \upsilon\_{i}, a\upsilon\_{i} \rangle = \sum\_{i} \langle \upsilon\_{i}^{\prime}, a\upsilon\_{i}^{\prime} \rangle .$$

*Proof.* A simple computational proof uses the identity (A.40) for any basis (υ<sub>*i*</sub>) (i.e., the υ<sub>*i*</sub> need not be eigenvectors of *a*, as in (A.39)). Then, as in physics books,

$$\sum\_{i} \langle \upsilon'\_{i}, a\upsilon'\_{i} \rangle = \sum\_{i,j,k} \langle \upsilon\_{k}, \upsilon'\_{i} \rangle \langle \upsilon'\_{i}, \upsilon\_{j} \rangle \langle \upsilon\_{j}, a\upsilon\_{k} \rangle = \sum\_{j,k} \langle \upsilon\_{k}, \upsilon\_{j} \rangle \langle \upsilon\_{j}, a\upsilon\_{k} \rangle = \sum\_{i} \langle \upsilon\_{i}, a\upsilon\_{i} \rangle.$$

This lemma allows us to define the *trace* of *a* by

$$\operatorname{Tr}(a) = \sum\_{i} \langle \mathfrak{v}\_{i}, a\mathfrak{v}\_{i} \rangle,\tag{A.77}$$

where (υ*i*) is any basis of *H*. By almost the same proof as Lemma A.25 we obtain

$$\operatorname{Tr}(ab) = \sum\_{i,j} \langle \mathfrak{v}\_i, a\mathfrak{v}\_j \rangle \langle \mathfrak{v}\_j, b\mathfrak{v}\_i \rangle = \sum\_{i,j} \langle \mathfrak{v}\_i, b\mathfrak{v}\_j \rangle \langle \mathfrak{v}\_j, a\mathfrak{v}\_i \rangle = \operatorname{Tr}(ba). \tag{A.78}$$

If *u* is *unitary* (in that *uu*∗ = *u*∗*u* = 1), then from either Lemma A.25 or eq. (A.78),

$$\operatorname{Tr}\left(u a u^\*\right) = \operatorname{Tr}\left(a\right). \tag{A.79}$$

Finally, if *a*∗ = *a*, then (A.37) and taking the trace over the basis in (A.39) yields

$$\operatorname{Tr}\left(a\right) = \sum\_{\lambda \in \sigma(a)} m\_{\lambda} \cdot \lambda \,. \tag{A.80}$$
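The basis independence of (A.77) and the cyclicity (A.78) are easy to test on 2 × 2 matrices. The following sketch uses only the Python standard library; the helper names and the sample matrices are illustrative assumptions, and "basis" means orthonormal basis.

```python
import math


def inner(phi, psi):
    """Inner product <phi, psi>, antilinear in the first argument."""
    return sum(complex(x).conjugate() * y for x, y in zip(phi, psi))


def apply(m, psi):
    """Apply the matrix m (list of rows) to the vector psi."""
    return [sum(m_ij * psi_j for m_ij, psi_j in zip(row, psi)) for row in m]


def matmul(a, b):
    """Product of two square matrices of the same size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace_in_basis(m, basis):
    """Sum of <v, m v> over the vectors v of an orthonormal basis, as in (A.77)."""
    return sum(inner(v, apply(m, v)) for v in basis)


r = 1 / math.sqrt(2)
standard = [[1, 0], [0, 1]]
rotated = [[r, r * 1j], [r, -r * 1j]]   # another orthonormal basis of C^2

a = [[1, 2 + 1j], [3j, -4]]
b = [[0, 1], [1j, 2]]

trace_standard = trace_in_basis(a, standard)
trace_rotated = trace_in_basis(a, rotated)
trace_ab = trace_in_basis(matmul(a, b), standard)
trace_ba = trace_in_basis(matmul(b, a), standard)
```

Up to rounding, `trace_standard` and `trace_rotated` coincide, and so do `trace_ab` and `trace_ba`, although `matmul(a, b)` and `matmul(b, a)` are different matrices.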

Definition A.26. *A* density operator *is a positive operator* ρ *on H such that*

$$\text{Tr}(\rho) = 1.\tag{A.81}$$

The analysis of density operators hinges on the introduction of a second operator norm, beside the canonical one (A.18). In finite dimension these norms are equivalent, but in general they are not, and it makes sense to introduce both already here.

For any *a* ∈ *B*(*H*), the operator *a*∗*a* is positive and hence self-adjoint, so that

$$a^\*a = \sum\_{\mu \in \sigma(a^\*a)} \mu e\_\mu = \sum\_{i=1}^n \mu\_i |\mathfrak{v}\_i\rangle\langle\mathfrak{v}\_i| \tag{A.82}$$

for certain eigenvalues μ*<sup>i</sup>* ≥ 0 (including possible multiplicities) or μ ∈ σ(*a*∗*a*) (excluding multiplicities), all necessarily non-negative by positivity of *a*∗*a*, and some normalized eigenvectors υ*<sup>i</sup>* or spectral projections *e*<sup>μ</sup> ; cf. (A.37) - (A.39). Then put

$$\|a\|\_1 = \sum\_{\mu \in \sigma(a^\*a)} \sqrt{\mu} m\_\mu = \sum\_{i=1}^n \sqrt{\mu\_i}.\tag{A.83}$$

It is not immediately clear that ‖·‖<sub>1</sub> *is* a norm on *B*(*H*), but we will shortly prove that it is; we provisionally refer to *B*(*H*), equipped with the norm (A.83), as *B*<sub>1</sub>(*H*).

Another way to define this *trace-norm* is to first introduce the *absolute value*

$$|a| = \sqrt{a^\*a} \tag{A.84}$$

of any operator *a* ∈ *B*(*H*), where the square root is simply defined as

$$\sqrt{a^\*a} = \sum\_{\mu \in \sigma(a^\*a)} \sqrt{\mu}e\_{\mu} = \sum\_{i=1}^n \sqrt{\mu\_i} |\mathfrak{v}\_i\rangle\langle\mathfrak{v}\_i|,\tag{A.85}$$

which coincides with *f*(*a*∗*a*) for *f*(*x*) = √*x* as defined in Theorem A.15, see (A.57). If *a* is positive, then |*a*| = *a*. Some other useful properties of the absolute value are

$$\ker|a| = \ker a = (\text{ran}\,|a|)^\perp;\tag{A.86}$$

$$\||a|\psi\| = \|a\psi\|,\ \psi \in H. \tag{A.87}$$

For the first equality in (A.86),

$$a\psi = 0 \Rightarrow a^\*a\psi = 0 \Leftrightarrow \sqrt{a^\*a}\psi = 0 \Leftrightarrow |a|\psi = 0,$$

but also *a*∗*a*ψ = 0 ⇒ ⟨ψ,*a*∗*a*ψ⟩ = 0 ⇔ ‖*a*ψ‖<sup>2</sup> = 0 ⇔ *a*ψ = 0. For the second, use

$$\ker a = (\text{ran}\,a^\*)^\perp,\tag{A.88}$$

which in turn is immediate from the definition of the adjoint. Eq. (A.87) is similar.

Though once again lacking transparency as a norm, by construction we now have

$$\|a\|\_{1} = \text{Tr}\left(|a|\right),\tag{A.89}$$

so if (λ<sub>*i*</sub>) are the (positive) eigenvalues of |*a*|, including multiplicities, then

$$\|a\|\_1 = \sum\_{i=1}^{n} \lambda\_i.\tag{A.90}$$
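For 2 × 2 matrices the definition (A.83) can be evaluated by hand, because the two eigenvalues of *a*∗*a* follow from its trace and determinant via the quadratic formula. The sketch below does exactly that; the function name `trace_norm_2x2` is an illustrative assumption.

```python
import math


def trace_norm_2x2(a):
    """Trace norm of a 2x2 complex matrix via the eigenvalues of a* a, cf. (A.83)."""
    (p, q), (r, s) = a
    # Tr(a* a) is the sum of the squared moduli of all entries.
    t = abs(p) ** 2 + abs(q) ** 2 + abs(r) ** 2 + abs(s) ** 2
    # det(a* a) = |det a|^2.
    d = abs(p * s - q * r) ** 2
    # Eigenvalues of a* a solve mu^2 - t mu + d = 0; clamp against rounding.
    disc = math.sqrt(max(t * t - 4 * d, 0.0))
    mu_plus = (t + disc) / 2
    mu_minus = max((t - disc) / 2, 0.0)
    return math.sqrt(mu_plus) + math.sqrt(mu_minus)


nilpotent = [[0, 1], [0, 0]]      # trace zero, yet nonzero trace norm
indefinite = [[3, 0], [0, -4]]    # self-adjoint with eigenvalues 3 and -4

norm_nilpotent = trace_norm_2x2(nilpotent)
norm_indefinite = trace_norm_2x2(indefinite)
```

The second example shows that the trace norm of a self-adjoint operator is the sum of the absolute values of its eigenvalues, which in general differs from the absolute value of its trace.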

To obtain suitable estimates for the trace norm we need some further techniques.

Definition A.27. *Let H be a finite-dimensional Hilbert space.*


For immediate and later reference, we collect some properties of such operators.

Lemma A.28. *Let H be a Hilbert space with a partial isometry u* ∈ *B*(*H*)*.*


The proof is an easy verification. In the infinite-dimensional case, a distinction arises between isometries (i.e., *injective* partial isometries, so that *u*∗*u* = 1<sub>*H*</sub>) and unitaries, but if dim(*H*) < ∞, injectivity implies surjectivity and hence bijectivity.

We now come to von Neumann's highly convenient *polar decomposition* of an operator, which mimics the polar decomposition *z* = *r* exp(*i*ϕ) of *z* ∈ C.

Proposition A.29. *For a* ∈ *B*(*H*)*, assumed nonzero, the operator u given by*

$$u|a|\psi = a\psi,\ \ (|a|\psi \in \text{ran}\,|a|);\tag{A.91}$$

$$
u\psi = 0,\ (\psi \in (\text{ran}\,|a|)^\perp = \text{ker}\,|a|) : \tag{A.92}
$$


$$\||a|\psi\| = \|a\psi\|;\tag{A.93}$$

$$u^\*u|a| = |a| = |a|u^\*u. \tag{A.94}$$

*Given that u is a partial isometry, it is characterized by the two properties:*

$$\ker u = \ker a;\tag{A.95}$$

$$a = u |a|. \tag{A.96}$$

*Furthermore, if a* ≠ 0*, then a is invertible iff u is unitary.*

*Proof.* This follows from (A.86) - (A.87), except the claim that (A.95) - (A.96) uniquely define *u*, which we will not use and whose proof we therefore omit. □
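Two small examples on *H* = C<sup>2</sup> (our own, not taken from the text) may help to see how (A.95) - (A.96) work. For a diagonal invertible matrix the polar decomposition is taken entry by entry, exactly as for complex numbers, and *u* is unitary:

$$a = \begin{pmatrix} 2i & 0 \\ 0 & -3 \end{pmatrix}, \qquad |a| = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}, \qquad u = \begin{pmatrix} i & 0 \\ 0 & -1 \end{pmatrix}, \qquad a = u|a|.$$

For a matrix that is not invertible, *u* is merely a partial isometry:

$$a = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad a^\*a = |a| = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad u = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad u^\*u = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$

Here ker *u* = ker *a* is spanned by the first standard basis vector, *u*∗*u* is the projection onto ran |*a*|, and *a* = *u*|*a*| holds although *u* is not unitary.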

Recall from the easy Theorem 2.7 that there is a bijective correspondence between linear maps ω : *B*(*H*) → C and operators ρ ∈ *B*1(*H*), given by (2.33), i.e.,

$$
\omega(a) = \text{Tr}\,(\rho a). \tag{A.97}
$$

Proposition A.30. *If H is finite-dimensional, the map* ω ↦ ρ *from B*(*H*)<sup>∗</sup> *to B*<sub>1</sub>(*H*)*, defined by* (A.97)*, gives an isometric isomorphism of Banach spaces*

$$B(H)^\* \cong B\_1(H);\tag{A.98}$$

*in particular, one has*

$$\|\omega\| = \|\rho\|\_1. \tag{A.99}$$

*Proof.* Bijectivity being known already, the basic estimate towards (A.99) is

$$|\text{Tr}\,(\rho a)| \le ||\rho||\_1 ||a||. \tag{A.100}$$

This follows from the polar decomposition ρ = *u*|ρ| and the spectral decomposition

$$|\rho| = \sum\_{i=1}^{m \le n} p\_i |\mathfrak{v}\_i\rangle\langle\mathfrak{v}\_i|,\tag{A.101}$$

where *p*<sub>*i*</sub> > 0 (but not necessarily ∑<sub>*i*</sub> *p*<sub>*i*</sub> = 1). Assuming ρ ≠ 0, using (A.101), (A.78), Cauchy–Schwarz, (A.20), (A.21), ‖*u*‖ = ‖υ<sub>*i*</sub>‖ = 1, and (A.90), we indeed have

$$|\text{Tr}\,(\rho a)| = |\text{Tr}\,(u|\rho|a)| = |\text{Tr}\,(|\rho|au)| = |\sum\_{i} p\_i \langle \upsilon\_i, au \upsilon\_i \rangle|\tag{A.102}$$

$$\le \sum\_{i} p\_i |\langle \upsilon\_i, au \upsilon\_i \rangle| \le \sum\_{i} p\_i \|a\| \|u\| \|\upsilon\_i\| = \|\rho\|\_1 \|a\|. \tag{A.103}$$

To prove saturation of this bound, take *a* = *u*∗, where *u* is isometric on the space ran|ρ| = span(υ<sub>1</sub>,...,υ<sub>*m*</sub>); hence ‖*a*‖ = 1 as well as ⟨υ<sub>*i*</sub>,*au*υ<sub>*i*</sub>⟩ = ‖*u*υ<sub>*i*</sub>‖<sup>2</sup> = 1. Consequently, from (A.102) we find |Tr(ρ*a*)| = ∑<sub>*i*</sub> *p*<sub>*i*</sub>. By (A.90) for ρ instead of *a*, i.e., ‖ρ‖<sub>1</sub> = Tr(|ρ|) = ∑<sub>*i*</sub> *p*<sub>*i*</sub>, we obtain |Tr(ρ*a*)| = ‖ρ‖<sub>1</sub>, which yields (A.99). □

Corollary A.31. *The trace-norm* ‖·‖<sub>1</sub> *is (indeed) a norm on B*<sub>1</sub>(*H*)*.*

As explained in more detail in §B.9, for any vector space *V* with norm, with double dual *V*∗∗, we have a canonical map *V* → *V*∗∗ given by *v* ↦ *v̂*, where

$$
\hat{v}(\theta) = \theta(v),
\tag{A.104}
$$

with *v* ∈ *V*, *v̂* ∈ *V*∗∗, and θ ∈ *V*∗. By the general theory, this map is always isometric (and hence injective), and if *V* is finite-dimensional, it is also surjective and hence an isomorphism. Therefore, taking *V* = *B*(*H*), we infer from (A.98) that

$$B\_1(H)^\* \cong B(H),\tag{A.105}$$

where *a* ∈ *B*(*H*) corresponds to *â* ∈ *B*<sub>1</sub>(*H*)<sup>∗</sup> by means of

$$
\hat{a}(\rho) = \text{Tr}\,(\rho a). \tag{A.106}
$$

This new role of *B*(*H*) as the dual of *B*1(*H*) also equips it with a new topology (besides the norm topology it already has), viz. the accompanying *w*∗-topology.

This topology is defined by saying that *a*<sub>*n*</sub> → *a* iff *â*<sub>*n*</sub>(ρ) → *â*(ρ) for each ρ ∈ *B*<sub>1</sub>(*H*). For historical reasons this is called the σ*-weak* topology on *B*(*H*), so we say that *a*<sub>*n*</sub> → *a* σ*-weakly* in *B*(*H*) iff Tr(ρ*a*<sub>*n*</sub>) → Tr(ρ*a*) for each ρ ∈ *B*<sub>1</sub>(*H*).

To close, it is interesting to put the trace-norm into a classical perspective. As explained in Chapter 1, at least on finite-dimensional Hilbert spaces, density operators are the quantum counterparts of probability measures (or distributions). If *X* is a *finite set*, the associated function space *C*(*X*) carries the *supremum-norm*

$$\|f\|\_{\infty} = \sup\{|f(\mathbf{x})|, \mathbf{x} \in X\},\tag{A.107}$$

cf. (1.24). We equip the space *C*(*X*)<sup>∗</sup> of all linear maps ω :*C*(*X*) → C with the norm

$$\|\omega\| = \sup\{ |\omega(f)|, f \in C(X), \|f\|\_{\infty} = 1 \}.\tag{A.108}$$

Let *L*<sup>1</sup>(*X*) be the vector space of all functions ρ : *X* → C, equipped with the norm

$$\|\rho\|\_{1} = \sum\_{x\in X} |\rho(x)|.\tag{A.109}$$

As in the quantum case just discussed, even for finite *X* it is not immediate that this expression indeed defines a norm; this follows from the next proposition.

Each ρ ∈ *L*<sup>1</sup>(*X*) defines a linear map ω : *C*(*X*) → C by

$$\omega(f) = \sum\_{x \in X} \rho(x) f(x). \tag{A.110}$$

Conversely, each ω ∈ *C*(*X*)<sup>∗</sup> defines an element ρ ∈ *L*<sup>1</sup>(*X*) by

$$
\rho(x) = \omega(\delta\_{x}),
\tag{A.111}
$$

with δ*<sup>x</sup>* ∈ *C*(*X*) defined by δ*x*(*y*) = δ*xy* as usual.

Proposition A.32. *If X is finite, the map* ω ↦ ρ *from C*(*X*)<sup>∗</sup> *to L*<sup>1</sup>(*X*)*, defined by* (A.111)*, has inverse* (A.110) *and gives an isometric isomorphism*

$$C(X)^\* \cong L^1(X) \tag{A.112}$$

*of Banach spaces; in particular, one has*

$$\|\omega\| = \|\rho\|\_{1}.\tag{A.113}$$

*Proof.* The vector space isomorphism in question can be checked effortlessly. To verify (A.113), note that trivially |ω(*f*)| ≤ ‖ρ‖<sub>1</sub>‖*f*‖<sub>∞</sub>, whence ‖ω‖ ≤ ‖ρ‖<sub>1</sub>. To show saturation of this bound, given ρ ∈ *L*<sup>1</sup>(*X*) take *f*(*x*) = |ρ(*x*)|/ρ(*x*) if ρ(*x*) ≠ 0 and *f*(*x*) = 0 elsewhere; if ρ ≠ 0 this gives ‖*f*‖<sub>∞</sub> = 1 and |ω(*f*)| = ‖ρ‖<sub>1</sub>. □
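The saturation argument in this proof can be traced on a three-point set. In the sketch below functions on *X* are represented as Python dictionaries; the function names and the sample ρ are illustrative assumptions.

```python
def l1_norm(rho):
    """The norm (A.109): sum of |rho(x)| over the finite set X."""
    return sum(abs(value) for value in rho.values())


def sup_norm(f):
    """The supremum norm (A.107) on a finite set X."""
    return max(abs(value) for value in f.values())


def pairing(rho, f):
    """The linear map (A.110): omega(f) = sum over x of rho(x) f(x)."""
    return sum(rho[x] * f[x] for x in rho)


def saturating_function(rho):
    """f(x) = |rho(x)| / rho(x) where rho(x) != 0, and f(x) = 0 elsewhere."""
    return {x: (abs(value) / value if value != 0 else 0)
            for x, value in rho.items()}


rho = {"x1": 3 - 4j, "x2": -2.0, "x3": 0.0}   # hypothetical element of L^1(X)
f = saturating_function(rho)

bound = l1_norm(rho) * sup_norm(f)   # right-hand side of |omega(f)| <= ||rho||_1 ||f||_inf
attained = abs(pairing(rho, f))      # left-hand side for the saturating f
```

Here `sup_norm(f)` equals 1 and `attained` equals `l1_norm(rho)` up to rounding, so the bound is reached.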

## Notes

The material in this appendix has been collected from numerous functional analysis books (some of which are mentioned in the Notes to the next appendix), adapted to the finite-dimensional case. Though not used in preparing this text, Halmos (1958, 1970) are classics. Theorem A.3 is due to Jordan & von Neumann (1935); Amir (1986) contains many other characterizations of inner product spaces.

## Appendix B Basic functional analysis

This appendix contains all technical information on general Hilbert spaces (as opposed to the finite-dimensional ones of the previous appendix) and, more generally, infinite-dimensional Banach spaces, that is either directly needed in the main text, or forms necessary preparation for the next appendix on operator algebras (which in turn play a central role in this book). Since most interesting examples of both Hilbert spaces and more general Banach spaces require some measure theory, which at the same time provides the mathematical foundation of probability theory, we include a brief introductory overview to this area as well (restricted, though, to the case we need, viz. measures and integrals on locally compact spaces).

Functional analysis has its roots in both mathematics and physics. In particular, the general area of *spectral theory*, which emerged during the period 1900-1930 in the hands of Hilbert and his school, largely owes its existence to mathematical physics, as well as to Hilbert's genius in finding the right combination of examples and abstract theory (including his innovative definition of the spectrum). Hilbert's school culminated in the books *Methoden der mathematischen Physik* by Courant and Hilbert (1924), *Gruppentheorie und Quantenmechanik* by Weyl (1928), and *Mathematische Grundlagen der Quantenmechanik* by von Neumann (1932), all of whom were at Göttingen at the time (as were such giants in the history of quantum mechanics as Born, Heisenberg, and Jordan). Whereas Courant & Hilbert at least *thought* they described classical physics (although it soon turned out that their discussion of eigenvalue problems paved the way for the Schrödinger equation discovered two years later), von Neumann explicitly developed the Hilbert space formalism in order to describe quantum physics (for example, the modern abstract definition of a Hilbert space was his), as did Weyl (in connection with group theory).

What seems to have come from pure mathematics, though, is the idea, central to functional analysis, of looking at functions as points in some (infinite-dimensional) vector space. This emerged from the French school of Hadamard and his student Fréchet, requiring considerable interaction between the (then) new fields of linear algebra and topology. Eventually, this also led to the fundamental work of Banach.

We hope that the combination of logical setup, examples, theorems, and proofs in this appendix helps convince the reader of the sober elegance of functional analysis.

#### B.1 Completeness

A notable difference between finite-dimensional vector spaces with norm and infinite-dimensional ones is that the former are always *complete* in a sense to be defined now, whereas the latter may or may not be. This distinction has major consequences, especially where idealizations (and hence limits) are concerned.

As before, all vector spaces are defined over C (unless stated otherwise).

Definition B.1. *Let V be a vector space (or, more generally, a set). A* metric *on V is a function d* : *V* × *V* → R<sup>+</sup> *satisfying, for all f*,*g*,*h* ∈ *V:*

1. *d*(*f*,*g*) ≤ *d*(*f*,*h*) + *d*(*h*,*g*) *(*triangle inequality*);*
2. *d*(*f*,*g*) = *d*(*g*,*f*) *(*symmetry*);*
3. *d*(*f*,*g*) = 0 *iff f* = *g (*positive definiteness*).*

Our main example is a vector space *V* with norm ‖·‖, which, as an easy exercise shows, gives rise to a metric on *V* via

$$d(f, g) = \|f - g\|. \tag{B.1}$$

In particular, an inner product on *V* induces a metric on *V* through (A.2) and (B.1).

The reader should have some experience with metric spaces from an undergraduate Analysis course, but for convenience we repeat the definition of completeness.


A convergent sequence is Cauchy: from the triangle inequality and symmetry one has *d*(*v*<sub>*n*</sub>, *v*<sub>*m*</sub>) ≤ *d*(*v*<sub>*n*</sub>, *v*) + *d*(*v*<sub>*m*</sub>, *v*), so for given ε > 0 there is *N* ∈ N such that *d*(*v*<sub>*n*</sub>, *v*) < ε/2 for all *n* > *N*, et cetera. However, the converse statement does not hold in general: for example, take the vector space ℓ<sub>*c*</sub>(N) of all functions *f* : N → C that are zero except at finitely many places (with the obvious pointwise operations), or, equivalently, the vector space C<sup>∞</sup> of all sequences (*x*<sub>*n*</sub>) with finitely many nonzero entries. This vector space is incomplete in any conceivable norm, like the sup-norm

$$\|f\|\_{\infty} = \sup\{|f(x)|, x \in \mathbb{N}\}.\tag{B.2}$$

Indeed, the sequence (*f*<sub>*n*</sub>), where *f*<sub>*n*</sub>(*x*) = 1/*x* for *x* = 1,...,*n* and *f*<sub>*n*</sub>(*x*) = 0 for *x* > *n*, which corresponds to the sequence (1,1/2,1/3,...,1/*n*,0,0,...) in C<sup>∞</sup>, is Cauchy, but its obvious limit *f*(*x*) = 1/*x* for *each x* ∈ N, or *x*<sub>*n*</sub> = 1/*n*, does not lie in ℓ<sub>*c*</sub>(N).
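The Cauchy property of this sequence can be made quantitative: for *n* < *m* the sup-norm distance between the two truncations is the largest term that was cut off, namely 1/(*n* + 1). The following sketch verifies this with exact rational arithmetic; the names `truncation` and `sup_distance` are illustrative assumptions.

```python
from fractions import Fraction


def truncation(n):
    """The finitely supported function f_n: x -> 1/x for x = 1..n, zero elsewhere."""
    return {x: Fraction(1, x) for x in range(1, n + 1)}


def sup_distance(g, h):
    """Sup-norm distance between two finitely supported functions stored as dicts."""
    support = set(g) | set(h)
    return max((abs(g.get(x, 0) - h.get(x, 0)) for x in support),
               default=Fraction(0))


# Distances between f_n and f_2n shrink like 1/(n + 1).
distances = {n: sup_distance(truncation(n), truncation(2 * n))
             for n in (10, 100, 1000)}
```

Every `truncation(n)` has finite support, whereas the pointwise limit 1/*x* is nonzero everywhere, which is why the limit falls outside the space.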

Definition B.3. • *A* Banach space *is a vector space with norm that is complete in the associated metric* (B.1)*.*

• *A* Hilbert space *is a vector space with inner product that is complete in the associated metric* (B.1)*, in which the norm is defined by* (A.2)*. Equivalently, a Hilbert space is a Banach space whose norm comes from an inner product via* (A.2)*.*

As we have seen, ℓ<sub>*c*</sub>(N) fails to be a Banach space in the sup-norm, but the larger space ℓ<sup>∞</sup>(N), which consists of all bounded functions *f* : N → C, is (see §B.2); the completion of ℓ<sub>*c*</sub>(N) itself consists of those functions in ℓ<sup>∞</sup>(N) that vanish at infinity.

Definition B.4. *Two norms* ‖·‖ *and* ‖·‖′ *on the same vector space V are* equivalent *if there are constants M* > 0 *and m* > 0 *such that for any v* ∈ *V,*

$$m\|v\|' \le \|v\| \le M\|v\|'. \tag{B.3}$$

In that case, the two metric topologies on *V* defined by these norms coincide, so that in particular completeness and convergence in ‖·‖ and ‖·‖′ are the same.

Proposition B.5. *Let V be a* finite-dimensional *vector space. All norms on V are equivalent, and hence V is complete in any norm.*

*Proof.* We derive this from a basic fact of Analysis, namely that C<sup>*n*</sup> is complete in the (Euclidean) norm ‖·‖<sub>2</sub> derived from the standard inner product (A.11), that is,

$$||z||\_2^2 = \sum\_{i=1}^n |z\_i|^2. \tag{B.4}$$

So the first step is to transfer the problem from *V* to C<sup>*n*</sup>, where *n* = dim(*V*), by choosing a basis (υ<sub>*i*</sub>) of *V*, and mapping υ<sub>*i*</sub> to the standard basis vector *u*<sub>*i*</sub> of C<sup>*n*</sup>. Linear extension then maps *v* = ∑<sub>*i*</sub> *z*<sub>*i*</sub>υ<sub>*i*</sub> ∈ *V* to *z* = (*z*<sub>1</sub>,...,*z*<sub>*n*</sub>) ∈ C<sup>*n*</sup>, which gives an isomorphism *V* → C<sup>*n*</sup>. This map endows C<sup>*n*</sup> with a new norm ‖*z*‖ = ‖*v*‖ (i.e. the given norm on *V*), which we now prove to be equivalent to ‖·‖<sub>2</sub> ≡ ‖·‖′. The second inequality in (B.3) easily follows from Cauchy–Schwarz, viz.

$$\|z\| = \|\sum\_{i} z\_i u\_i\| \le \sum\_{i} |z\_i| \|u\_i\| \le \sqrt{\sum\_{i} \|u\_i\|^2} \sqrt{\sum\_{j} |z\_j|^2} \equiv M \|z\|\_2.$$

This inequality, together with the elementary but extremely useful estimate

$$|\|v\|-\|w\|| \le \|v-w\|,\tag{B.5}$$

which is valid for any norm in any dimension, implies that the function ‖·‖ : C<sup>*n*</sup> → R is continuous with respect to the Euclidean metric on C<sup>*n*</sup>. Now the unit sphere C<sup>*n*</sup><sub>1</sub> = {*x* ∈ C<sup>*n*</sup> | ‖*x*‖<sub>2</sub> = 1} in C<sup>*n*</sup> is compact, so according to Weierstrass, the norm ‖·‖ assumes a minimum on C<sup>*n*</sup><sub>1</sub>. Hence there exists μ ∈ C<sup>*n*</sup><sub>1</sub> such that ‖μ‖ ≤ ‖*z*‖ for all *z* ∈ C<sup>*n*</sup><sub>1</sub>, and ‖μ‖ > 0 because μ ≠ 0. For arbitrary nonzero *z* ∈ C<sup>*n*</sup>, the rescaled vector *z*′ = *z*/‖*z*‖<sub>2</sub> lies in C<sup>*n*</sup><sub>1</sub>, so ‖μ‖ ≤ ‖*z*′‖, which is nothing but the first inequality in (B.3) with *m* = ‖μ‖. □
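To see concrete constants in (B.3), one may take for ‖·‖ the norm ‖*z*‖<sub>1</sub> = ∑<sub>*i*</sub> |*z*<sub>*i*</sub>| on C<sup>*n*</sup> and for ‖·‖′ the Euclidean norm. The Cauchy–Schwarz step of the proof then gives *M* = √*n*, and *m* = 1 works as well. The following sketch samples random vectors to illustrate these bounds; the function names, the sample size and the seed are illustrative assumptions, and a random check is of course no substitute for the proof.

```python
import math
import random


def norm1(z):
    """The norm ||z||_1: sum of the moduli of the entries."""
    return sum(abs(c) for c in z)


def norm2(z):
    """The Euclidean norm (B.4)."""
    return math.sqrt(sum(abs(c) ** 2 for c in z))


def ratio_range(n, trials=1000, seed=0):
    """Smallest and largest observed ratio ||z||_1 / ||z||_2 over random z in C^n."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        z = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
        ratios.append(norm1(z) / norm2(z))
    return min(ratios), max(ratios)


lowest, highest = ratio_range(5)
# Expected from (B.3) with m = 1 and M = sqrt(5): 1 <= ratio <= sqrt(5).
```

The extreme ratios are attained at a standard basis vector (ratio 1) and at a vector whose entries all have the same modulus (ratio √*n*).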

#### B.2 ℓ<sup>*p*</sup> spaces

The simplest examples of infinite-dimensional Banach spaces are the ℓ<sup>*p*</sup>-spaces, where 1 ≤ *p* ≤ ∞ (for *p* < 1 the Minkowski inequality (B.14) below goes in the wrong direction, so that, by failure of the triangle inequality, eq. (B.8) below fails to define a norm). Such spaces are defined on some set *X*, hence we write ℓ<sup>*p*</sup>(*X*).

If *X* = {*x*<sub>1</sub>,...,*x*<sub>*n*</sub>} is finite, with cardinality *n* = |*X*|, then ℓ<sup>*p*</sup>(*X*) consists of all functions *f* : *X* → C with pointwise operations, so that ℓ<sup>*p*</sup>(*X*) ≅ C<sup>*n*</sup> as vector spaces through the map *f* ↦ (*f*(*x*<sub>1</sub>),...,*f*(*x*<sub>*n*</sub>)), where C<sup>*n*</sup> is equipped with a specific (and, for *p* ≠ 2, unusual) norm. However, by Proposition B.5 we may as well take *p* = 2 and nothing has been gained compared with the linear algebra of Appendix A.

Therefore, life starts with infinite sets *X*, and we begin with the simplest of those, viz. *X* = N (but to avoid unnecessary duplication with regard to later generalization, although for the moment we assume *X* = N, we still write *X* for the underlying set). We define ℓ<sup>*p*</sup> ≡ ℓ<sup>*p*</sup>(*X*) as the set of functions *f* : *X* → C that satisfy

$$\sum\_{x\in X} |f(x)|^p < \infty \quad (1 \le p < \infty);\tag{B.6}$$

$$\sup\_{x \in X} |f(x)| < \infty \quad (p = \infty). \tag{B.7}$$

As will be shown in far greater generality (cf. Theorem B.9), the point is that for any 1 ≤ *p* ≤ ∞, the set ℓ<sup>*p*</sup>(*X*) thus defined is not merely a vector space (under pointwise operations); it is even a Banach space in the norm

$$\|f\|\_{p} = \left(\sum\_{x\in X} |f(x)|^{p}\right)^{1/p} \quad (1 \le p < \infty);\tag{B.8}$$

$$\|f\|\_{\infty} = \sup\{|f(x)|, x \in X\} = \inf\{C > 0 \mid |f(x)| \le C \,\forall x \in X\}.\tag{B.9}$$

The case *p* = 2 is unique in that ℓ<sup>2</sup>(*X*) is also a Hilbert space in the inner product

$$
\langle f, \mathbf{g} \rangle = \sum\_{\mathbf{x} \in X} \overline{f(\mathbf{x})} \mathbf{g}(\mathbf{x}).\tag{\mathbf{B}.10}
$$
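The norms (B.8) - (B.9) and the membership conditions (B.6) - (B.7) can be explored numerically. The sketch below evaluates the norms on a finitely supported sequence and then looks at partial sums for *f*(*x*) = 1/*x* on *X* = N, which suggest (without proving it) that this function lies in ℓ<sup>2</sup>(N) but not in ℓ<sup>1</sup>(N). The names `p_norm` and `partial_sum` are illustrative assumptions.

```python
import math


def p_norm(values, p):
    """The norm (B.8) for 1 <= p < infinity, or (B.9) for p = math.inf,
    evaluated on a finite list of values (a finitely supported function)."""
    if p == math.inf:
        return max(abs(v) for v in values)
    return sum(abs(v) ** p for v in values) ** (1 / p)


def partial_sum(p, n):
    """Sum of |1/x|^p over x = 1..n, a finite sum as in (B.6) for f(x) = 1/x."""
    return sum((1 / x) ** p for x in range(1, n + 1))


sample = [3, -4j]
norms = {p: p_norm(sample, p) for p in (1, 2, math.inf)}

cutoffs = (10, 1_000, 100_000)
square_sums = [partial_sum(2, n) for n in cutoffs]    # stay below pi^2 / 6
harmonic_sums = [partial_sum(1, n) for n in cutoffs]  # keep growing, like log n
```

The partial sums for *p* = 2 remain bounded by π²/6, whereas those for *p* = 1 grow without bound as the cutoff increases.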

As we now outline, these expressions may be generalized to any set, to which end we should define the meaning of (possibly uncountable) sums ∑<sub>*x*∈*X*</sub>. Although the generality below will only be used in §B.12, it is convenient (at little extra cost) to cover more general codomains for *f* than just the complex numbers C.

Definition B.6. *Let X be a set, V a normed vector space, f* : *X* → *V some function, and v* ∈ *V. The sentence* ∑<sub>*x*∈*X*</sub> *f*(*x*) = *v means that for each* ε > 0 *there is a finite subset F* ⊂ *X such that for each finite subset G* ⊂ *X with F* ⊆ *G, we have*

$$\|\sum\_{x\in G} f(x) - v\| < \varepsilon.$$

In terms of nets, this means that the net *s* = (*s*<sub>*F*</sub>)<sub>*F*∈P<sub>*f*</sub>(*X*)</sub> in *V* indexed by finite subsets *F* ⊂ *X* (ordered by inclusion), where *s*<sub>*F*</sub> = ∑<sub>*x*∈*F*</sub> *f*(*x*), converges to *v*.

For *X* = N and *V* = C we may take *F* to be {1,...,*N*} and *G* to be {1,...,*n*}, where *n* ≥ *N*, in which case we recover the usual notion of convergence of sums (i.e. ∀ε > 0 ∃*N* ∈ N ∀*n* ≥ *N* : |∑<sup>*n*</sup><sub>*x*=1</sub> *f*(*x*) − *v*| < ε). However, since also more general *F* and *G* are allowed, Definition B.6 is in fact equivalent to absolute convergence:

Lemma B.7. *Let X be a set and let f* : *X* → C *be some function.*

1. *There exists z* ∈ C *such that* ∑<sub>*x*∈*X*</sub> *f*(*x*) = *z iff* ∑<sub>*x*∈*X*</sub> |*f*(*x*)| < ∞*.*
2. *If f*(*x*) ≥ 0 *for each x* ∈ *X, then, in the sense of Definition B.6,*

$$\sum\_{x\in X}f(x) = \sup\left\{\sum\_{x\in F}f(x), F \subset X \text{ finite} \right\},\tag{B.11}$$

*which is true even if the supremum on the right-hand side is infinite (in which case the left-hand side simply does not converge).*

Therefore, for *f* : *X* → C, one may use (B.11) to check if ∑<sub>*x*∈*X*</sub> |*f*(*x*)| < ∞, in which case it makes sense to try and find the value *v* of ∑<sub>*x*∈*X*</sub> *f*(*x*) as in Definition B.6.

*Proof.* 1. We write *f* = *f*<sub>1</sub> + *i f*<sub>2</sub>, with *f*<sub>*i*</sub> : *X* → R, and for given *G* ⊂ *X*, write *G*<sub>*i*±</sub> = {*x* ∈ *G* | ±*f*<sub>*i*</sub>(*x*) ≥ 0} (the ambiguity at those *x* where *f*<sub>*i*</sub>(*x*) = 0 is irrelevant). Then

$$\begin{split} |\sum\_{x \in G} f(x)| &\leq \sum\_{x \in G} |f(x)| \leq \sum\_{x \in G} |f\_1(x)| + \sum\_{x \in G} |f\_2(x)| \\ &= \sum\_{x \in G\_{1+}} f\_1(x) - \sum\_{x \in G\_{1-}} f\_1(x) + \sum\_{x \in G\_{2+}} f\_2(x) - \sum\_{x \in G\_{2-}} f\_2(x) \\ &\leq 4 \sup \left\{ |\sum\_{x \in G\_{\alpha}} f(x)|, \alpha \in \{1+, 1-, 2+, 2-\} \right\}. \end{split} \tag{B.12}$$

Using Proposition B.8 below, the first inequality in (B.12) shows that absolute convergence implies convergence in the sense of Cauchy, whereas the last inequality (i.e., ∑<sub>*x*∈*G*</sub> |*f*(*x*)| ≤ 4 sup···) shows the converse.

2. We pick ε > 0 and abbreviate the right-hand side of (B.11) as σ. By definition of the supremum (which we assume finite) there is a finite *F* ⊂ *X* for which σ ≥ ∑<sub>*x*∈*F*</sub> *f*(*x*) > σ − ε. Since the terms are nonnegative, for any finite *G* ⊇ *F* we have ∑<sub>*x*∈*G*</sub> *f*(*x*) ≥ ∑<sub>*x*∈*F*</sub> *f*(*x*) and hence also σ ≥ ∑<sub>*x*∈*G*</sub> *f*(*x*) > σ − ε, from which |∑<sub>*x*∈*G*</sub> *f*(*x*) − σ| < ε. Hence ∑<sub>*x*∈*X*</sub> *f*(*x*) = σ by Definition B.6.

The same argument works if σ = ∞, in which case for any 0 < *M* < ∞ there is a finite *F* ⊂ *X* for which ∑<sub>*x*∈*F*</sub> *f*(*x*) > *M*, and hence certainly ∑<sub>*x*∈*G*</sub> *f*(*x*) > *M*. □

Leaving its proof to the reader, we state the Cauchy condition for convergence:

Proposition B.8. *We have* ∑<sub>*x*∈*X*</sub> *f*(*x*) = *v for some (necessarily unique) v* ∈ *V, in the sense of Definition B.6, iff for each* ε > 0 *there is a finite subset F* ⊂ *X such that for each finite subset G* ⊂ *X* ∖ *F we have* ‖∑<sub>*x*∈*G*</sub> *f*(*x*)‖ < ε*.*

For an uncountable set *X*, Definition B.6 is not as bad as it may sound, since whenever ∑<sub>*x*∈*X*</sub> |*f*(*x*)| < ∞, only a *countable* number of terms can be nonzero (proof by contradiction: if not, there must be an *n* ∈ N for which infinitely many *x* satisfy |*f*(*x*)| > 1/*n* (nested proof by contradiction: if not, then for all *n*, only finitely many *x* satisfy |*f*(*x*)| > 1/*n*, and hence, a countable union of finite sets remaining countable, only a countable number of *x* can have *f*(*x*) ≠ 0), so the sum of |*f*(*x*)| over those *x* alone already diverges). In particular, for *X* = N the sum in (B.6) has its usual meaning. However, even for *X* = N, the sums just defined *only* have their usual meaning if the series in question is absolutely convergent (the standard counterexample of a real series ∑<sub>*n*</sub> *x*<sub>*n*</sub> that is convergent but not absolutely convergent is given by *x*<sub>*n*</sub> = (−1)<sup>*n*</sup>/*n*; in the above light, taking *G* = *F* ∪ *E*, where *E* is a large but finite set of even numbers, then makes |∑<sub>*n*∈*G*</sub> *x*<sub>*n*</sub> − *x*| as big as you do not like).
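The counterexample at the end of this paragraph is easy to reproduce. The ordered series with terms (−1)<sup>*n*</sup>/*n* converges to −ln 2, but adding a finite block of even indices to an initial segment *F* pushes the finite sum far away from that value. The sketch below uses *F* = {1,...,100} and the even numbers up to 100 000 for *E*; these choices and all names are illustrative assumptions.

```python
import math


def term(n):
    """The n-th term x_n = (-1)^n / n of the alternating harmonic series."""
    return (-1) ** n / n


def sum_over(indices):
    """Finite sum of x_n over a finite set of indices."""
    return sum(term(n) for n in indices)


limit = -math.log(2)                    # value of the ordered series
N = 100
F = set(range(1, N + 1))                # initial segment {1, ..., N}
E = {n for n in range(N + 1, 100_001) if n % 2 == 0}   # finite set of even numbers

deviation_ordered = abs(sum_over(F) - limit)       # small, of order 1 / (2N)
deviation_enlarged = abs(sum_over(F | E) - limit)  # large, grows with the size of E
```

Enlarging *E* further makes `deviation_enlarged` exceed any prescribed bound, so no finite *F* can satisfy the condition in Definition B.6.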

Using the triangle inequality for the norm and the Cauchy criterion for convergence, it is easy to show that if *V* is a Banach space and ∑*x*∈*<sup>X</sup> f*(*x*) < ∞, then the sum ∑*x*∈*<sup>X</sup> f*(*x*) exists in *V* (i.e., it equals some *v* ∈ *V* in the sense of Definition B.6). The implication is one-sided, though: the latter sum may exist even if the former does not. For example, take *V* = -<sup>2</sup>(N), pick some ˜*<sup>f</sup>* <sup>∈</sup> -<sup>2</sup>(N), and define *f* : N → -<sup>2</sup>(N) by *<sup>f</sup>*(*x*) = ˜*f*(*x*)δ*x*, where <sup>δ</sup>*x*(*y*) = <sup>δ</sup>*xy* (and hence δ*x*<sup>2</sup> <sup>=</sup> 1). Then

$$\sum\_{\mathbf{x}\in\mathbb{N}}||f(\mathbf{x})||\_2 = \sum\_{\mathbf{x}\in\mathbb{N}}|\tilde{f}(\mathbf{x})| = ||\tilde{f}||\_1.$$

Now <sup>∑</sup>*x*∈<sup>N</sup> *<sup>f</sup>*(*x*) = ˜*<sup>f</sup>* exists *per* assumption that ˜*<sup>f</sup>* <sup>∈</sup> -<sup>2</sup>(N) and hence ˜*<sup>f</sup>* <sup>2</sup> <sup>&</sup>lt; <sup>∞</sup>, which is implied by, but is not equivalent to ˜*<sup>f</sup>* <sup>1</sup> <sup>&</sup>lt; <sup>∞</sup>. See also §B.12 below.

In any case, the meaning of the possibly uncountable sums in (B.6) and (B.8) should be clear now, as only finite sums (B.11) are involved; for (B.10), by Hölder's inequality (B.15) below for *p* = *q* = 2, the sum in question is absolutely convergent, and hence it falls within the scope of Definition B.6 and Lemma B.7.

Theorem B.9. *For any* 1 ≤ *p* ≤ ∞*, the set* ℓ<sup>*p*</sup>(*X*) *is a vector space under pointwise operations. Moreover,* ℓ<sup>*p*</sup>(*X*) *is a Banach space in the norm* (B.8) *-* (B.9)*.*

*Proof.* 1. ℓ<sup>*p*</sup> *is a vector space.* The case *p* = ∞ is obvious. For 1 ≤ *p* < ∞, use the convexity of the function *t* ↦ *t*<sup>*p*</sup> for *t* ∈ [0,∞). For convex functions one has *f*(½(*t*<sub>1</sub> + *t*<sub>2</sub>)) ≤ ½(*f*(*t*<sub>1</sub>) + *f*(*t*<sub>2</sub>)), so that (½(*t*<sub>1</sub> + *t*<sub>2</sub>))<sup>*p*</sup> ≤ ½(*t*<sub>1</sub><sup>*p*</sup> + *t*<sub>2</sub><sup>*p*</sup>). Combined with monotonicity of the function *t* ↦ *t*<sup>*p*</sup> on [0,∞), i.e. *s* ≤ *t* ⇒ *s*<sup>*p*</sup> ≤ *t*<sup>*p*</sup>, this gives

$$|f(\mathbf{x}) + \mathbf{g}(\mathbf{x})|^p \le (|f(\mathbf{x})| + |\mathbf{g}(\mathbf{x})|)^p \le 2^{p-1}(|f(\mathbf{x})|^p + |\mathbf{g}(\mathbf{x})|^p),\tag{\mathbf{B.13}}$$

so that summing over *x* gives ‖*f* + *g*‖<sub>*p*</sub><sup>*p*</sup> ≤ 2<sup>*p*−1</sup>(‖*f*‖<sub>*p*</sub><sup>*p*</sup> + ‖*g*‖<sub>*p*</sub><sup>*p*</sup>) < ∞. Hence if *f* ∈ ℓ<sup>*p*</sup> and *g* ∈ ℓ<sup>*p*</sup>, then *f* + *g* ∈ ℓ<sup>*p*</sup>.

2. ‖·‖<sub>*p*</sub> *is a norm on* ℓ<sup>*p*</sup>. The case *p* = ∞ is, once again, obvious. For 1 ≤ *p* < ∞, the only nontrivial part is the triangle inequality

$$\|f+g\|\_{p} \le \|f\|\_{p} + \|g\|\_{p},\tag{B.14}$$

called the *Minkowski inequality*. This follows from *Hölder's inequality*:

$$\|fg\|\_1 \le \|f\|\_p \|g\|\_q,\tag{B.15}$$

which is valid for *f* ∈ ℓ<sup>*p*</sup> and *g* ∈ ℓ<sup>*q*</sup>, where 1 ≤ *p* ≤ ∞ and 1 ≤ *q* ≤ ∞ satisfy

$$\frac{1}{p} + \frac{1}{q} = 1.\tag{\text{B.16}}$$

Thus one has *q* = *p*/(*p* − 1) for 1 < *p* < ∞, or *q* = ∞ for *p* = 1, or *q* = 1 for *p* = ∞. One calls *p* and *q conjugate exponents* (so that *p* = 2 is self-conjugate).

	- a. *Find a candidate f for the limit*. Since (*f<sub>k</sub>*) is Cauchy, for each ε > 0 there exists *K* ∈ N such that ‖*f<sub>k</sub>* − *f<sub>l</sub>*‖<sub>*p*</sub> < ε for all *k*, *l* > *K*, or

$$\|f\_k - f\_l\|\_p^p = \sum\_{x \in X} |f\_k(x) - f\_l(x)|^p < \varepsilon^p. \tag{B.17}$$

Hence |*f<sub>k</sub>*(*x*) − *f<sub>l</sub>*(*x*)|<sup>*p*</sup> < ε<sup>*p*</sup> for all *x*, so (*f<sub>k</sub>*(*x*))<sub>*k*</sub> is a Cauchy sequence in C. Since C is complete, (*f<sub>k</sub>*(*x*))<sub>*k*</sub> converges, hence we may define *f* : *X* → C by

$$f(x) = \lim\_{k \to \infty} f\_k(x). \tag{B.18}$$

b. *Show that f* ∈ ℓ<sup>*p*</sup>. Note that

$$||\mathbf{g}||\_p^p = \sup\_{F \subset X} \sum\_{\mathbf{x} \in F} |\mathbf{g}(\mathbf{x})|^p,\tag{\mathbf{B}.19}$$

where the supremum is over all finite subsets *F* ⊂ *X*. For fixed *F* we have

$$\sum\_{x\in F} |f\_k(x) - f\_l(x)|^p < \varepsilon^p.$$

Since the sum is finite, we may take lim<sub>*k*→∞</sub>, giving ∑<sub>*x*∈*F*</sub> |*f*(*x*) − *f<sub>l</sub>*(*x*)|<sup>*p*</sup> < ε<sup>*p*</sup>. By (B.19), the sup over all finite *F* yields: ∀ε > 0 ∃*K* ∈ N such that ∀*l* > *K*, we have ‖*f* − *f<sub>l</sub>*‖<sub>*p*</sub><sup>*p*</sup> < ε<sup>*p*</sup>. For fixed ε and *l*, this says that *f* − *f<sub>l</sub>* ∈ ℓ<sup>*p*</sup>, so *f* ∈ ℓ<sup>*p*</sup>, because *f* = (*f* − *f<sub>l</sub>*) + *f<sub>l</sub>* with *f<sub>l</sub>* ∈ ℓ<sup>*p*</sup>, and we know that ℓ<sup>*p*</sup> is a vector space.

c. *Show that f<sub>k</sub>* → *f in* ℓ<sup>*p*</sup>. This is contained in the previous step, since we had

$$\forall \varepsilon > 0 \, \exists K \in \mathbb{N} \, \forall\_{l > K} : \|f - f\_l\|\_p < \varepsilon.\tag{B.20}$$

But this is the same as lim<sub>*l*→∞</sub> ‖*f* − *f<sub>l</sub>*‖<sub>*p*</sub> = 0, or *f<sub>l</sub>* → *f* in ℓ<sup>*p*</sup>.

The proof for *p* = ∞ is virtually the same, with (B.19) replaced by

$$\|g\|\_{\infty} = \sup\_{F \subset X} \sup\_{x \in F} \{|g(x)|\}. \tag{B.21}$$

Within the finite supremum sup<sub>*x*∈*F*</sub> |*f<sub>k</sub>*(*x*) − *f<sub>l</sub>*(*x*)| < ε, we may take the limit *k* → ∞ once again, followed by a supremum over *F* ⊂ *X*. □
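The Hölder inequality (B.15) and the Minkowski inequality (B.14) used in the proof can be spot-checked numerically on finite sequences. The sketch below is only a sanity check under assumed random test data, not a proof; all function names are hypothetical.

```python
import math
import random

INF = math.inf

def p_norm(f, p):
    """l^p norm of a finite sequence f of complex numbers (sup-norm for p = inf)."""
    if p == INF:
        return max(abs(v) for v in f)
    return sum(abs(v) ** p for v in f) ** (1.0 / p)

def conjugate(p):
    """Conjugate exponent q with 1/p + 1/q = 1, cf. (B.16)."""
    if p == 1:
        return INF
    if p == INF:
        return 1.0
    return p / (p - 1)

def holder_slack(f, g, p):
    """||f||_p ||g||_q - ||fg||_1, which (B.15) asserts is nonnegative."""
    fg = [a * b for a, b in zip(f, g)]
    return p_norm(f, p) * p_norm(g, conjugate(p)) - p_norm(fg, 1)

def minkowski_slack(f, g, p):
    """||f||_p + ||g||_p - ||f+g||_p, which (B.14) asserts is nonnegative."""
    total = [a + b for a, b in zip(f, g)]
    return p_norm(f, p) + p_norm(g, p) - p_norm(total, p)

random.seed(0)  # fixed seed so the illustration is reproducible
f = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(50)]
g = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(50)]

for p in (1, 1.5, 2, 3, INF):
    print(p, conjugate(p), holder_slack(f, g, p), minkowski_slack(f, g, p))
```

Both slack values should come out nonnegative up to floating-point rounding for every exponent tried.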

#### B.3 Banach spaces of continuous functions

Further Banach spaces that can be defined without measure theory come from topology, notably from the class of *locally compact* spaces *X* (like N, or R*n*, etc.).

For any *f* : *X* → C, define the *support* of *f* as the closure of the set where *f* ≠ 0.

Definition B.10. *Let X be a locally compact space. Then:*


*In general, one has the obvious inclusions*

$$\mathcal{C}\_c(X) \subseteq \mathcal{C}\_0(X) \subseteq \mathcal{C}\_b(X) \subseteq \mathcal{C}(X),\tag{B.22}$$

*with strict inclusions iff X is non-compact, and equalities iff X is compact.*

For example, if *X* = R, then *f*(*x*) = exp(−*x*<sup>2</sup>) lies in *C*<sub>0</sub>, whereas *f*(*x*) = 1 is in *C<sub>b</sub>*. If *X* is discrete, the spaces ℓ<sub>*c*</sub>(*X*) and ℓ<sup>∞</sup>(*X*) of the previous section are the same as *C<sub>c</sub>*(*X*) and *C<sub>b</sub>*(*X*), respectively, and we may also write ℓ<sub>0</sub>(*X*) ≡ *C*<sub>0</sub>(*X*).

Theorem B.11. *The sets Cc*(*X*)*, C*0(*X*)*, Cb*(*X*)*, and C*(*X*) *are vector spaces under pointwise operations, and C*0(*X*) *and Cb*(*X*) *are Banach spaces in the* sup-norm

$$\|f\|\_{\infty} = \sup\_{x \in X} \{|f(x)|\}. \tag{B.23}$$

In particular, if *X* is compact, then *C*(*X*) is a Banach space in the norm (1.24)

*Proof.* Only completeness in the sup-norm (B.23) is nontrivial. We use the fact from elementary analysis that sup-norm (i.e., uniform) limits *f* of Cauchy sequences (*f<sub>n</sub>*) of continuous functions exist (they are given by the pointwise limit *f*(*x*) = lim<sub>*n*</sub> *f<sub>n</sub>*(*x*)) and are continuous. Therefore, concerning *C*<sub>0</sub>(*X*) we just need to show that the limit *f* of some sequence (*f<sub>n</sub>*) in *C*<sub>0</sub>(*X*) vanishes at infinity. Indeed, for given ε > 0, since *f<sub>n</sub>* → *f* uniformly, we can find *N* such that |*f*(*x*) − *f<sub>n</sub>*(*x*)| < ε/2 for all *x* and all *n* > *N*. Fixing one such *n*, since *f<sub>n</sub>* ∈ *C*<sub>0</sub>(*X*) we can also find some compact *K* ⊂ *X* such that |*f<sub>n</sub>*(*x*)| < ε/2 for all *x* ∉ *K*. Hence for *x* ∉ *K*,

$$|f(x)| \le |f(x) - f\_n(x)| + |f\_n(x)| < \varepsilon/2 + \varepsilon/2 = \varepsilon.\tag{B.24}$$

To show that the limit *f* of a sequence (*f<sub>n</sub>*) in *C<sub>b</sub>* is again bounded, note that for ε > 0 we have |*f*(*x*) − *f<sub>n</sub>*(*x*)| < ε for *n* > *N* and |*f<sub>n</sub>*(*x*)| < *C<sub>n</sub>*, both for all *x*, whence

$$|f(x)| \le |f(x) - f\_n(x)| + |f\_n(x)| < \varepsilon + C\_n < \infty,\tag{B.25}$$

so *f* is bounded and hence lies in *C<sub>b</sub>*(*X*). □

#### B.4 Basic measure theory

Measure theory studies *measure spaces* (*X*,Σ,μ), where *X* is a set, and:

• Σ ⊆ P(*X*) is a σ*-algebra* on *X*, i.e., a collection of subsets of *X* such that:

	- 1. *X* ∈ Σ;
	- 2. If *A* ∈ Σ, then *A<sup>c</sup>* ∈ Σ (where *A<sup>c</sup>* ≡ *X* \ *A* is the complement of *A*);
	- 3. If *An* ∈ Σ for *n* ∈ N, then ∪*nAn* ∈ Σ (i.e., Σ *is closed under countable unions*).

It follows that ∅ ∈ Σ, and that Σ is closed under countable *intersections*, too.

• μ : Σ → [0,∞], called a (positive) *measure*, is *countably additive*, i.e.,

$$
\mu\left(\cup\_n A\_n\right) = \sum\_n \mu\left(A\_n\right), \tag{B.26}
$$

whenever *A<sub>n</sub>* ∈ Σ, *n* ∈ N, *A<sub>i</sub>* ∩ *A<sub>j</sub>* = ∅ for all *i* ≠ *j*. The obvious convention here is that *t* + ∞ = ∞ for any *t* ∈ R<sup>+</sup>, as well as ∞ + ∞ = ∞. Countable additivity is indispensable in almost every limit argument in measure theory.

A *probability space* is a measure space (*X*,Σ,μ) for which μ(*X*) = 1. More generally, a measure space is called *finite* if μ(*X*) < ∞, which evidently implies μ(*A*) < ∞ for any *A* ∈ Σ, and σ*-finite* if *X* is a countable union *X* = ∪*nAn* with μ(*An*) < ∞ for each *n*. For example, *X* = R is σ-finite, whilst *X* = [0,1] with Lebesgue measure is finite. The non-σ-finite case is pathological and hardly occurs in practice.

This definition of a σ-algebra marks a difference with a topology on *X*, which is a collection O(*X*) of *open* subsets (containing *X* and the empty set ∅) that is closed under *arbitrary* unions and *finite* intersections (but *not* under complementation!).

Nonetheless, topology and measure theory are closely related:


An important goal of measure theory is to provide a rigorous theory of *integration*; here the key idea (due to Lebesgue) is that in defining the integral of some *measurable* function *f* : *X* → R, one should partition the *range* R rather than the *domain X*, as had been done in the Calculus since Newton (where typically *X* ⊆ R<sup>*n*</sup>). This, in turn, suggests that *f* should first be approximated by *simple* functions.

These are *measurable* functions *s* : *X* → R<sup>+</sup> with finite range, or, equivalently,

$$s = \sum\_{i=1}^{n} \lambda\_i 1\_{A\_i},\tag{B.27}$$

where λ<sub>*i*</sub> ≥ 0, *A<sub>i</sub>* ∈ Σ, and *n* < ∞. Such a representation is unique if we require that the sets *A<sub>i</sub>* are mutually disjoint and the coefficients λ<sub>*i*</sub> are distinct; namely, if {*x*<sub>1</sub>,..., *x<sub>n</sub>*} are the distinct values of *s*, one takes *A<sub>i</sub>* = *s*<sup>−1</sup>(*x<sub>i</sub>*) and λ<sub>*i*</sub> = *x<sub>i</sub>*. Given some measure μ, we further restrict the class of simple functions to those for which μ(*A<sub>i</sub>*) < ∞. One then first defines the integral of a simple function *s*, as in (B.27), by

$$\int\_{X} d\mu \, s = \sum\_{i} \lambda\_{i} \mu \left( A\_{i} \right);\tag{B.28}$$

a nontrivial argument shows that the right-hand side is independent of the particular representation (B.27) of *s* used on the left. Granting this, linearity of the integral on simple functions is immediate. Subsequently, for *positive* measurable functions *f* ≥ 0, writing *s* ≤ *f* iff *s*(*x*) ≤ *f*(*x*) for each *x* ∈ *X*, one defines the integral by

$$\int\_{X} d\mu \, f = \sup \left\{ \int\_{X} d\mu \, s \mid 0 \le s \le f, s \text{ simple} \right\}. \tag{B.29}$$
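The claim that (B.28) does not depend on the chosen representation (B.27) can be illustrated on a small finite measure space. In the sketch below, the four-point space, its weights, and both representations are assumptions made for illustration, and the function names are hypothetical.

```python
from fractions import Fraction

# A finite measure space: X = {0, 1, 2, 3}, all subsets measurable,
# with mu determined by (assumed) point weights.
mu_weights = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}

def mu(A) -> Fraction:
    """Measure of a subset A of X."""
    return sum((mu_weights[x] for x in A), Fraction(0))

def evaluate(rep, x) -> Fraction:
    """Value at x of the simple function s = sum_i lambda_i 1_{A_i}, cf. (B.27)."""
    return sum((lam for lam, A in rep if x in A), Fraction(0))

def integrate_simple(rep) -> Fraction:
    """Integral sum_i lambda_i mu(A_i) of a simple function, cf. (B.28)."""
    return sum((lam * mu(A) for lam, A in rep), Fraction(0))

# Two representations of the same function s with values 1, 3, 3, 2 on 0, 1, 2, 3:
rep_overlapping = [(Fraction(1), {0, 1, 2}), (Fraction(2), {1, 2, 3})]
rep_canonical = [(Fraction(1), {0}), (Fraction(3), {1, 2}), (Fraction(2), {3})]

print(integrate_simple(rep_overlapping), integrate_simple(rep_canonical))
```

Here `rep_canonical` is the unique representation with disjoint sets and distinct coefficients described above, while `rep_overlapping` uses overlapping sets; both yield the same value of the integral.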

For measurable functions *f* : *X* → C, one first decomposes *f* as

$$f = \sum\_{k=0}^{3} i^k f\_k, \ f\_k \ge 0,\tag{B.30}$$

where, writing *f* = Re(*f*) + *i* Im(*f*) ≡ *f*′ + *i f*″, one has *f*<sub>0</sub> ≡ *f*′<sub>+</sub>, *f*<sub>2</sub> ≡ *f*′<sub>−</sub>, *f*<sub>1</sub> ≡ *f*″<sub>+</sub>, and *f*<sub>3</sub> ≡ *f*″<sub>−</sub>, so that *f*<sup>•</sup> = *f*<sup>•</sup><sub>+</sub> − *f*<sup>•</sup><sub>−</sub> for • = ′, ″; one may take *f*<sup>•</sup><sub>±</sub> = ½(|*f*<sup>•</sup>| ± *f*<sup>•</sup>).

On this basis, one then defines the integral by linear extension of (B.29), that is,

$$\int\_{X} d\mu \, f = \sum\_{k=0}^{3} i^{k} \int\_{X} d\mu \, f\_{k}. \tag{B.31}$$

We call *f* *integrable* with respect to μ, writing *f* ∈ ℒ<sup>1</sup>(*X*,Σ,μ), if

$$\int\_{X} d\mu \, |f| < \infty;\tag{B.32}$$

this implies that each positive part *f<sub>k</sub>*, and hence also *f* itself, is integrable, i.e.,

$$\left| \int\_{X} d\mu \, f \right| < \infty. \tag{B.33}$$

However, (B.33) does not imply (B.32). From (B.32) one has the useful estimates

$$\left| \int\_{X} d\mu \, f \right| \le \int\_{X} d\mu \, |f| \le \|f\|\_{\infty}^{\mathrm{ess}} \mu(X),\tag{B.34}$$

where the *essential supremum* of *f* (with respect to μ) is defined by

$$\|f\|\_{\infty}^{\mathrm{ess}} = \inf\{t \in [0,\infty] \mid |f| \le t \text{ } \mu\text{-almost everywhere}\},\tag{B.35}$$

where |*f*| ≤ *t* μ-a.e. means that μ({*x* ∈ *X* | |*f*(*x*)| > *t*}) = 0. In (B.34), the expressions ‖*f*‖<sup>ess</sup><sub>∞</sub> and/or μ(*X*) may well be infinite (in which case the second estimate still holds, of course!). However, if *X* is a locally compact space (see the next section), μ is finite, and *f* ∈ *C*<sub>0</sub>(*X*) or even *f* ∈ *C<sub>b</sub>*(*X*), then all of (B.34) is finite.

Linearity of the integral is far from trivial: the proof relies on linearity for simple functions, as well as on a fundamental approximation lemma:

Lemma B.12. *If f* ≥ 0 *is measurable, there is a monotone increasing sequence of simple functions s<sub>n</sub>, i.e., such that* 0 ≤ *s*<sub>1</sub> ≤ *s*<sub>2</sub> ≤ ··· ≤ *s<sub>n</sub>* ≤ *s*<sub>*n*+1</sub> ≤ ··· ≤ *f pointwise, for which s<sub>n</sub>* → *f pointwise (i.e.,* lim<sub>*n*→∞</sub> *s<sub>n</sub>*(*x*) = *f*(*x*) *for each x* ∈ *X).*
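One standard way to obtain such a sequence (a textbook construction, not necessarily the one the author has in mind) is the dyadic truncation *s<sub>n</sub>*(*x*) = min(*n*, 2<sup>−*n*</sup>⌊2<sup>*n*</sup> *f*(*x*)⌋), which partitions the *range* of *f*. The sketch below checks monotonicity and convergence at sample points; the test function `f` and the sample grid are assumptions.

```python
import math

def dyadic_approx(n: int, value: float) -> float:
    """s_n at a point where f takes the value `value` >= 0:
    round down to a multiple of 2^-n, then cap at n (finitely many values)."""
    return min(float(n), math.floor(2 ** n * value) / 2 ** n)

def f(x: float) -> float:
    """An arbitrary nonnegative test function (assumed for illustration)."""
    return x * x

points = [k / 7 for k in range(30)]   # sample points in [0, 29/7]

for n in (1, 2, 4, 8):
    worst_gap = max(f(x) - dyadic_approx(n, f(x)) for x in points)
    print(n, worst_gap)   # shrinks below 2^-n once n exceeds every sampled f(x)
```

Each `dyadic_approx(n, ·)` takes at most *n*·2<sup>*n*</sup> + 1 values, so composing it with a measurable *f* gives a simple function, and the gap *f*(*x*) − *s<sub>n</sub>*(*x*) is below 2<sup>−*n*</sup> wherever *f*(*x*) ≤ *n*.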

Furthermore, one needs one of the two great convergence theorems of measure theory named after Lebesgue, both of which (for future use) we now state. In these theorems (as well as in many others), we say that a measurable function *f* : *X* → C has some property μ*-almost everywhere (*μ*-a.e.)* if the set where *f* does *not* have the said property has measure zero. For example, *f* = 0 μ-a.e. means that *f*(*x*) = 0 for each *x* ∉ *N*, for some measurable set *N* with μ(*N*) = 0 (as they say, "morally", the behaviour of measurable functions on subsets of measure zero should not matter).

Theorem B.13. *Let* (*fn*) *be a sequence of (complex-valued) measurable functions.*

*1.* Dominated Convergence*: if* (*f<sub>n</sub>*) *converges pointwise* μ*-a.e. to some function f and* |*f<sub>n</sub>*(*x*)| ≤ *g*(*x*) μ*-a.e. for some g* ∈ ℒ<sup>1</sup>(*X*,Σ,μ)*, then f* ∈ ℒ<sup>1</sup>(*X*,Σ,μ)*, and*

$$\lim\_{n \to \infty} \int\_X d\mu \, f\_n = \int\_X d\mu \, f. \tag{B.36}$$

*2.* Monotone Convergence*: if fn* ≥ 0 *and* (*fn*) *is monotone increasing* μ*-a.e., and*

$$\sup\_{n} \left\{ \int\_{X} d\mu \, f\_{n} \right\} < \infty,\tag{B.37}$$

*then* lim<sub>*n*→∞</sub> *f<sub>n</sub>*(*x*) ≡ *f*(*x*) *exists* μ*-a.e., f* ∈ ℒ<sup>1</sup>(*X*,Σ,μ)*, and* (B.36) *holds.*

Note that the first *conclusion* of the monotone convergence theorem is an *assumption* in the dominated one! Either way, the fact that the pointwise limit function *f* is integrable, being implicit in the notation *f* ∈ ℒ<sup>1</sup>(*X*,Σ,μ), is part of the *result*.

Corollary B.14. *Integration is linear, i.e., if f*1*, f*<sup>2</sup> *are integrable and* λ1,λ<sup>2</sup> ∈ C*,*

$$
\int\_X d\mu \left(\lambda\_1 f\_1 + \lambda\_2 f\_2\right) = \lambda\_1 \int\_X d\mu \, f\_1 + \lambda\_2 \int\_X d\mu \, f\_2. \tag{B.38}
$$

*Proof.* If *f*<sub>1</sub> ≥ 0, *f*<sub>2</sub> ≥ 0, let *s<sub>n</sub>*<sup>(1)</sup> → *f*<sub>1</sub> and *s<sub>n</sub>*<sup>(2)</sup> → *f*<sub>2</sub>, as in Lemma B.12. Then the conditions of the monotone convergence theorem hold, because integration is itself a monotone operation (i.e., if *f* ≤ *g*, then ∫<sub>*X*</sub> *d*μ *f* ≤ ∫<sub>*X*</sub> *d*μ *g*). Combined with linearity on simple functions (as already established above), this yields the claim. □

#### B.5 Measure theory on locally compact Hausdorff spaces

For us it suffices to deal with *locally compact Hausdorff spaces X*. Our main goal is Corollary B.21. We say that a map ϕ :*C*(*X*) → C is *positive* if ϕ(*f*) ≥ 0 whenever *f* ≥ 0 (pointwise). We also write O(*X*) for the set of *open* subsets of *X*, whilst K (*X*) denotes the set of all *compact* subsets of *X*. We first assume that *X* is compact. Any finite measure μ : B(*X*) → [0,∞) gives rise to a positive linear map ϕ : *C*(*X*) → C,

$$\varphi(f) = \int\_X d\mu \, f, \quad f \in C(X). \tag{B.39}$$

Conversely, any such map canonically defines a finite measure μ at least on opens *U* ∈ O(*X*) and on compacta *K* ∈ K (*X*) (which are key examples of Borel sets) by

$$\mu(U) = \sup \{ \varphi(f) \mid f \in C\_c(U), 0 \le f \le 1\_X \};\tag{B.40}$$

$$\mu(K) = \inf \{ \varphi(f) \mid f \in C\_c(X), 0 \le f \le 1\_X, f\_{|K} = 1\_K \}. \tag{B.41}$$

Subsequently, this preliminary measure is (hopefully!) to be extended to at least all of B(*X*), i.e., to all Borel sets, in such a way that μ recovers ϕ via (B.39).

This works, and one even obtains a bijective correspondence between finite measure spaces (*X*,Σ,μ) and positive linear maps ϕ : *C*(*X*) → C if the former are subjected to two additional conditions, predicated on having B(*X*) ⊂ Σ, namely:


$$
\mu^\*(A) = \mu\_\*(A) = \mu(A), \tag{B.42}
$$

where the *outer measure* μ<sup>∗</sup> and *inner measure* μ<sub>∗</sub> are defined by

$$\mu^\*(A) = \inf \{ \mu(U) \mid U \supseteq A, U \in \mathcal{O}(X) \};\tag{B.43}$$

$$\mu\_\*(A) = \sup \{ \mu(K) \mid K \subseteq A, K \in \mathcal{K}(X) \},\tag{B.44}$$

respectively. These expressions apparently make sense for *all* subsets *A* ⊂ *X*, but lovers of the Banach–Tarski Paradox may be reassured that μ<sup>∗</sup> and μ<sub>∗</sub> typically fail to be countably additive if they are seen as maps from P(*X*) to [0,∞]. For future reference we also define (*X*,Σ,μ) to be *inner regular* if (merely) μ<sub>∗</sub>(*A*) = μ(*A*) for *A* ∈ Σ, and *outer regular* if (merely) μ<sup>∗</sup>(*A*) = μ(*A*), *A* ∈ Σ. So *a regular measure is both inner and outer regular*. We are now in a position to state the *Riesz Representation Theorem* (often attributed also to *Radon*).

Theorem B.15. *Let X be a compact Hausdorff space. There is a bijective correspondence between* complete regular finite measure spaces (*X*,Σ,μ) *and* positive linear maps ϕ : *C*(*X*) → C*, explicitly given as follows:*


We omit the lengthy proof, except by announcing that Theorem B.15 may be seen as a special case of the more advanced Choquet theory reviewed in §B.11. For now, just note that expressions like (B.40) and (B.41) are really desperate attempts to define "μ(*A*) = ϕ(1<sub>*A*</sub>)", which is OK for finite *X*, but in general is ill defined because even for Borel sets *A*, the characteristic function 1<sub>*A*</sub> is rarely continuous on *X*.

We note that μ has to be finite, since obviously μ(*X*) = ϕ(1*<sup>X</sup>* ). One can say a little more about this. A linear map ϕ :*C*(*X*) → C is *bounded* if, for some 0 <*C* < ∞,

$$|\varphi(f)| \le C \|f\|\_{\infty} \quad (f \in C(X)). \tag{B.45}$$

In that case, the following expression, called the *norm* of ϕ, is ≤ *C*, hence finite:

$$\|\varphi\| = \sup\{ |\varphi(f)|, f \in C(X), \|f\|\_{\infty} = 1 \}.\tag{B.46}$$

Proposition B.16. *Let X be a compact Hausdorff space. If a linear map* ϕ : *C*(*X*) → C *is positive, then it is bounded, with norm*

$$\|\varphi\| = \varphi(1\_X). \tag{B.47}$$

*Proof.* Positivity makes ⟨*f*, *g*⟩ = ϕ(*f*<sup>∗</sup>*g*) a pre-inner product on *C*(*X*), so by (A.1) with *v* = 1<sub>*X*</sub> and *w* = *f*, we find |ϕ(*f*)|<sup>2</sup> ≤ ϕ(|*f*|<sup>2</sup>)ϕ(1<sub>*X*</sub>) for any *f*. If ‖*f*‖<sub>∞</sub> = 1, then pointwise 0 ≤ |*f*|<sup>2</sup> ≤ 1<sub>*X*</sub>, so by positivity, ϕ(|*f*|<sup>2</sup>) ≤ ϕ(1<sub>*X*</sub>). Hence |ϕ(*f*)| ≤ ϕ(1<sub>*X*</sub>), so that ‖ϕ‖ ≤ ϕ(1<sub>*X*</sub>). Finally, taking *f* = 1<sub>*X*</sub> in (B.46) gives equality. □
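For a finite set *X* (compact in the discrete topology), every positive linear functional has the form ϕ(*f*) = ∑<sub>*i*</sub> *w<sub>i</sub>* *f*(*x<sub>i</sub>*) with weights *w<sub>i</sub>* ≥ 0, and (B.47) can be observed directly. The sketch below uses assumed weights and random test functions; it illustrates the statement and is not part of the proof.

```python
import random

weights = [0.5, 1.5, 2.0, 0.25]   # w_i = mu({x_i}) >= 0, assumed for illustration

def phi(f):
    """Positive linear functional phi(f) = sum_i w_i f(x_i) on C(X), X finite."""
    return sum(w * v for w, v in zip(weights, f))

def sup_norm(f):
    """Sup-norm (B.23) of a function on the finite set X."""
    return max(abs(v) for v in f)

one = [1.0] * len(weights)        # the unit function 1_X

random.seed(1)                    # fixed seed for reproducibility
samples = [
    [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in weights]
    for _ in range(1000)
]
largest_ratio = max(abs(phi(f)) / sup_norm(f) for f in samples)

print(largest_ratio, phi(one))    # the ratio never exceeds phi(1_X)
```

The ratio |ϕ(*f*)|/‖*f*‖<sub>∞</sub> stays below ϕ(1<sub>*X*</sub>) for all sampled complex-valued *f*, and the bound is attained at *f* = 1<sub>*X*</sub>.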

A *state* on *C*(*X*) is a positive linear functional ω : *C*(*X*) → C with ω(1*<sup>X</sup>* ) = 1.

Corollary B.17. *If X is a compact Hausdorff space, there is a bijective correspondence between states on C*(*X*) *and complete regular probability measures on X.*

We now move to the next case in difficulty, where *X* is assumed to be σ*-compact*, in being a countable union of compact sets, i.e., *X* = ∪<sub>*n*</sub>*K<sub>n</sub>*, where *K<sub>n</sub>* ∈ K(*X*). Using a little topology, this is actually equivalent to *X* being a perhaps more appealing union *X* = ∪<sub>*n*</sub>*U<sub>n</sub>*, where each *U<sub>n</sub>* is open with compact closure *U<sub>n</sub>*<sup>−</sup>, and *U<sub>n</sub>*<sup>−</sup> ⊆ *U*<sub>*n*+1</sub>. This, in turn, implies that *X* = ∪<sub>*n*</sub>*K*′<sub>*n*</sub> with *K*′<sub>*n*</sub> ⊆ *K*′<sub>*n*+1</sub> all compact. If (*X*,Σ,μ) is a measure space where *X* is σ-compact topologically, B(*X*) ⊆ Σ, and

$$
\mu(K) < \infty \quad (K \in \mathcal{K}(X)), \tag{B.48}
$$

then *X* is also σ-finite measure-theoretically. Since these are the only σ-finite measure spaces we will consider, with a slight change in terminology we call a locally compact measure space (*X*,Σ,μ) σ*-finite* if it is also σ-compact and (B.48) holds.

The new point compared to the compact case is that functionals like the above ϕ should now be defined on the space *Cc*(*X*) of continuous functions on *X with compact support*. Otherwise, Theorem B.15 may be repeated almost *verbatim*:

Theorem B.18. *Let X be a* σ*-compact Hausdorff space. There is a bijective correspondence between* complete regular σ-finite measure spaces (*X*,Σ,μ) *and* positive linear maps ϕ : *Cc*(*X*) → C*, explicitly given as in Theorem B.15.*

For the sake of completeness we also state Theorem B.18 in the case where *X* is not even assumed to be σ-compact. In that case, inner regularity may be lost:

Theorem B.19. *Let X be a locally compact Hausdorff space. There is a bijective correspondence between* complete outer regular measure spaces (*X*,Σ,μ) *satisfying* (B.48)*, and* positive linear maps ϕ :*Cc*(*X*) → C*, explicitly given as in Theorem B.15, except for the fact that* Σ *now consists of all A* ∈ P(*X*) *for which* μ(*A*∩*K*) < ∞ *and* μ∗(*A*∩*K*) = μ(*A*∩*K*) *for any K* ∈ K (*X*)*. In that case,* μ *is* defined *by*

$$
\mu(A) = \mu^\*(A), \ A \in \Sigma. \tag{B.49}
$$

However, this generality will not really be needed for our purposes, which will only require *finite* measures, in which case outer regularity implies regularity.

In order to generalize Corollary B.17 to the σ-compact case, or even to the locally compact case, we must involve the Banach spaces *Cc*(*X*) and *C*0(*X*) of the previous section. Also for linear maps ϕ : *Cc*(*X*) → C or ϕ : *C*0(*X*) → C we use the notation (B.46), where now the supremum is taken over *f* ∈ *Cc*(*X*) and *f* ∈ *C*0(*X*), respectively. For example, in the latter case, provided (B.45) holds, we have

$$\|\varphi\| = \sup\{ |\varphi(f)|, f \in C\_0(X), \|f\|\_{\infty} = 1 \}.\tag{B.50}$$

Lemma B.20. *Let X be a locally compact Hausdorff space.*

	- *a.* ϕ *is bounded, as in* (B.45)*;*
	- *b.* ϕ *can be extended to a* positive *linear map* ϕ : *C*0(*X*) → C*.*

*In particular, a positive linear map* ϕ : *C*0(*X*) → C *is automatically bounded.*

*Proof.* 1. The first claim means either of the following two equivalent properties:


We prove both. For some given *f* ∈ *C*<sub>0</sub> and ε > 0, find the usual compact *K* such that |*f*(*x*)| < ε outside *K*. Urysohn's Lemma gives *h* ∈ *C<sub>c</sub>*(*X*) with 0 ≤ *h*(*x*) ≤ 1 for all *x* ∈ *X* and *h*(*x*) = 1 for all *x* ∈ *K*. Take *g* = *f h* ∈ *C<sub>c</sub>*(*X*), so that ‖*f* − *g*‖<sub>∞</sub> < ε. For ε = 1/*n*, rename the *g* thus constructed as *f<sub>n</sub>*. Then ‖*f* − *f<sub>n</sub>*‖<sub>∞</sub> → 0.

2. To go from 2.a to 2.b, using the previous item, let *f<sub>n</sub>* → *f* uniformly (i.e., in the sup-norm), and define the extension ϕ : *C*<sub>0</sub>(*X*) → C by ϕ(*f*) = lim<sub>*n*</sub> ϕ(*f<sub>n</sub>*). This limit exists, since |ϕ(*f<sub>m</sub>*) − ϕ(*f<sub>n</sub>*)| ≤ *C*‖*f<sub>m</sub>* − *f<sub>n</sub>*‖<sub>∞</sub>, so that, (*f<sub>n</sub>*) being convergent and hence Cauchy in *C*<sub>0</sub>(*X*), the sequence (ϕ(*f<sub>n</sub>*)) is Cauchy in C. The value ϕ(*f*) is easily verified to be independent of the approximating sequence (*f<sub>n</sub>*). Finally, the approximation in 2.a preserves positivity, i.e., if *f* ≥ 0 then *f<sub>n</sub>* ≥ 0, so also ϕ(*f*) ≥ 0, as it has been defined as the limit of a positive sequence.

By definition, the converse implication 2.b → 2.a is equivalent to the claim that

$$\sup\{|\varphi(f)|, f \in C\_0(X), \|f\|\_{\infty} \le 1\} < \infty,\tag{B.51}$$

which in turn is equivalent to the apparently weaker claim to the effect that

$$\sup \{ |\varphi(f\_n)|, n \in \mathbb{N} \} < \infty,\tag{B.52}$$

for any sequence (*f<sub>n</sub>*) with ‖*f<sub>n</sub>*‖<sub>∞</sub> ≤ 1. Indeed, if the first supremum were infinite, then for each *n* ∈ N there is *f<sub>n</sub>* such that |ϕ(*f<sub>n</sub>*)| > *n*, and (B.52) could not possibly hold. Furthermore, (B.52) need only hold for *non-negative* functions *f<sub>n</sub>* ≥ 0 (still with ‖*f<sub>n</sub>*‖<sub>∞</sub> ≤ 1, of course), cf. (B.31), since |ϕ(*f<sub>k</sub>*)| = ϕ(*f<sub>k</sub>*) < *C* for each *k* = 0,...,3 implies |ϕ(*f*)| < 4*C*. And this, finally, reduces to the claim that

$$\sum\_{n=1}^{\infty} g(n)\varphi(f\_n) < \infty, \quad \forall g \in \ell^1(\mathbb{N}), g(n) \ge 0. \tag{B.53}$$

Namely, if the sequence (ϕ(*f<sub>n</sub>*)) were unbounded, it would be trivial to find such a summable function *g* for which the sum in (B.53) diverges (for example, take a subsequence for which ϕ(*f*<sub>*n<sub>m</sub>*</sub>) > *m* and take *g* such that *g*(*n<sub>m</sub>*) = 1/*m*<sup>2</sup> and *g*(*n*) = 0 otherwise).

To prove (B.53), then, given that *f<sub>n</sub>* ≥ 0 and hence ϕ(*f<sub>n</sub>*) ≥ 0, with ‖*f<sub>n</sub>*‖<sub>∞</sub> ≤ 1, first note that ∑<sub>*n*</sub> *g*(*n*)*f<sub>n</sub>* converges in *C*<sub>0</sub>(*X*) (since it is obviously absolutely convergent, and any absolutely convergent series in a Banach space converges). Calling the sum *h*, for any *N* < ∞ we have ∑<sub>*n*=1</sub><sup>*N*</sup> *g*(*n*)*f<sub>n</sub>* ≤ *h* and hence, by positivity of ϕ, also ∑<sub>*n*=1</sub><sup>*N*</sup> *g*(*n*)ϕ(*f<sub>n</sub>*) ≤ ϕ(*h*) < ∞. Letting *N* → ∞ gives (B.53). □

We now define a *state* on *C*<sub>0</sub>(*X*) as a positive (and hence bounded) linear functional ω : *C*<sub>0</sub>(*X*) → C with ‖ω‖ = 1; this is consistent with the terminology for the compact case because of (B.47), as well as with the terminology for C\*-algebras.

Corollary B.21. *Let X be a locally compact Hausdorff space. There is a bijective correspondence between positive linear functionals on C*0(*X*) *and complete regular finite measures on X, explicitly given as in the bullet points of Theorem B.15.*

*In particular,* states *on C*0(*X*) *correspond to regular* probability measures *on X.*

*Proof.* All that remains to be shown is that, under (B.39), we have

$$\|\varphi\| = \mu(X),\tag{B.54}$$

so that, in particular, the case ‖ϕ‖ = 1 corresponds to μ(*X*) = 1. For compact *X*, eq. (B.54) is immediate from (B.47). For locally compact *X*, we immediately see from (B.39) and (B.50) that ‖ϕ‖ ≤ μ(*X*). To saturate this inequality, we use inner regularity of the measure μ corresponding to ϕ, cf. Theorem B.19 and subsequent comment. From (B.42) and (B.44), for any ε > 0 we can find *K* ∈ K(*X*) with μ(*X*) − μ(*K*) < ε. Now use Urysohn's Lemma to find *f* ∈ *C<sub>c</sub>*(*X*) such that 0 ≤ *f* ≤ 1 and *f*<sub>|*K*</sub> = 1. Then ϕ(*f*) ≥ μ(*K*), and, letting ε → 0, eq. (B.54) follows. □

Finally, we extend the above corollaries to the entire (Banach) dual *C*0(*X*)∗, i.e., the space of *all* (i.e. not necessarily positive) bounded linear maps ϕ : *C*0(*X*) → C, equipped with the norm (B.50). As we shall see more generally in §B.9, this is a vector space (under pointwise operations) and even a Banach space in its own right.

From the point of view of measure theory, the relevant concept is that of a *complex measure*. This is a map μ : Σ → C satisfying the countable additivity condition (B.26), as in the positive case. In the complex case this condition implies that μ is finite. One then (trivially) has a decomposition μ = μ′ + *i*μ″, where μ′ and μ″ are countably additive maps Σ → R (just take μ′ = ½(μ + μ<sup>∗</sup>) and μ″ = −½*i*(μ − μ<sup>∗</sup>), where μ<sup>∗</sup>(*A*) denotes the complex conjugate of μ(*A*)), and (nontrivially) has the *(Hahn)–Jordan decomposition*:

Theorem B.22. *Let* Σ *be a* σ*-algebra on a set X and let* μ *be a (finite)* signed measure*, i.e., a countably additive map* Σ → R*. Then there is a unique decomposition*

$$
\mu = \mu\_+ - \mu\_-, \tag{B.55}
$$

*where the measures* μ<sub>±</sub> : Σ → R<sup>+</sup> *are given by:*

$$
\mu\_+(A) = \sup \{ \mu(B) \mid B \subseteq A, B \in \Sigma \}; \tag{B.56}
$$

$$\mu\_{-}(A) = -\inf\{\mu(B) \mid B \subseteq A, B \in \Sigma\},\tag{B.57}$$

*and* μ<sup>+</sup> *and* μ<sup>−</sup> *are* mutually singular *in that there is a set N* ∈ Σ *such that*

$$
\mu\_+(N) = \mu\_-(X \backslash N) = 0. \tag{B.58}
$$

We will not prove this, just noting that in terms of the *total variation* |μ| of μ, i.e.,

$$|\mu|(A) = \sup \left\{ \sum\_{n \in \mathbb{N}} |\mu(A\_n)| \right\},\tag{B.59}$$

where the supremum is taken over all measurable partitions *A* = ∪*nAn*, one has

$$
\mu\_{\pm} = \frac{1}{2} (|\mu| \pm \mu). \tag{B.60}
$$
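On a finite set with all subsets measurable, the formulas (B.56), (B.57), (B.59) and (B.60) can be verified by brute force. In the sketch below the signed point weights are an assumption chosen for illustration, and on such a space the total variation reduces to the sum of the absolute values of the weights.

```python
from itertools import combinations

# A signed measure on X = {a, b, c, d} given by (assumed) point weights.
weights = {"a": 2.0, "b": -1.5, "c": 0.5, "d": -3.0}
X = frozenset(weights)

def subsets(A):
    """All subsets of the finite set A (the sigma-algebra restricted to A)."""
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def mu(A):
    return sum(weights[x] for x in A)

def mu_plus(A):
    """(B.56): supremum of mu(B) over measurable B contained in A."""
    return max(mu(B) for B in subsets(A))

def mu_minus(A):
    """(B.57): minus the infimum of mu(B) over measurable B contained in A."""
    return -min(mu(B) for B in subsets(A))

def total_variation(A):
    """(B.59) on a finite set: attained by the partition of A into points."""
    return sum(abs(weights[x]) for x in A)

N = frozenset({"b", "d"})   # the set carrying the negative part, cf. (B.58)
print(mu_plus(X), mu_minus(X), total_variation(X), mu_plus(N), mu_minus(X - N))
```

For every subset `A` one can then compare `mu_plus(A)` and `mu_minus(A)` with ½(|μ|(*A*) ± μ(*A*)), and the set `N` of negatively weighted points exhibits the mutual singularity (B.58).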

From the point of view of C\*-algebras it is more natural to start from bounded linear functionals on *C*<sub>0</sub>(*X*). First, we call a map ϕ : *C*<sub>0</sub>(*X*) → C *hermitian* if

$$
\varphi(f^\*) = \overline{\varphi(f)} \quad (f^\*(x) \equiv \overline{f(x)}).\tag{B.61}
$$

Theorem B.23. *1. Any functional* ϕ ∈ *C*0(*X*)<sup>∗</sup> *has a unique decomposition*

$$
\boldsymbol{\varphi} = \boldsymbol{\varphi}' + i\boldsymbol{\varphi}'',\tag{\text{B.62}}
$$

*where the functionals* ϕ′ ∈ *C*<sub>0</sub>(*X*)<sup>∗</sup> *and* ϕ″ ∈ *C*<sub>0</sub>(*X*)<sup>∗</sup> *are hermitian.*

*2. Any* hermitian *functional* ϕ ∈ *C*<sub>0</sub>(*X*)<sup>∗</sup> *has a decomposition*

$$
\varphi = \varphi\_{+} - \varphi\_{-}, \tag{B.63}
$$

*where the functionals* ϕ<sub>±</sub> ∈ *C*<sub>0</sub>(*X*)<sup>∗</sup> *are positive, and are given on f* ≥ 0 *by*

$$\varphi\_+(f) = \sup \{ \varphi(g), g \in C\_0(X), 0 \le g \le f \};\tag{B.64}$$

$$\varphi\_{-}(f) = -\inf\{\varphi(h), h \in C\_{0}(X), 0 \le h \le f\}.\tag{B.65}$$

*3. These expressions satisfy*

$$\|\varphi\| = \|\varphi\_{+}\| + \|\varphi\_{-}\|,\tag{B.66}$$

*and any positive functionals* ϕ<sub>±</sub> ∈ *C*<sub>0</sub>(*X*)<sup>∗</sup> *that satisfy* (B.63) *as well as* (B.66) *are necessarily given by* (B.64) *-* (B.65)*.*

*4. Any functional* ϕ ∈ *C*0(*X*)<sup>∗</sup> *is a linear combination of at most four states.*

*Proof.* 1. Take ϕ′ = ½(ϕ + ϕ<sup>∗</sup>) and ϕ″ = −½*i*(ϕ − ϕ<sup>∗</sup>), where ϕ<sup>∗</sup>(*f*) denotes the complex conjugate of ϕ(*f*<sup>∗</sup>).


$$\|\varphi\| \le \|\varphi\_{+}\| + \|\varphi\_{-}\| = \varphi\_{+}(1\_X) + \varphi\_{-}(1\_X) \tag{B.67}$$

$$= \sup\{\varphi(g), 0 \le g \le 1\_X\} - \inf\{\varphi(h), 0 \le h \le 1\_X\}.\tag{B.68}$$

For any ε > 0, there is *g* such that ϕ(*g*) is within ½ε of the supremum in (B.68), and likewise there is *h* such that ϕ(*h*) is within ½ε of the infimum in (B.68), so that

$$|\varphi\_{+}(1\_X) + \varphi\_{-}(1\_X) - \varphi(g - h)| < \varepsilon.\tag{B.69}$$

Since 0 ≤ *g* ≤ 1<sub>*X*</sub> and 0 ≤ *h* ≤ 1<sub>*X*</sub>, we have ‖*g* − *h*‖<sub>∞</sub> ≤ 1, and therefore

$$
\varphi(g - h) \le \|\varphi\| \|g - h\|\_{\infty} \le \|\varphi\|.\tag{B.70}
$$

Hence (B.67) gives

$$\|\varphi\| \le \|\varphi\_{+}\| + \|\varphi\_{-}\| \le \|\varphi\| + \varepsilon,\tag{B.71}$$

so letting ε → 0 yields (B.66).
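For a finite set *X* the decomposition (B.63) - (B.66) can be checked by hand. The following is a minimal sketch, assuming *X* = {0, 1, 2} and a hermitian functional given by hypothetical real weights `c` (these names and values are illustrative and not part of the text); it uses that a linear function on the box 0 ≤ *g* ≤ *f* attains its supremum and infimum at a vertex.

```python
from itertools import product

def phi(c, f):
    """Hermitian functional phi(f) = sum_i c[i] * f[i] on C(X), X = {0, ..., n-1}, real weights c."""
    return sum(ci * fi for ci, fi in zip(c, f))

def phi_plus(c, f):
    """(B.64): sup of phi(g) over 0 <= g <= f. Since phi is linear, the sup over the box
    is attained at a vertex, i.e. each g[i] is either 0 or f[i]."""
    return max(phi(c, g) for g in product(*[(0.0, fi) for fi in f]))

def phi_minus(c, f):
    """(B.65): minus the inf of phi(h) over 0 <= h <= f (again attained at a vertex)."""
    return -min(phi(c, h) for h in product(*[(0.0, fi) for fi in f]))

c = [2.0, -3.0, 0.5]     # hypothetical weights defining phi
f = [1.0, 4.0, 2.0]      # a test function f >= 0
one = [1.0, 1.0, 1.0]    # the unit function 1_X

# (B.63): phi = phi_+ - phi_- evaluated on f >= 0
assert abs(phi(c, f) - (phi_plus(c, f) - phi_minus(c, f))) < 1e-12

# (B.66): ||phi|| = sum_i |c_i| equals phi_+(1_X) + phi_-(1_X)
norm_phi = sum(abs(ci) for ci in c)
assert abs(norm_phi - (phi_plus(c, one) + phi_minus(c, one))) < 1e-12
```

In this finite setting ϕ<sub>+</sub> simply keeps the positive weights and ϕ<sub>−</sub> the negative ones, which is the discrete counterpart of the Jordan decomposition of the underlying measure.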

For locally compact *X*, we reduce the proof to the compact case by forming the *one-point compactification* *X*˙ of *X*, cf. §C.6. As a set, this is *X*˙ = *X* ∪ {∞}, where ∞ is a single additional point. As a space, the open sets in *X*˙ are the open sets in *X* plus those subsets of *X*˙ whose complement is compact in *X*. The obvious injection *i* : *X* → *X*˙ is continuous, and any *f* ∈ *C*0(*X*) extends uniquely to a function *f* ∈ *C*(*X*˙) that vanishes at the compactification point, i.e., *f*(∞) = 0. This yields an *isometric* embedding *C*0(*X*) → *C*(*X*˙). Furthermore, as vector spaces one has


$$C(\dot{X}) = C\_0(X) \oplus \mathbb{C} \cdot 1\_{\dot{X}}.\tag{B.72}$$

Any linear map ϕ on *C*0(*X*) may then be extended to a linear map ϕ˙ on *C*(*X*˙) via

$$\dot{\varphi}(f + \lambda 1\_{\dot{X}}) = \varphi(f) + \lambda \|\varphi\|. \tag{B.73}$$

From the point of view of (B.39), this extension may alternatively be described as follows: extend the measure μ on *X* that underlies ϕ to a measure μ˙ on *X*˙ by μ˙(*A*∪ {∞}) = μ(*A*), *A* ∈ Σ. This shows that ϕ˙ remains positive when ϕ is, and using (B.54) and the analogue of (B.47) for *X*˙ instead of *X*, we also obtain

$$\|\dot{\varphi}\| = \dot{\varphi}(1\_{\dot{X}}) = \dot{\mu}(\dot{X}) = \mu(X) = \|\varphi\|. \tag{B.74}$$

One may then repeat the proof of the compact case, using ϕ˙ instead of ϕ. We just prove uniqueness for the compact case (in general, add dots as in the previous proof). Suppose ϕ = ϕ′<sub>+</sub> − ϕ′<sub>−</sub>, with ϕ′<sub>±</sub> positive. For *f* ≥ 0, using (B.64) and ϕ′<sub>−</sub>(*g*) ≥ 0,

$$\begin{aligned} \varphi\_{+}(f) &= \sup \{ \varphi'\_{+}(g) - \varphi'\_{-}(g), 0 \le g \le f \} \\ &\le \sup \{ \varphi'\_{+}(g), 0 \le g \le f \} \le \varphi'\_{+}(f), \end{aligned}$$

so ψ ≡ ϕ′<sub>+</sub> − ϕ<sub>+</sub> ≥ 0. With ϕ′<sub>±</sub> = ϕ<sub>±</sub> + ψ, imposing ‖ϕ‖ = ‖ϕ′<sub>+</sub>‖ + ‖ϕ′<sub>−</sub>‖ and repeatedly using (B.47), we find ‖ψ‖ = 0, and hence ψ = 0.

4. This is trivial from parts 1–2, noting that any nonzero positive functional ϕ = *t*ω is a multiple of a state ω = ϕ/‖ϕ‖, with *t* = ‖ϕ‖, since obviously ‖ω‖ = 1.

Combining this proposition with Corollaries B.17 and B.21, we finally obtain:

Theorem B.24. *Let X be a locally compact Hausdorff space. The Banach dual C*0(*X*)<sup>∗</sup> *of all bounded linear maps* ϕ : *C*0(*X*) → C *is isometrically isomorphic with the space M*(*X*) *of all complete regular complex measures* μ *on X, with norm*

$$\|\mu\| = |\mu|(X). \tag{B.75}$$

*In particular, if* μ *is real (i.e., hermitian as a functional on C*(*X*)*), then (cf.* (B.55)*)*

$$\|\mu\| = \mu\_+(X) + \mu\_-(X). \tag{B.76}$$

This implies Corollary B.21, including its crucial final claim to the effect that states on *C*0(*X*) correspond to regular probability measures on *X*.

We briefly sketch an analogous result for *finitely additive measures*. Instead of a σ-algebra of subsets of some set *X*, we now start from a so-called *semiring*:

Definition B.25. *A* semiring *of subsets of X is a family* R ⊆ P(*X*) *such that:*


In fact, in all our examples a stronger version of axiom 3 holds: if *A*,*B* ∈ R and *B* ⊂ *A*, then *A*\*B* ∈ R. Indeed, we will typically have *X* = N and either R = P(N) or R = P*f*(N) (i.e. the collection of *finite* subsets of N).

Using the *fundamental lemma for semirings*, which states that if *A*<sub>1</sub>,...,*A<sub>n</sub>* ∈ R, there are finitely many pairwise disjoint *B*<sub>1</sub>,...,*B<sub>m</sub>* in R such that ∪<sub>*i*</sub>*A<sub>i</sub>* = ∪<sub>*j*</sub>*B<sub>j</sub>*, it can be shown that the complex linear span Step(*X*,R) of the characteristic functions 1<sub>*A*</sub> (*A* ∈ R) is a commutative algebra under the obvious pointwise operations. Since functions in Step(*X*,R) are bounded, we may form the closure of Step(*X*,R) in the supremum-norm; adding pointwise complex conjugation this yields a commutative C\*-algebra called ℓ<sup>∞</sup>(*X*,R) (which has a unit iff *X* = ∪R). For example, we have

$$\ell^{\infty}(\mathbb{N}, \mathcal{P}(\mathbb{N})) = \ell^{\infty}(\mathbb{N}) \equiv \ell^{\infty};\tag{\mathbb{B}.77}$$

$$\ell^{\infty}(\mathbb{N}, \mathcal{P}\_f(\mathbb{N})) = \ell\_0(\mathbb{N}) \equiv c\_0. \tag{B.78}$$

Definition B.26. *A* finitely additive measure *on* (*X*,R) *is a map* μ : R → [0,∞] *such that* μ(*A*∪*B*) = μ(*A*) + μ(*B*) *whenever A*,*B* ∈ R*, A*∪*B* ∈ R*, and A*∩*B* = ∅*.*

Similarly, we have finitely additive *signed* measures taking values in R, which admit a Jordan–Hahn decomposition (B.55) with (B.56) - (B.57), just as in the σ-additive case. We say that a finitely additive signed measure μ is *finite* if |μ(*A*)| < ∞ for each *A* ∈ R, and *bounded* if sup{|μ(*A*)|,*A* ∈ R} < ∞. With |μ| = μ<sub>+</sub> + μ<sub>−</sub>, the bounded finitely additive signed measures form a real Banach space ba(*X*,R) in the norm

$$\|\mu\| = \sup\{ |\mu|(A), A \in \mathcal{R} \}. \tag{B.79}$$

Within this space, the *probability measures* stand out as those measures μ that take values in [0,1] (so that μ = μ<sub>+</sub>) and satisfy ‖μ‖ = 1.

Functions in Step(*X*,R) may be integrated against measures in ba(*X*,R) in the obvious way, cf. (B.27) - (B.28). This is well defined, and one easily infers that

$$|\int\_{X} d\mu \, s| \le ||\mu|| ||s||\_{\infty},\tag{B.80}$$

for any *s* ∈ Step(*X*,R). Hence we may extend the integral to any *f* ∈ ℓ<sup>∞</sup>(*X*,R) by

$$\int\_{X} d\mu \, f = \lim\_{n \to \infty} \int\_{X} d\mu \, s\_n,\tag{B.81}$$

where (*s<sub>n</sub>*) is any sequence in Step(*X*,R) converging to *f* in the sup-norm ‖·‖<sub>∞</sub>. This is well defined by the usual arguments. At the end of the day, we obtain:

Theorem B.27. *Let X be a set equipped with some semiring* R ⊆ P(*X*)*.*


## B.6 *L<sup>p</sup>* spaces

We return to the usual, countably additive setting for measure theory. In the previous section, the notion of a measure space (*X*,Σ,μ) has mainly been used to provide an integration theory for *continuous* functions on *X*, though (B.29) suggested greater generality. In what follows, we keep the restriction to locally compact spaces *X* (although the theory is more general), but we expand the class of functions that can be integrated over *X* "against the measure μ". This, then, leads to an important class of Banach spaces, called *L<sup>p</sup>*(*X*) ≡ *L<sup>p</sup>*(*X*,Σ,μ); some authors write *L<sup>p</sup>*(*X*,Σ), others *L<sup>p</sup>*(μ). One may have examples like *X* = Ω ⊂ R<sup>*n*</sup> in mind, with Ω measurable (typically open or closed, like *X* = R<sup>*n*</sup> or *X* = [0,1]), and μ being Lebesgue measure. On the other hand, one may think of *X* as a discrete space with counting measure (i.e., μ({*x*}) = 1 for each *x* ∈ *X*), in which case the space *L<sup>p</sup>*(*X*) reduces to the space ℓ<sup>*p*</sup>(*X*) we already know; the typical case will be *X* = N.

Definition B.28. *Given a measure space* (*X*,Σ,μ) *and a real number* 1 ≤ *p* ≤ ∞*:*

• *For* 1 ≤ *p* < ∞*, the set* ℒ<sup>*p*</sup>(*X*) ≡ ℒ<sup>*p*</sup>(*X*,Σ,μ) *consists of all measurable functions f* : *X* → C *for which*

$$\int\_{X} d\mu \, |f|^{p} < \infty. \tag{B.82}$$

• ℒ<sup>∞</sup>(*X*) ≡ ℒ<sup>∞</sup>(*X*,Σ,μ) *is the set of measurable functions f* : *X* → C *that are* essentially bounded *(with respect to* μ*), i.e., for which*

$$\inf \{ t \in [0, \infty] : |f| \le t \text{ ($\mu$-almost everywhere)} \} < \infty. \tag{B.83}$$

• 𝒩<sub>μ</sub> *is the set of all measurable functions f* : *X* → C *that vanish* μ*-a.e., that is,*

$$
\mu\left(\{x\in X \mid f(x)\neq 0\}\right) = 0.\tag{B.84}
$$

• *Noting that* 𝒩<sub>μ</sub> ⊂ ℒ<sup>*p*</sup>(*X*) *for all* 1 ≤ *p* ≤ ∞*, we put*

$$L^p(X, \Sigma, \mu) \equiv L^p(X) = \mathcal{L}^p(X) / \mathcal{N}\_{\mu}. \tag{B.85}$$

To appreciate the perhaps somewhat mysterious condition (B.83), we write

$$\inf \{ t \in [0, \infty] : |f| \le t \ \mu\text{-a.e.} \} = \inf \{ t \in [0, \infty] : \mu(\{ x \in X, |f(x)| > t \}) = 0 \}.$$

Compare this with the expressions (defined for any function *f* : *X* → C):

$$\begin{aligned} \sup\{ |f(x)| \mid x \in X \} &= \inf\{ t \in [0, \infty] : |f(x)| \le t \ \forall x \in X \} \\ &= \inf\{ t \in [0, \infty] : \{ x \in X, |f(x)| > t \} = \emptyset \} < \infty, \end{aligned} \tag{B.86}$$

which state the condition that *f* be bounded. Consequently, the stipulation that *f* be *essentially* bounded is the same as the condition that it be bounded, except that the empty set in (B.86) has been replaced by a measure-zero set.
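The difference between the supremum in (B.86) and the essential supremum in (B.83) is easy to see on a finite set carrying a measure with a null point. The sketch below assumes a hypothetical three-point space with weights `mu` (all names and values are illustrative, not taken from the text).

```python
def sup_norm(f):
    """Ordinary supremum (B.86) of |f| over a finite set X; f maps points to values."""
    return max(abs(v) for v in f.values())

def ess_sup(f, mu):
    """Essential supremum (B.83): the least t such that {x : |f(x)| > t} has measure zero.
    On a finite set it suffices to test t = 0 and the values t = |f(x)|."""
    candidates = sorted({0.0} | {abs(v) for v in f.values()})
    for t in candidates:
        if sum(mu[x] for x in f if abs(f[x]) > t) == 0:
            return t

mu = {"a": 0.5, "b": 0.5, "c": 0.0}     # hypothetical measure; the point "c" is a null set
f = {"a": 1.0, "b": -2.0, "c": 100.0}   # f is large only on the null set

assert sup_norm(f) == 100.0   # the supremum sees the null point
assert ess_sup(f, mu) == 2.0  # the essential supremum ignores it

# Changing f on the null set gives a function g whose ordinary sup-norm
# already equals the essential supremum of f.
g = {"a": 1.0, "b": -2.0, "c": 0.0}
assert sup_norm(g) == ess_sup(f, mu)
```

The loop always terminates with a return value, since for the largest candidate the set {*x* : |*f*(*x*)| > *t*} is empty.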

Theorem B.29. *For* 1 ≤ *p* < ∞*, the set L<sup>p</sup>*(*X*) *is a vector space under pointwise operations, as well as a Banach space, in the norm*

$$\|f\|\_p = \left(\int\_X d\mu(x) \, |f(x)|^p \right)^{1/p}. \tag{B.87}$$

*Likewise, L*<sup>∞</sup>(*X*) *is a Banach space in the norm*

$$\|f\|\_{\infty}^{\mathrm{ess}} = \inf \{ t \in [0, \infty] : \mu(\{x \in X, |f(x)| > t\}) = 0 \}. \tag{B.88}$$

Strictly speaking, elements of *L<sup>p</sup>* are therefore equivalence classes of functions rather than functions, the pertinent equivalence relation ∼<sup>μ</sup> being

$$f \sim\_{\mu} g \text{ iff } \mu \left( \{ x \in X \mid f(x) \neq g(x) \} \right) = 0,\tag{B.89}$$

but whenever no confusion can arise, we write *f* ∈ *L<sup>p</sup>* instead of *f* ∈ ℒ<sup>*p*</sup> or [*f*] ∈ *L<sup>p</sup>*, as we have already done, for example, in (B.87) and (B.88); that is, the left-hand sides of these equations should officially be written as ‖[*f*]‖<sub>*p*</sub> for 1 ≤ *p* ≤ ∞. Note in this respect that in (B.87) - (B.88) the function *f* on the right-hand side could be any representative of its equivalence class [*f*]. However, one cannot replace the right-hand side of (B.88) by ‖*f*‖<sub>∞</sub>, because (B.86) *does* depend on the representative *f*. Those who dislike (B.88) may, equivalently, write

$$\|f\|\_{\infty}^{\mathrm{ess}} = \inf\{\|g\|\_{\infty}, g \sim\_{\mu} f\}.\tag{B.90}$$

One should be aware of the need to pass to the quotient (B.85) in the first place: the natural expressions (B.87) and (B.88) fail to define norms on ℒ<sup>*p*</sup> and ℒ<sup>∞</sup>, respectively, because the positive definiteness axiom in Definition A.1.5c might fail. Indeed, although any *f* that is nonzero just on some null set is nonzero as an element of the vector space ℒ<sup>*p*</sup>, one has ‖*f*‖<sub>*p*</sub> = 0. This problem is solved by passing to *L<sup>p</sup>*.
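The failure of positive definiteness before passing to the quotient can be made concrete on a finite measure space with a null point. The following is a minimal sketch under that assumption; the space, the weights `mu` and the function names are hypothetical.

```python
def p_seminorm(f, mu, p):
    """The expression (B.87) evaluated on a representative f, on a finite weighted set."""
    return sum(mu[x] * abs(f[x]) ** p for x in f) ** (1.0 / p)

def equivalent(f, g, mu):
    """The relation (B.89): f ~ g iff f and g differ only on a set of measure zero."""
    return sum(mu[x] for x in f if f[x] != g[x]) == 0

mu = {"a": 0.5, "b": 0.5, "c": 0.0}     # the point "c" is a null set
f = {"a": 0.0, "b": 0.0, "c": 7.0}      # nonzero as a function, but only on the null set
zero = {"a": 0.0, "b": 0.0, "c": 0.0}

assert f != zero                    # f is not the zero vector of the function space
assert p_seminorm(f, mu, 2) == 0.0  # yet (B.87) vanishes on it
assert equivalent(f, zero, mu)      # so f and 0 define the same class in the quotient
```

On the quotient the classes of `f` and `zero` coincide, so the vanishing of (B.87) no longer contradicts positive definiteness.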

The proof of Theorem B.29 uses both parts of Theorem B.13, which is concerned with a sequence (*f<sub>n</sub>*) of functions in ℒ<sup>1</sup>(*X*), where (*X*,Σ,μ) is an arbitrary measure space. Note that on our definition of *L<sup>p</sup>* spaces, these pointwise limits themselves might not lie in ℒ<sup>1</sup>, but it is part of the conclusion of the convergence theorems that they do so up to some null set, and hence do define elements of *L*<sup>1</sup>. For this reason, at this point one must distinguish between *f* ∈ ℒ<sup>1</sup> and [*f*] ∈ *L*<sup>1</sup>. Let us mention in this context that *L<sup>p</sup>* spaces are often constructed from measurable functions *f* : *X* → C, whose positive real parts *f<sub>k</sub>* (cf. (B.31)) by definition take values in [0,∞]. This also leads to slightly more general versions of the Lebesgue convergence theorems, in which the *f<sub>n</sub>* are allowed to be infinite on null sets. However, if *f* ∈ *L<sup>p</sup>*, then |*f*| < ∞ μ-a.e., so little is lost by starting from functions *f* : *X* → C or *f* : *X* → R.

*Proof.* We first prove Theorem B.29 for 1 ≤ *p* < ∞. Minkowski's Inequality (B.14) holds for *L<sup>p</sup>* ≡ *L<sup>p</sup>*(*X*) just as it does for ℓ<sup>*p*</sup>, as does Hölder's Inequality (B.15), so it remains to prove completeness. To this effect, let (*f<sub>n</sub>*) be a Cauchy sequence in *L<sup>p</sup>*. Then (*f<sub>n</sub>*) has a subsequence (*f*<sub>*n<sub>k</sub>*</sub>)<sub>*k*</sub> such that


$$\|f\_{n\_{k+1}} - f\_{n\_k}\|\_p < 2^{-k} \tag{B.91}$$

for each *k* ∈ N (indeed, for given ε = 2<sup>−*k*</sup>, take *n<sub>k</sub>* to be the famously existing *N* for which ‖*f<sub>n</sub>* − *f<sub>m</sub>*‖<sub>*p*</sub> < ε for all *n*,*m* > *N*, etc.), and if lim<sub>*k*→∞</sub> ‖*f*<sub>*n<sub>k</sub>*</sub> − *f*‖<sub>*p*</sub> = 0 for some *f*, then lim<sub>*n*→∞</sub> ‖*f<sub>n</sub>* − *f*‖<sub>*p*</sub> = 0 (this is a standard feature of Cauchy sequences).

We now rewrite *f*<sub>*n<sub>k</sub>*</sub> using a little trick, and introduce an auxiliary function *g*<sub>*n<sub>k</sub>*</sub> by

$$f\_{n\_k} = f\_{n\_1} + \sum\_{l=1}^{k-1} (f\_{n\_{l+1}} - f\_{n\_l});\tag{B.92}$$

$$g\_{n\_k} = |f\_{n\_1}| + \sum\_{l=1}^{k-1} |f\_{n\_{l+1}} - f\_{n\_l}|.\tag{B.93}$$

Using (B.91), we estimate ‖*g*<sub>*n<sub>k</sub>*</sub>‖<sub>*p*</sub> ≤ ‖*f*<sub>*n*<sub>1</sub></sub>‖<sub>*p*</sub> + ∑<sub>*l*=1</sub><sup>*k*−1</sup> 2<sup>−*l*</sup>, which converges as *k* → ∞. Hence sup<sub>*k*</sub> ‖(*g*<sub>*n<sub>k</sub>*</sub>)<sup>*p*</sup>‖<sub>1</sub> < ∞, so by the *Monotone* Convergence Theorem, lim<sub>*k*→∞</sub> (*g*<sub>*n<sub>k</sub>*</sub>)<sup>*p*</sup> ≡ *h* exists pointwise μ-a.e., with *h* ∈ *L*<sup>1</sup>. Since *g*<sub>*n<sub>k</sub>*</sub> ≥ 0, we have *h* ≥ 0 at least μ-a.e., and with *g* = *h*<sup>1/*p*</sup>, by continuity of *x* ↦ *x*<sup>1/*p*</sup>, we have *g*<sub>*n<sub>k</sub>*</sub> → *g* pointwise μ-a.e., with *g* ∈ *L<sup>p</sup>*. Thus the series (B.92) converges (absolutely pointwise μ-a.e.) to some *f*. Since |*f*| ≤ *g*, we also have *f* ∈ *L<sup>p</sup>*. To prove that *f*<sub>*n<sub>k</sub>*</sub> → *f* in *L<sup>p</sup>* (and not just pointwise μ-a.e.), we estimate

$$\begin{aligned} |f(x) - f\_{n\_k}(x)|^p &\le (2\max\{|f(x)|, |f\_{n\_k}(x)|\})^p \\ &\le 2^p (|f(x)|^p + |f\_{n\_k}(x)|^p) \le 2^{p+1} g(x)^p, \end{aligned}$$

so, already knowing that *g<sup>p</sup>* ∈ *L*<sup>1</sup>, we may use (B.36) in the *Dominated* Convergence Theorem (with *f<sub>n</sub>* replaced by *f* − *f*<sub>*n<sub>k</sub>*</sub>, and hence *f* replaced by the zero function) to conclude that lim<sub>*k*→∞</sub> ∫<sub>*X*</sub> *d*μ |*f*(*x*) − *f*<sub>*n<sub>k</sub>*</sub>(*x*)|<sup>*p*</sup> = 0, i.e., ‖*f* − *f*<sub>*n<sub>k</sub>*</sub>‖<sub>*p*</sub> → 0.
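The two devices used in this part of the proof, namely the rapidly converging subsequence (B.91) and the majorant (B.93), can be tried out on a toy example. The sketch below assumes counting measure on a three-point set and a hypothetical Cauchy sequence `f(n)`; the index selection relies on an explicit bound that holds for this particular sequence only.

```python
U = (1.0, -2.0, 0.5)   # hypothetical fixed vector in l^p({0, 1, 2})

def p_norm(v, p):
    """The l^p norm on a finite set with counting measure, for 1 <= p < infinity."""
    return sum(abs(x) ** p for x in v) ** (1.0 / p)

def f(n):
    """A hypothetical Cauchy sequence: f_n = (1 + 1/n) * U, which converges to U."""
    return tuple((1.0 + 1.0 / n) * x for x in U)

def rapid_subsequence(K, p):
    """Indices n_1 < ... < n_K with ||f_{n_{k+1}} - f_{n_k}||_p < 2^{-k}, as in (B.91).
    For this sequence ||f_n - f_m||_p = |1/n - 1/m| * ||U||_p < ||U||_p / n whenever m > n,
    so it suffices to advance n until ||U||_p / n < 2^{-k}."""
    indices, n = [], 1
    for k in range(1, K + 1):
        while p_norm(U, p) / n >= 2.0 ** (-k):
            n += 1
        indices.append(n)
        n += 1
    return indices

p = 2
idx = rapid_subsequence(6, p)
diffs = [tuple(x - y for x, y in zip(f(idx[l + 1]), f(idx[l]))) for l in range(len(idx) - 1)]

# (B.91): the difference f_{n_{l+1}} - f_{n_l} (l = 1, 2, ...) has norm below 2^{-l}
assert all(p_norm(d, p) < 2.0 ** (-(l + 1)) for l, d in enumerate(diffs))

# (B.93): the majorant g_{n_k} = |f_{n_1}| + sum_l |f_{n_{l+1}} - f_{n_l}|, taken pointwise
g = [abs(x) for x in f(idx[0])]
for d in diffs:
    g = [gi + abs(di) for gi, di in zip(g, d)]

# Minkowski plus the geometric series bound the norm of the majorant ...
assert p_norm(g, p) <= p_norm(f(idx[0]), p) + 1.0
# ... and the majorant dominates the last element of the subsequence pointwise.
assert all(abs(x) <= gi for x, gi in zip(f(idx[-1]), g))
```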

We continue for *p* = ∞. For any fixed measurable subset *E* ⊂ *X* we define

$$\|f\|\_{\infty}^{(E)} = \sup\{|f(x)| \mid x \in E\} = \inf\{t \in [0, \infty] \mid |f(x)| \le t \ \forall x \in E\}.\tag{B.94}$$

If *X*\*E* has measure zero, as we assume in what follows, then

$$\|f\|\_{\infty}^{\text{ess}} \le \|f\|\_{\infty}^{(E)},\tag{B.95}$$

since the null set *X*\*E* might be expanded to a larger set of measure zero, which might decrease the infimum in (B.88). It follows that convergence with respect to the norm ‖·‖<sub>∞</sub><sup>(*E*)</sup> implies convergence in ‖·‖<sub>∞</sub><sup>ess</sup>. We use this insight to prove the completeness of *L*<sup>∞</sup> by reducing this to a limiting problem with respect to the norm ‖·‖<sub>∞</sub><sup>(*E*)</sup>, for a suitable choice of *E* ⊂ *X*. Namely, let (*f<sub>n</sub>*) be a Cauchy sequence in *L*<sup>∞</sup>. This means

$$\forall \varepsilon > 0 \, \exists n \, \forall\_{j,k>n} \, \|f\_j - f\_k\|\_{\infty}^{\mathrm{ess}} < \varepsilon.$$

Parametrizing ε = 1/*m* for large *m* ∈ N, and using (B.88), this implies:

$$\forall m \exists n \forall\_{j,k>n} \exists N\_{(j,k,m)} : \mu(N\_{(j,k,m)}) = 0 \text{ and } \forall \mathbf{x} \in X \backslash N\_{(j,k,m)} : |f\_j(\mathbf{x}) - f\_k(\mathbf{x})| < 1/m.$$

Now define *N* = ∪*j*,*k*,*m*∈N*N*(*j*,*k*,*m*). Since measures are countably additive by definition and *N* is a countable union of the measure zero sets, *N* has measure zero. With *E* = *X*\*N*, so that *X*\*E* = *N* has measure zero, as above, we then have

$$\forall m \exists n \forall\_{j,k>n} \forall \mathbf{x} \in E \left| f\_j(\mathbf{x}) - f\_k(\mathbf{x}) \right| < 1/m.$$

Thus (*f<sub>n</sub>*) (strictly speaking, the corresponding sequence of restrictions of each *f<sub>n</sub>* to *E*) is a Cauchy sequence of bounded functions on *E* in the supremum norm (B.94), so that we are back in the ℓ<sup>∞</sup>(*X*) case with *X* = *E*, with the three-step proof we gave: the pointwise limits *f*(*x*) = lim<sub>*n*→∞</sub> *f<sub>n</sub>*(*x*) exist, the function *f* thus defined on *E* is bounded, i.e., ‖*f*‖<sub>∞</sub><sup>(*E*)</sup> < ∞, and *f<sub>n</sub>* → *f* not just pointwise but also in the norm ‖·‖<sub>∞</sub><sup>(*E*)</sup>. Extending *f* from *E* to *X* in an arbitrary way (the ensuing equivalence class in *L*<sup>∞</sup> does not depend on the behaviour of *f* on the null set *X*\*E*), we first conclude from (B.95) that ‖*f*‖<sub>∞</sub><sup>ess</sup> < ∞, and secondly infer that *f<sub>n</sub>* → *f* also in ‖·‖<sub>∞</sub><sup>ess</sup>. □

Without proof, we state some useful results about the place of continuous functions in *Lp*-spaces. For simplicity, we assume that μ is regular and has support *X* (in that *X* has no open subset *U* with μ(*U*) = 0). In that case, *Cb*(*X*) and its subspaces *C*0(*X*) and *Cc*(*X*) may be seen as subspaces of *L*∞(*X*), on which the norm (B.88) or (B.90) simply reduces to the ordinary sup-norm (1.24).

Theorem B.30.

• *If* 1 ≤ *p* < ∞*, then Cc*(*X*) *is dense in L<sup>p</sup>*(*X*) *(in the L<sup>p</sup>-norm).*

• *If p* = ∞*, one has an inclusion of Banach spaces (all carrying the L*<sup>∞</sup>*-norm)*

$$C\_0(X) \subset C\_b(X) \subset L^\infty(X). \tag{B.96}$$

Compare (B.22). Since the closure of *Cc*(*X*) in the sup-norm is *C*0(*X*), it follows that *Cc*(*X*) is dense in *L*<sup>∞</sup>(*X*) only in the exceptional case where *L*<sup>∞</sup>(*X*) = *C*0(*X*) (e.g., for finite *X*). So in this respect, the values 1 ≤ *p* < ∞ behave quite differently from *p* = ∞.

The first claim is based on two facts, of which the first is true for all 1 ≤ *p* ≤ ∞, whereas the second is valid only for 1 ≤ *p* < ∞ (i.e. it fails for *p* = ∞):


Similarly to Theorems B.27 and B.24, we know the state space of *L*∞(*X*,ν):

Theorem B.31. *Let* (*X*,Σ,ν) *be a measure space. There is a bijective correspondence between states on L*∞(*X*,ν) *and finitely additive probability measures* μ *on* (*X*,Σ) *that are absolutely continuous with respect to* ν *(i.e.,* ν(*A*) = 0 *implies* μ(*A*) = 0*), given by* (B.39) *and* (B.81)*.*

In this case, the role of the semiring R is of course played by Σ, so that Step(*X*,Σ) is simply the complex linear span of the simple functions on (*X*,Σ), and (B.28) duly applies. Since it may once again be shown that Step(*X*,Σ) is dense in *L*∞(*X*,ν), the definition (B.81) of integration "by continuity" makes sense in this situation, too.

#### B.7 Morphisms and isomorphisms of Banach spaces

We often want to say that two Banach spaces are *isomorphic*. For example, in the next section the dual of a given Banach space is typically identified with some known Banach space; such identifications even belong to the nicest results in functional analysis. Of course, this issue is predicated on the correct definition of (not necessarily invertible) maps between Banach spaces in the first place.

Definition B.32. *A* morphism *a* : *V* → *W between Banach spaces V*,*W (or, more generally, normed spaces) is a* bounded linear map*, i.e., a linear map for which there is a constant C* > 0 *such that for each v* ∈ *V ,*

$$\|av\|\_{W} \leq C\|v\|\_{V},\tag{B.97}$$

*or, equivalently,*

$$\sup\{\|av\|\_W, v \in V, \|v\|\_V \le 1\} < \infty. \tag{B.98}$$

It is extremely important (yet easy to show) that bounded maps are automatically continuous (and even uniformly continuous); conversely, a continuous linear map between vector spaces with norm is bounded. We note two important special cases:


Theorem B.33. *Let V be a normed vector space and W a Banach space. The space B*(*V*,*W*) *of all morphisms (i.e., bounded linear maps) a* : *V* → *W is a Banach space with respect to pointwise operations (e.g.,* (λ*a*+*b*)*v* = λ*av*+*bv*)*, and the norm*

$$\|a\| = \sup\{\|av\|\_W, v \in V, \|v\|\_V \le 1\}.\tag{B.99}$$

*Proof.* Only completeness is nontrivial; the idea is that if (*a<sub>n</sub>*) is a Cauchy sequence in *B*(*V*,*W*), we define *a* : *V* → *W* by *av* = lim<sub>*n*</sub> *a<sub>n</sub>v*. This limit exists, since we have ‖*a<sub>n</sub>v* − *a<sub>m</sub>v*‖<sub>*W*</sub> ≤ ‖*a<sub>n</sub>* − *a<sub>m</sub>*‖ ‖*v*‖<sub>*V*</sub> and *W* is complete. Furthermore, it is easy to show (e.g., by contradiction) that a Cauchy sequence must be bounded, say ‖*a<sub>n</sub>*‖ ≤ *K*, and that, if *a<sub>n</sub>v* → *w*, then also ‖*a<sub>n</sub>v*‖<sub>*W*</sub> → ‖*w*‖<sub>*W*</sub>. Hence ‖*av*‖<sub>*W*</sub> = lim<sub>*n*</sub> ‖*a<sub>n</sub>v*‖<sub>*W*</sub> ≤ *K*‖*v*‖<sub>*V*</sub>, so *a* ∈ *B*(*V*,*W*). Finally, *a<sub>n</sub>* → *a*, since for ‖*v*‖<sub>*V*</sub> ≤ 1 and, given ε > 0, the usual *N* for which ‖*a<sub>n</sub>* − *a<sub>m</sub>*‖ < ε/2 for all *n*,*m* > *N* and ‖*av* − *a<sub>m</sub>v*‖<sub>*W*</sub> < ε/2 for all *m* > *N*,

$$\|av - a\_n v\|\_W \le \|av - a\_m v\|\_W + \|a\_m v - a\_n v\|\_W < \tfrac{1}{2} \varepsilon + \tfrac{1}{2} \varepsilon = \varepsilon.\tag{B.100}$$

Since this holds for any *v* ∈ *V* with ‖*v*‖<sub>*V*</sub> ≤ 1, eq. (B.99) gives ‖*a* − *a<sub>n</sub>*‖ ≤ ε. □

Clearly, if *a* ∈ *B*(*V*,*W*), then one has the useful estimate, cf. (A.20),

$$\|av\|\_{W} \le \|a\| \|v\|\_{V},\tag{B.101}$$

and if *W* = *V* and *a*,*b* ∈ *B*(*V*) ≡ *B*(*V*,*V*), we also have (cf. (A.21))

$$\|ab\| \le \|a\| \|b\|. \tag{B.102}$$

Indeed, *B*(*V*) is a *Banach algebra*, which is just to say that it is a Banach space as well as an algebra, in which (B.102) holds (a C\*-algebra will be a special case).
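The estimates (B.101) and (B.102) can be verified numerically in a finite-dimensional case. The sketch below assumes *V* = R<sup>2</sup> with the sup-norm, for which the operator norm (B.99) of a matrix is its maximal absolute row sum (a standard fact that is not derived in the text); the matrices and the vector are hypothetical.

```python
def sup_norm(v):
    """The sup-norm on R^n."""
    return max(abs(x) for x in v)

def apply(a, v):
    """The vector av for a matrix a given as a list of rows."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]

def compose(a, b):
    """The matrix product ab, i.e. (ab)v = a(bv)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def op_norm(a):
    """Operator norm (B.99) of a on (R^n, sup-norm): the maximal absolute row sum.
    The supremum over the unit ball is attained at a vector with entries +1 or -1."""
    return max(sum(abs(x) for x in row) for row in a)

a = [[1.0, -2.0], [0.5, 3.0]]
b = [[0.0, 1.0], [-1.0, 2.0]]
v = [0.3, -1.0]

# (B.101): ||av|| <= ||a|| ||v||
assert sup_norm(apply(a, v)) <= op_norm(a) * sup_norm(v)
# (B.102): ||ab|| <= ||a|| ||b||
assert op_norm(compose(a, b)) <= op_norm(a) * op_norm(b)
# The supremum in (B.99) is attained here at the sign vector (1, 1).
assert sup_norm(apply(a, [1.0, 1.0])) == op_norm(a)
```

Here the inequality (B.102) is strict (9.5 against 10.5), which illustrates that submultiplicativity is in general not an equality.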

Returning to our opening theme, the level of discourse now suddenly becomes quite advanced. We start with Banach's famous *Open Mapping Theorem*.

Theorem B.34. *If V and W are Banach spaces and a* ∈ *B*(*V*,*W*) *is surjective, then a is open (in mapping open sets to open sets).*

*Proof.* For fixed *u* ∈ *V* we write *V<sub>r</sub>*(*u*) = {*v* ∈ *V* : ‖*u* − *v*‖ < *r*} for the open *r*-ball around *u*, with *V<sub>r</sub>* ≡ *V<sub>r</sub>*(0) and hence *V<sub>r</sub>*(*u*) = *u* + *V<sub>r</sub>*. Furthermore, the closure of *U* ⊂ *V* is denoted by *U*<sup>−</sup>. Likewise for *W*. The theorem follows if *aV*<sub>1</sub> ≡ *a*(*V*<sub>1</sub>) ⊂ *W* contains an open ball *W<sub>s</sub>*, for some *s* > 0 (in which case, by linearity, *aV<sub>r</sub>* contains an open ball *W<sub>rs</sub>* for any *r* > 0). By the theory of metric spaces, some subset *U* ⊂ *V* is open iff for any *u* ∈ *U* there is *r* > 0 such that *V<sub>r</sub>*(*u*) ⊂ *U*. Then *aU* contains the open set *W<sub>rs</sub>*(*au*), and since *au* ∈ *aU* is arbitrary, *aU* is open by the same criterion.

To prove that *aV*<sub>1</sub> contains an open ball, first note that since *a* : *V* → *W* is surjective, *W* = ∪<sub>*n*</sub> *aV<sub>n</sub>*, so that by the Baire Category Theorem (which applies because Banach spaces are complete metric spaces by definition) some (*aV<sub>n</sub>*)<sup>−</sup> contains an open set, and hence an open ball. Since *a* is linear this must then be true for all *n*; let us take *n* = 1, so that *W*<sub>ε</sub>(*w*<sub>0</sub>) ⊂ (*aV*<sub>1</sub>)<sup>−</sup> for some *w*<sub>0</sub> ∈ (*aV*<sub>1</sub>)<sup>−</sup>. Since any point in the closure of some *U* ⊂ *W* can be approximated by points in *U*, there is *w*<sub>1</sub> ∈ *aV*<sub>1</sub> such that ‖*w*<sub>1</sub> − *w*<sub>0</sub>‖ < ½ε. Hence for any *w* ∈ *W*<sub>ε/2</sub> we have

$$\|(w\_1 - w) - w\_0\| \le \|w\_1 - w\_0\| + \|w\| < \tfrac{1}{2} \varepsilon + \tfrac{1}{2} \varepsilon = \varepsilon,\tag{B.103}$$

so *w*<sub>1</sub> − *w* ∈ *W*<sub>ε</sub>(*w*<sub>0</sub>) and hence *w*<sub>1</sub> − *w* ∈ (*aV*<sub>1</sub>)<sup>−</sup>. Similarly, *w*<sub>1</sub> + *w* ∈ (*aV*<sub>1</sub>)<sup>−</sup>. Since *w* = ½(*w*<sub>1</sub> + *w*) − ½(*w*<sub>1</sub> − *w*), we obtain *w* ∈ (*aV*<sub>1</sub>)<sup>−</sup>, for if *x*, *y* ∈ (*aV*<sub>1</sub>)<sup>−</sup>, then we have ½(*x* ± *y*) ∈ (*aV*<sub>1</sub>)<sup>−</sup>. Since *w* ∈ *W*<sub>ε/2</sub> was arbitrary, it follows that *W*<sub>ε/2</sub> ⊂ (*aV*<sub>1</sub>)<sup>−</sup>.

To produce an open ball in *aV*<sub>1</sub> rather than in its closure, let *w*′<sub>0</sub> ∈ *W*<sub>ε/4</sub>, so that 2*w*′<sub>0</sub> ∈ *W*<sub>ε/2</sub>. Hence there exists *w*′<sub>1</sub> ∈ *aV*<sub>1</sub> such that ‖2*w*′<sub>0</sub> − *w*′<sub>1</sub>‖ < ε/4. And because 2(2*w*′<sub>0</sub> − *w*′<sub>1</sub>) ∈ *W*<sub>ε/2</sub>, there exists *w*′<sub>2</sub> ∈ *aV*<sub>1</sub> such that ‖2(2*w*′<sub>0</sub> − *w*′<sub>1</sub>) − *w*′<sub>2</sub>‖ < ε/4, *et cetera*. Because 2(2(2*w*′<sub>0</sub> − *w*′<sub>1</sub>) − *w*′<sub>2</sub>) ∈ *W*<sub>ε/2</sub>, there exists *w*′<sub>3</sub> ∈ *aV*<sub>1</sub>,...

Repeating this *N* times, we obtain a sequence (*w*′<sub>*n*</sub>) in *aV*<sub>1</sub> such that for any *N* ∈ N,

$$\|2^{N}w'\_{0} - 2^{N-1}w'\_{1} - \dots - 2^{1}w'\_{N-1} - 2^{0}w'\_{N}\| < \varepsilon/4,\tag{B.104}$$

i.e., ‖*w*′<sub>0</sub> − ∑<sub>*n*=1</sub><sup>*N*</sup> 2<sup>−*n*</sup>*w*′<sub>*n*</sub>‖ < 2<sup>−*N*−2</sup>ε. Letting *N* → ∞ then gives *w*′<sub>0</sub> = ∑<sub>*n*=1</sub><sup>∞</sup> 2<sup>−*n*</sup>*w*′<sub>*n*</sub>.

Since *w*′<sub>*n*</sub> ∈ *aV*<sub>1</sub>, there is a corresponding sequence (*v*′<sub>*n*</sub>) in *V*<sub>1</sub> such that *av*′<sub>*n*</sub> = *w*′<sub>*n*</sub>, with ‖*v*′<sub>*n*</sub>‖ < 1 for each *n*. Hence we may estimate ∑<sub>*n*=1</sub><sup>∞</sup> ‖2<sup>−*n*</sup>*v*′<sub>*n*</sub>‖ < ∑<sub>*n*=1</sub><sup>∞</sup> 2<sup>−*n*</sup> = 1, so the series ∑<sub>*n*</sub> 2<sup>−*n*</sup>*v*′<sub>*n*</sub> in *V* is absolutely convergent and hence convergent. Since *V* is assumed complete, it has a limit *v*′ = ∑<sub>*n*=1</sub><sup>∞</sup> 2<sup>−*n*</sup>*v*′<sub>*n*</sub>. Since

$$\left\| \sum\_{n=1}^{N} 2^{-n} v'\_n \right\| \le \sum\_{n=1}^{N} \|2^{-n} v'\_n\| \le \sum\_{n=1}^{\infty} \|2^{-n} v'\_n\| < 1,$$

letting *N* → ∞ gives ‖*v*′‖ ≤ 1, or *v*′ ∈ *V*<sub>1</sub><sup>−</sup>. Since *a* is bounded and hence continuous,

$$av'=a\left(\sum\_{n=1}^{\infty}2^{-n}v'\_n\right)=\sum\_{n=1}^{\infty}2^{-n}av'\_n=\sum\_{n=1}^{\infty}2^{-n}w'\_n=w'\_0.\tag{B.105}$$

We now recall that *w*′<sub>0</sub> ∈ *W*<sub>ε/4</sub> was arbitrary, so we have shown that *W*<sub>ε/4</sub> ⊂ *a*(*V*<sub>1</sub><sup>−</sup>). By linearity of *a*, it follows that *W<sub>s</sub>* ⊂ *aV*<sub>1</sub> for any *s* < ε/4. □

Corollary B.35. *Let V and W be Banach spaces. The (set-theoretic) inverse a*<sup>−1</sup> *of a* bijective *morphism a* ∈ *B*(*V*,*W*) *is automatically linear and bounded.*

In other words, *a*<sup>−1</sup> lies in *B*(*W*,*V*). Corollary B.35 suggests defining two Banach spaces *V* and *W* to be isomorphic if there exists a bijective morphism *a* ∈ *B*(*V*,*W*) (in which case they would be isomorphic as objects in the category of Banach spaces with bounded linear maps). However, we often prefer to use a sharper notion.

Definition B.36. *Let V and W be normed spaces.*

*1. An* isometry *from V to W is a linear map u* : *V* → *W satisfying*

$$\|uv\|\_{W} = \|v\|\_{V}, \ v \in V.\tag{B.106}$$

*2. An* isometric isomorphism *from V to W is a surjective isometry u* : *V* → *W.*

Since an isometry is clearly bounded as well as injective, by Corollary B.35 a surjective isometry has a bounded linear inverse, which is easily seen to be isometric, too. In practice, it is the conditions in Definition B.36 that one typically checks.
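Two elementary examples may help to separate the two parts of Definition B.36. The sketch below assumes Euclidean norms on R<sup>2</sup> and R<sup>3</sup>; the maps `rotate` and `shift` are hypothetical illustrations chosen for this purpose.

```python
import math

def norm2(v):
    """The Euclidean norm on R^n."""
    return math.sqrt(sum(x * x for x in v))

def rotate(theta, v):
    """A surjective isometry of (R^2, Euclidean norm): rotation by the angle theta."""
    x, y = v
    return (math.cos(theta) * x - math.sin(theta) * y,
            math.sin(theta) * x + math.cos(theta) * y)

def shift(v):
    """An isometry R^2 -> R^3 that is not surjective: prepend a zero coordinate."""
    return (0.0,) + tuple(v)

v = (3.0, -4.0)
theta = 0.7

# (B.106) for the rotation, whose inverse (rotation by -theta) is isometric too
w = rotate(theta, v)
assert abs(norm2(w) - norm2(v)) < 1e-12
back = rotate(-theta, w)
assert all(abs(p - q) < 1e-12 for p, q in zip(back, v))

# (B.106) for the shift; its range misses every vector with nonzero first coordinate
assert norm2(shift(v)) == norm2(v)
assert shift(v)[0] == 0.0
```

The rotation is an isometric isomorphism in the sense of part 2, whereas the shift satisfies part 1 only.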

Nonetheless, the non-isometric case is also quite important. As a case in point, we prove a classical result of functional analysis, called the *Closed Graph Theorem*. In preparation, note that two normed spaces *V*,*W* define a third one, called their *direct sum V* ⊕ *W*, which as a set is *V* × *W*, turned into a vector space by the operations (*v*<sub>1</sub>,*w*<sub>1</sub>) + (*v*<sub>2</sub>,*w*<sub>2</sub>) = (*v*<sub>1</sub> + *v*<sub>2</sub>, *w*<sub>1</sub> + *w*<sub>2</sub>) and λ(*v*,*w*) = (λ*v*,λ*w*), etc., with norm

$$\|(v, w)\| = \|v\|\_{V} + \|w\|\_{W}.\tag{B.107}$$

It is easily shown that if *V* and *W* are Banach spaces, then so is *V* ⊕ *W*.

Furthermore, if *a* : *V* → *W* is any linear map, the *graph* of *a* is the vector space

$$G(a) = \{ (v, av), v \in V \} \subset V \oplus W. \tag{B.108}$$

If *a* is bounded, then *G*(*a*) is closed (i.e. in the norm inherited from the Banach space *V* ⊕*W*). The converse, then, is the *Closed Graph Theorem*:

Theorem B.37. *Let V and W be Banach spaces and let a* : *V* →*W be a linear map. If the graph G*(*a*) *is closed (in the norm inherited from V* ⊕*W ), then a is bounded.*

*Proof.* Since *G*(*a*) is a closed subspace of the Banach space *V* ⊕ *W*, it is itself a Banach space. Let *b* : *G*(*a*) → *V* be the linear map (*v*,*av*) ↦ *v*, which is clearly a bijection, with inverse *b*<sup>−1</sup> : *V* → *G*(*a*), *b*<sup>−1</sup>(*v*) = (*v*,*av*). Furthermore, ‖*b*(*v*,*av*)‖<sub>*V*</sub> = ‖*v*‖<sub>*V*</sub> ≤ ‖*v*‖<sub>*V*</sub> + ‖*av*‖<sub>*W*</sub> = ‖(*v*,*av*)‖, so *b* is bounded. Hence Corollary B.35 makes *b*<sup>−1</sup> bounded as well, i.e., ‖*b*<sup>−1</sup>(*v*)‖ ≤ *C*‖*v*‖<sub>*V*</sub> for some *C* > 0. Hence ‖(*v*,*av*)‖ = ‖*v*‖<sub>*V*</sub> + ‖*av*‖<sub>*W*</sub> ≤ *C*‖*v*‖<sub>*V*</sub>. So ‖*av*‖<sub>*W*</sub> ≤ (*C* − 1)‖*v*‖<sub>*V*</sub>, and hence *a* is bounded. □

#### B.8 The Hahn–Banach Theorem

In this section we present another traditional pillar of functional analysis.

Definition B.38. *A* sublinear functional *on a* real *vector space V is a map p* : *V* → R *that for each v*,*w* ∈ *V and scalars t* ≥ 0 *satisfies*

$$p(v + w) \le p(v) + p(w);\tag{B.109}$$

$$p(tv) = tp(v). \tag{B.110}$$

We will deal with two examples of such functionals. One is simply a norm (even on a complex vector space, which in particular is a real vector space). For the other, recall that a subset *K* of a real vector space *V* is called *convex* if whenever *v*, *w* ∈ *K* and *t* ∈ (0,1), one has *tv* + (1−*t*)*w* ∈ *K*. Even without a topology on *V*, we can define an *interior point* of *K* (or indeed of any subset of *V*) as a point *v* ∈ *K* such that for each *v*′ ∈ *V* there is ε > 0 such that *v* + *tv*′ ∈ *K* for any 0 < *t* < ε. We denote the set of interior points of *K* by int(*K*). For example, if *V* is normed (with associated topology), or is the dual of a normed space equipped with the *w*∗-topology (or, even more generally, if *V* is a *topological vector space*, i.e., a vector space carrying a Hausdorff topology in which addition and scalar multiplication are continuous), then each point of an open set *U* is interior in the above sense, so that *U* = int(*U*).

Let *K* ⊂ *V* be convex and suppose it contains 0 as an interior point. Then the *Minkowski functional* (also called *gauge*) *p* : *V* → R<sup>+</sup> of *K* is defined by

$$p(v) = \inf\{a > 0 \mid v/a \in K\}.\tag{B.111}$$

Note that *p*(*v*) < ∞, because 0 ∈ *K* is interior, so that there is ε > 0 such that ε*v* ∈ *K*, and hence *a* = 1/ε lies in the set in (B.111). It is clear that if *v* ∈ *K*, then *a* = 1 lies in the set in (B.111), so that *p*(*v*) ≤ 1. As a simple example, for the (open or closed) unit ball *B* in a normed space (both of which are convex), we have *p*(*v*) = ‖*v*‖.
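As a numerical illustration of (B.111), the following Python sketch approximates the Minkowski functional by bisection and compares it with the norm in the case where *K* is the closed unit ball of the ℓ<sup>1</sup>-norm on R<sup>2</sup>. The function names, the search bound `hi` and the tolerance `tol` are assumptions made for this illustration only; the bisection relies on the fact that, for convex *K* containing 0, the set in (B.111) is an interval that is unbounded above.

```python
def minkowski_functional(in_K, v, hi=1e6, tol=1e-10):
    """Approximate p(v) = inf{a > 0 : v/a in K} by bisection.

    Assumes K is convex and contains 0 as an interior point, so that
    {a > 0 : v/a in K} is an interval unbounded above, and that p(v) <= hi.
    """
    if not in_K([x / hi for x in v]):
        raise ValueError("p(v) exceeds the search bound hi")
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_K([x / mid for x in v]):
            hi = mid   # mid lies in the set, so p(v) <= mid
        else:
            lo = mid   # mid does not lie in the set, so p(v) >= mid
    return hi

# Assumed example: K is the closed unit ball of the l^1-norm on R^2.
def in_l1_ball(v):
    return sum(abs(x) for x in v) <= 1.0

v = [3.0, -4.0]
p_v = minkowski_functional(in_l1_ball, v)
l1_norm = sum(abs(x) for x in v)
print(p_v, l1_norm)   # the two numbers should agree up to the tolerance
```

Replacing `in_l1_ball` by the membership test of any other convex set with 0 in its interior gives the corresponding gauge, as long as the bound `hi` is large enough.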

Proposition B.39. *Let K* ⊂ *V be convex and let* 0 ∈ *K be an interior point of K. Then the Minkowski functional p of K satisfies* (B.109) *-* (B.110)*. Furthermore, we may recover the set* int(*K*) *of interior points of K through*

$$\text{int}(K) = \{v \in V \mid p(v) < 1\}. \tag{B.112}$$

*Conversely, if some function p* : *V* → R<sup>+</sup> *satisfies* (B.109) *-* (B.110)*, then the set*

$$K = \{ v \in V \mid p(v) \le 1 \}\tag{B.113}$$

*is convex, with interior given by* (B.112)*.*

For example, if *K* is open (in a topological vector space), then (B.112) equals *K*.

*Proof.* Given (B.111), eq. (B.110) is obvious. To prove (B.109), find *a* > 0 and *b* > 0 such that *v*/*a* ∈ *K* and *w*/*b* ∈ *K*; cf. the comment after (B.111). Since *K* is convex, with *t* = *a*/(*a*+*b*) and hence 1−*t* = *b*/(*a*+*b*) we have *t* · *v*/*a*+ (1−*t*)·*w*/*b* ∈ *K*. Hence *p*(*t* · *v*/*a*+ (1−*t*)·*w*/*b*) ≤ 1, which, using (B.110), reads *p*(*v*+*w*) ≤ *a*+*b*. Taking the infimum over *a* and *b* constrained by *v*/*a* ∈ *K*, *w*/*b* ∈ *K* then turns the right-hand side into *p*(*v*) + *p*(*w*), so that we have proved (B.109).

The proof of the converse claims is almost trivial, except perhaps for the last claim. To prove that *p*(*v*) < 1 implies *v* ∈ int(*K*), we note that for any *v*′ ∈ *V* and ε > 0, from (B.109) - (B.110) we have *p*(*v* + ε*v*′) ≤ *p*(*v*) + ε*p*(*v*′). If *p*(*v*′) = 0, this gives *p*(*v* + ε*v*′) ≤ *p*(*v*) < 1, so that *v* + ε*v*′ ∈ *K*. If not, assume *p*(*v*) = 1 − δ for some δ ∈ (0,1], and we find that *p*(*v* + ε*v*′) < 1 for any 0 < ε < δ/*p*(*v*′). □

Having motivated Definition B.38, we now state the *Hahn–Banach Theorem*:

Theorem B.40. *Let V be a real vector space equipped with a sublinear functional p, and let W* ⊂ *V be a linear subspace carrying a linear map* ϕ<sub>*W*</sub> : *W* → R *that is dominated by p in the sense that for each w* ∈ *W we have* ϕ<sub>*W*</sub>(*w*) ≤ *p*(*w*)*.*

*Then* ϕ<sub>*W*</sub> *has a linear extension* ϕ : *V* → R *that for each v* ∈ *V satisfies*

$$
\varphi(v) \le p(v). \tag{B.114}
$$

*Proof.* Take *v*<sub>1</sub> ∈ *V*, *v*<sub>1</sub> ∉ *W*, and extend ϕ<sub>*W*</sub> to *W* ⊕ R · *v*<sub>1</sub> by

$$
\varphi(w + tv_1) = \varphi_W(w) + t\varphi(v_1),
\tag{B.115}
$$

with *t* ∈ R and ϕ(*v*<sub>1</sub>) to be determined. In order to satisfy (B.114), we need

$$
\varphi(w + tv_1) \le p(w + tv_1),
\tag{B.116}
$$

for each *w* ∈ *W* and *t* ∈ R. Using (B.110), this is true iff it is true for *t* = ±1, which yields two conditions (in two variables *w*, *w*′ ∈ *W*), which may jointly be written as

$$
\varphi(w') - p(w' - v_1) \le \varphi(v_1) \le p(w + v_1) - \varphi(w).\tag{B.117}
$$

Since ϕ is linear, this can obviously be satisfied by some ϕ(*v*<sub>1</sub>) ∈ R iff

$$\varphi(w + w') \le p(w + v_1) + p(w' - v_1),\tag{B.118}$$

which is indeed the case: for by assumption we have ϕ(*w* + *w*′) ≤ *p*(*w* + *w*′), whence

$$\varphi(w + w') \le p(w + v_1 + w' - v_1) \le p(w + v_1) + p(w' - v_1),\tag{B.119}$$

where we used (B.109). Hence any choice of ϕ(*v*<sub>1</sub>) that satisfies (B.117) provides an extension (B.115) of ϕ to *W* ⊕ R · *v*<sub>1</sub>, which by construction satisfies (B.114).

Lovers of Zorn's Lemma may now complete the proof as follows. Let *F* be the set of all pairs (ϕ, *X*), where *X* ⊆ *V* is a linear subspace containing *W* and ϕ : *X* → R is a linear extension of ϕ<sub>*W*</sub> that satisfies (B.114) on *X*. We partially order *F* by

$$(\varphi_1, X_1) \leq (\varphi_2, X_2) \text{ iff } X_1 \subseteq X_2 \text{ and } \varphi_1(v) = \varphi_2(v) \ \forall v \in X_1. \tag{B.120}$$

Then *F* is clearly nonempty, and every totally ordered subset {(ϕ<sub>λ</sub>, *X*<sub>λ</sub>)} of *F* has an upper bound (ϕ, *X*), where *X* = ∪<sub>λ</sub>*X*<sub>λ</sub> and ϕ(*v*) = ϕ<sub>λ</sub>(*v*) whenever *v* ∈ *X*<sub>λ</sub>.

Thus Zorn's Lemma applies, "giving" a maximal element (ϕ, *Z*). If *Z* ≠ *V*, one may extend ϕ beyond *Z* by the first step of the proof (applied with *Z* in place of *W*), contradicting maximality of (ϕ, *Z*). Hence *Z* = *V*, and ϕ is the desired functional. □

If *V* is finite-dimensional, then Zorn's Lemma is unnecessary, and a constructive proof may be given by repeating the first step of the proof a finite number of times.
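To make the first step of the proof concrete, here is a minimal Python sketch in the simplest nontrivial case, under assumptions chosen purely for illustration: *V* = R<sup>2</sup>, *p* is the ℓ<sup>1</sup>-norm, *W* = R · (1,0) with ϕ<sub>*W*</sub>(*s*, 0) = *s*, and *v*<sub>1</sub> = (0,1). The supremum and infimum implicit in (B.117) are only sampled on a finite grid of points of *W*, which happens to be exact in this example but is no substitute for a proof.

```python
def p(v):
    """Sublinear functional on R^2: the l^1-norm (an assumed example)."""
    return abs(v[0]) + abs(v[1])

def phi_W(s):
    """phi_W on W = R*(1, 0), with (s, 0) identified with s; dominated by p."""
    return s

v1 = (0.0, 1.0)                              # a vector outside W
grid = [k / 4.0 for k in range(-40, 41)]     # sample points w = (s, 0) in W

# Left- and right-hand sides of (B.117), optimised over the sample grid.
lower = max(phi_W(s) - p((s - v1[0], -v1[1])) for s in grid)
upper = min(p((s + v1[0], v1[1])) - phi_W(s) for s in grid)

def extend(c):
    """The extension (B.115): phi(s, t) = phi_W(s) + t*c, where c = phi(v1)."""
    return lambda v: phi_W(v[0]) + v[1] * c

phi = extend(0.5 * (lower + upper))          # any c in [lower, upper] would do
print(lower, upper)
```

In this example every value of ϕ(*v*<sub>1</sub>) between `lower` and `upper` gives a dominated extension, which also shows that the extension in Theorem B.40 is far from unique in general.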

Corollary B.41. *Let V be a normed vector space, with dual V*∗*, and let W* ⊂ *V be a linear subspace (inheriting the norm from V, with associated dual W*∗*). Then each* ϕ<sub>*W*</sub> ∈ *W*∗ *has an extension* ϕ ∈ *V*∗ *to V with the same norm.*

*Proof.* We take *p*(*v*) = ‖ϕ<sub>*W*</sub>‖ ‖*v*‖, which clearly satisfies (B.109) - (B.110). If *V* is real, Theorem B.40 gives ϕ : *V* → R satisfying ϕ(*v*) ≤ ‖ϕ<sub>*W*</sub>‖ ‖*v*‖ for each *v* ∈ *V*; applying this to −*v* as well gives |ϕ(*v*)| ≤ ‖ϕ<sub>*W*</sub>‖ ‖*v*‖, and hence ‖ϕ‖ ≤ ‖ϕ<sub>*W*</sub>‖. But ‖ϕ<sub>*W*</sub>‖ ≤ ‖ϕ‖ since *W* ⊂ *V*, hence ‖ϕ‖ = ‖ϕ<sub>*W*</sub>‖.

If *V* is complex, we first regard it as a real vector space, take the real part ϕ′<sub>*W*</sub> of ϕ<sub>*W*</sub>, and isometrically extend ϕ′<sub>*W*</sub> to a real-linear functional ϕ′ : *V* → R as above, so that ‖ϕ′‖ = ‖ϕ′<sub>*W*</sub>‖. Then define ϕ : *V* → C by

$$
\varphi(v) = \varphi'(v) - i\varphi'(iv). \tag{B.121}
$$

One checks that ϕ((*s* + *it*)*v*) = (*s* + *it*)ϕ(*v*). Since ϕ′(*v*) is the real part of ϕ(*v*), with |ϕ(*v*)|<sup>2</sup> = |ϕ′(*v*)|<sup>2</sup> + |ϕ′(*iv*)|<sup>2</sup>, we have |ϕ′(*v*)| ≤ |ϕ(*v*)| and hence ‖ϕ′‖ ≤ ‖ϕ‖. Conversely, for any *v* with ϕ(*v*) ≠ 0, take *z* = |ϕ(*v*)|/ϕ(*v*), so that |ϕ(*v*)| = ϕ(*zv*). Hence ϕ(*zv*) is real and therefore it is equal to its real part, so that, since |*z*| = 1,

$$\varphi(zv) = \varphi'(zv) \le \|\varphi'\| \|zv\| = \|\varphi'\| \|v\|.$$

Therefore, ‖ϕ‖ ≤ ‖ϕ′‖, and hence ‖ϕ‖ = ‖ϕ′‖. The same computation applies to ϕ<sub>*W*</sub>, yielding ‖ϕ<sub>*W*</sub>‖ = ‖ϕ′<sub>*W*</sub>‖, so that finally ‖ϕ‖ = ‖ϕ′‖ = ‖ϕ′<sub>*W*</sub>‖ = ‖ϕ<sub>*W*</sub>‖. □
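The passage between a complex-linear functional and its real part in (B.121) is easily checked numerically. The following Python sketch does so for a functional on C<sup>2</sup>; the coefficients and the test vector are arbitrary assumptions, and the names `make_phi`, `phi_real` and `phi_rebuilt` are introduced here for illustration only.

```python
def make_phi(coeffs):
    """Complex-linear functional phi(v) = sum_k c_k v_k on C^n."""
    return lambda v: sum(c * x for c, x in zip(coeffs, v))

phi = make_phi([2 - 1j, 0.5 + 3j])   # assumed coefficients

def phi_real(v):
    """The real part phi' of phi, which is a real-linear functional."""
    return phi(v).real

def phi_rebuilt(v):
    """Right-hand side of (B.121): phi'(v) - i * phi'(i v)."""
    iv = [1j * x for x in v]
    return phi_real(v) - 1j * phi_real(iv)

v = [1 + 2j, -3 + 0.5j]
print(phi(v), phi_rebuilt(v))        # the two values should coincide
```

The check works because ϕ(*iv*) = *i*ϕ(*v*), so that the real part of ϕ(*iv*) is minus the imaginary part of ϕ(*v*).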

In fact, this trick to pass from the real to the complex case was overlooked by Hahn and Banach themselves, whose arguments were much more involved.

As to Zorn's Lemma, if *V* is infinite-dimensional but still separable, using (countable) induction one may construct a sequence (*v*<sub>*n*</sub>) of linearly independent unit vectors in *V* \ *W*, such that *V* is the closed linear span of *W* and the *v*<sub>*n*</sub>. The above procedure then gives ϕ on the real algebraic linear span of *W* and the *v*<sub>*n*</sub>, which is bounded by construction and may be extended to all of *V* by continuity. However, the construction of (*v*<sub>*n*</sub>) still requires a weaker form of the Axiom of Choice (the full axiom being equivalent to Zorn's Lemma), namely the so-called *Axiom of Dependent Choice*.

In the situation of Corollary B.41, the extension ϕ is unique (for every subspace *W* and every ϕ<sub>*W*</sub>) iff the dual space *V*∗ is *strictly convex*. Here a normed space is called strictly convex if its unit sphere is strictly convex, i.e., if ‖*v*‖ = ‖*w*‖ = 1 for *v* ≠ *w* and *t* ∈ (0,1), then ‖*tv* + (1−*t*)*w*‖ < 1. Equivalently, if ‖*v*‖ = ‖*w*‖ = ½‖*v* + *w*‖, then *v* = *w*. This is the case, for example, if *V* is a Hilbert space *H*, whose dual is again a Hilbert space (cf. Theorem B.66); strict convexity of Hilbert spaces easily follows from the comment after (A.3). Indeed, anticipating Theorem B.66, if *W* ⊂ *H* is closed (as we may assume, since ϕ<sub>*W*</sub> is continuous), we may identify ϕ<sub>*W*</sub> : *W* → C with some vector ϕ<sub>*W*</sub> ∈ *W*, and if we do, the unique extension ϕ : *H* → C corresponds to the same vector ϕ<sub>*W*</sub>, now regarded as an element of *H*.

Corollary B.42. *Let V be a normed vector space, with dual V*∗*, and fix some nonzero vector v*<sub>0</sub> ∈ *V. There exists a functional* ϕ ∈ *V*∗ *such that*

$$\varphi(v_0) = \|v_0\|;\tag{B.122}$$

$$\|\varphi\| = 1.\tag{B.123}$$

*Proof.* Take *W* = C · *v*<sub>0</sub> and ϕ<sub>*W*</sub>(λ*v*<sub>0</sub>) = λ‖*v*<sub>0</sub>‖ in Corollary B.41, so that ‖ϕ<sub>*W*</sub>‖ = 1 by construction. □

We now turn to an application of Theorem B.40 to convexity theory, which we will need for the Krein–Milman Theorem (and hence eventually for the existence of pure states on C\*-algebras). Although we will apply the lemma below to the dual of a normed vector space in its *w*∗-topology, the setting is more general; all we need is a few easily established facts for topological vector spaces *V*, namely that if *U* ⊂ *V* is open, then so is every translate *U* + *v* of *U*, and so is ε*U*, for any ε > 0, and hence also (−ε*U*)∩(ε*U*). Furthermore, a linear map ϕ : *V* → R is continuous iff it is continuous at 0. These elementary facts will be used in the proof below.

Theorem B.43. *Let V be a real topological vector space and let A and B be disjoint nonempty convex subsets of V , with A open. Then there is a continuous linear functional* ϕ :*V* → R *and some t* ∈ R *such that* ϕ(*a*) < *t* ≤ ϕ(*b*) *for all a* ∈ *A*,*b* ∈ *B.*

*Proof.* Form *C* = *A* − *B* = {*a* − *b* | *a* ∈ *A*, *b* ∈ *B*}, which is convex and open (as it is the union of the open sets *A* − *b* over *b* ∈ *B*). Then move *C* so that it contains 0, by taking any *a*<sub>0</sub> ∈ *A* and *b*<sub>0</sub> ∈ *B* and defining *K* = *C* + *v*<sub>0</sub>, with *v*<sub>0</sub> = *b*<sub>0</sub> − *a*<sub>0</sub>. Thus *K* has its associated Minkowski functional *p*<sub>*K*</sub>, cf. (B.111). Noting that *v*<sub>0</sub> ∉ *K* (since *A* ∩ *B* = ∅), we have *p*<sub>*K*</sub>(*v*<sub>0</sub>) ≥ 1. With *W* = R · *v*<sub>0</sub>, define a functional ϕ<sub>*W*</sub> : *W* → R by ϕ<sub>*W*</sub>(*sv*<sub>0</sub>) = *s* for *s* ∈ R. This implies ϕ<sub>*W*</sub>(*v*) ≤ *p*<sub>*K*</sub>(*v*) for *v* ∈ R · *v*<sub>0</sub>: if *v* = *sv*<sub>0</sub> with *s* ≥ 0, this is obvious from (B.110) and ϕ<sub>*W*</sub>(*v*<sub>0</sub>) = 1, and if *s* < 0, then ϕ<sub>*W*</sub>(*v*) < 0 whereas *p*<sub>*K*</sub>(*v*) ≥ 0. We now use Theorem B.40 to extend ϕ<sub>*W*</sub> to a functional ϕ : *V* → R satisfying (B.114), which implies ϕ(*v*) ≤ *p*<sub>*K*</sub>(*v*) < 1 for any *v* ∈ *K* (recall that *K* is open, so that *K* = int(*K*), cf. (B.112)). Taking *v* = *a* − *b* + *v*<sub>0</sub> gives ϕ(*a*) < ϕ(*b*) for any *a* ∈ *A*, *b* ∈ *B*. Taking *t* = inf{ϕ(*b*) | *b* ∈ *B*}, the last claim of the theorem follows, where the inequality ϕ(*a*) < *t* is strict because ϕ(*A*) ⊂ R is open (as *A* is open and ϕ ≠ 0). Finally, since ϕ(*v*) < 1 for each *v* ∈ *K*, we have ϕ<sup>−1</sup>(−ε, ε) ⊃ (−ε*K*) ∩ (ε*K*), which is an open neighbourhood of 0, so that ϕ is continuous at 0 and hence continuous. □

This is the precise result we will need, but variations abound. If both *A* and *B* are open, in which case ϕ(*B*) is open, too, we have ϕ(*a*) < *t* < ϕ(*b*). If *V* is *locally convex*, in that its topology has a basis consisting of convex sets, then if *A* is closed and *B* is compact, there are disjoint open convex sets *A*′ and *B*′ containing *A* and *B*, respectively, so that also in this case we obtain the strict inequalities just mentioned.

Finally, even if *V* has no topology, we can still show that ϕ(*a*) ≤ *t* ≤ ϕ(*b*) on the mere assumption that *A* has an interior point (ϕ then lacks continuity, of course).

Results like this are often called *separation theorems*. Namely, a plane *H* in R<sup>3</sup> always takes the form *x*<sub>0</sub> + ker ϕ, where *x*<sub>0</sub> ∈ R<sup>3</sup> and ϕ : R<sup>3</sup> → R is a (nonzero) linear map. Equivalently, *H* = ϕ<sup>−1</sup>(*c*), where *c* = ϕ(*x*<sub>0</sub>). More generally, a *hyperplane* in a vector space *V* is a (nonempty) affine subspace of the form *H* = ϕ<sup>−1</sup>(*c*), where ϕ is a nonzero linear functional on *V*; clearly, *H* has codimension one and if *V* is a topological vector space and ϕ is continuous, then *H* is closed. So Theorem B.43 shows that *A and B are separated by the closed hyperplane H* = ϕ<sup>−1</sup>(*t*)*.*

#### B.9 Duality

We now turn to duality theory. For any normed (but not necessarily complete) vector space *V*, Theorem B.33 shows that the space *V*<sup>∗</sup> of all morphisms ϕ : *V* → C is a Banach space, called the *dual* of *V*. By (B.99), the norm of ϕ ∈ *V*<sup>∗</sup> is given by

$$\|\varphi\| = \sup\{ |\varphi(v)|, v \in V, \|v\|_{V} \le 1 \}. \tag{B.124}$$

Any morphism *a* ∈ *B*(*V*,*W*) induces a *dual morphism* *a*∗ ∈ *B*(*W*∗,*V*∗) by

$$(a^*\varphi)(v) = \varphi(av), \ \varphi \in W^*. \tag{B.125}$$

By definition of the various norms involved here, we find

$$\|a^*\| = \sup\{ |\varphi(av)|, \varphi \in W^*, v \in V, \|\varphi\| = \|v\| = 1 \}.\tag{B.126}$$

Since |ϕ(*av*)| ≤ ‖ϕ‖ ‖*av*‖ ≤ ‖*a*‖, this immediately yields

$$\|a^*\| \le \|a\|. \tag{B.127}$$

In fact, one even has

$$\|a^*\| = \|a\|,\tag{B.128}$$

but unexpectedly heavy machinery (namely the Hahn–Banach Theorem) is required to prove this. By Corollary B.42 (applied to *W*), for any *v* ∈ *V*, there exists ϕ ∈ *W*∗ with ‖ϕ‖ = 1 and ϕ(*av*) = ‖*av*‖, so from (B.126) we have ‖*a*∗‖ ≥ ‖*av*‖ for any *v* ∈ *V* with ‖*v*‖ = 1. Taking the supremum over such *v* and using (B.99) gives ‖*a*∗‖ ≥ ‖*a*‖. With our earlier (B.127), this gives (B.128).
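In finite dimensions the equality (B.128) can be verified by brute force. The Python sketch below does this for an arbitrarily chosen 2 × 3 matrix, regarded as a map *a* from R<sup>3</sup> to R<sup>2</sup>, both carrying the ℓ<sup>1</sup>-norm, so that the duals carry the ℓ<sup>∞</sup>-norm and *a*∗ is given by the transposed matrix. It uses the standard fact that a norm, being convex, attains its supremum over a unit ball with finitely many extreme points at one of those points; the matrix and all names are assumptions for illustration.

```python
from itertools import product

a = [[1.0, -2.0, 0.5],
     [3.0,  0.0, -1.0]]          # assumed example matrix, a : R^3 -> R^2
m, n = len(a), len(a[0])

def apply(mat, v):
    """Matrix-vector product."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in mat]

a_T = [[a[i][j] for i in range(m)] for j in range(n)]   # matrix of a^*

def norm1(v):
    return sum(abs(x) for x in v)

def norm_inf(v):
    return max(abs(x) for x in v)

# ||a||: sup of ||a v||_1 over the l^1 unit ball, attained at a basis vector.
basis = [[1.0 if k == j else 0.0 for k in range(n)] for j in range(n)]
norm_a = max(norm1(apply(a, e)) for e in basis)

# ||a^*||: sup of ||a^T phi||_inf over the l^inf unit ball, attained at a sign vector.
signs = [list(s) for s in product([-1.0, 1.0], repeat=m)]
norm_a_star = max(norm_inf(apply(a_T, s)) for s in signs)

print(norm_a, norm_a_star)       # both should equal the largest column sum of |a|
```

Of course, no Hahn–Banach Theorem is needed in this finite-dimensional setting, where norming functionals can be written down explicitly.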

Another application of Corollary B.42 lies in the *double dual V*∗∗ = (*V*∗)∗.

Proposition B.44. *For any normed space V, the map v* ↦ *v̂ from V to V*∗∗*, given by*

$$
\hat{v}(\varphi) = \varphi(v), \ \varphi \in V^*, \tag{B.129}
$$

*is isometric (and hence injective), mapping V onto a closed subspace V̂* ⊆ *V*∗∗*.*

This will follow from part 1 of the following consequence of Corollary B.42:

Corollary B.45. *Let V be a normed vector space, with dual V*∗*.*

*1. For any v* ∈ *V, one has*

$$\|v\| = \sup\{ |\varphi(v)|, \varphi \in V^*, \|\varphi\| = 1 \}. \tag{B.130}$$

*2. For any normed vector space W and any a* ∈ *B*(*V*,*W*)*, one has*

$$\|a\| = \sup\{ |\tau(av)|, v \in V, \tau \in W^*, \|v\| = \|\tau\| = 1 \}. \tag{B.131}$$

*Proof.* This is the proof of Corollary B.45.


*Proof.* And this is the proof of Proposition B.44. Note that ‖*v̂*‖ ≤ ‖*v*‖, since

$$\|\hat{v}\| = \sup\{ |\varphi(v)|, \varphi \in V^*, \|\varphi\| = 1 \},\tag{B.132}$$

and |ϕ(*v*)| ≤ ‖ϕ‖ ‖*v*‖ = ‖*v*‖. Corollary B.42 shows this bound is saturated. □

If *V* is finite-dimensional, Proposition B.44 gives a *natural* isomorphism *V*∗∗ ≅ *V*, in contrast with the "unnatural" isomorphisms *V*∗ ≅ *V* that require the choice of a basis (this terminology is made precise in category theory, see Appendix E).

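In addition to their (metric) topology coming from the norm, both *V* and *V*∗ naturally carry another topology (which will be of great importance in operator algebras and hence in quantum theory), defined in an almost identical way:

• The *weak topology* on *V* is the weakest topology in which each ϕ ∈ *V*∗ is continuous as a map *V* → C.

• The *weak∗ topology* (or *w*∗-topology) on *V*∗ is the weakest topology in which each *v̂*, *v* ∈ *V*, is continuous as a map *V*∗ → C, cf. (B.129).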


As their names suggest, these topologies are weaker than the norm topologies (except when *V* is finite-dimensional): indeed, if ‖*v*<sub>*n*</sub> − *v*‖ → 0 and ϕ ∈ *V*∗, then certainly |ϕ(*v*<sub>*n*</sub>) − ϕ(*v*)| ≤ ‖ϕ‖ ‖*v*<sub>*n*</sub> − *v*‖ → 0, and similarly for *V*∗. Consequently, a functional ϕ : *V* → C is norm-continuous if it is weakly continuous, but the converse may be false. Nonetheless, the weak dual of *V* *coincides* with its norm dual, and we combine this with a contrasting result for the weak∗ continuous functionals on *V*∗, which *en passant* locates the image *V̂* of *V* in *V*∗∗ under (B.129):

Proposition B.46. • *Any functional* ϕ ∈ *V*<sup>∗</sup> *is weakly continuous.*

• *A functional* θ ∈ *V*∗∗ *is weak*∗ *continuous iff* θ ∈ *V̂.*

We just mention that, because of Corollary B.45.2, this proposition is a special case of a very general result on topological vector spaces. Namely, let *V* and *W* be two vector spaces in *separating duality*, that is, there is a bilinear form

$$
\langle -, - \rangle : V \times W \to \mathbb{C}
$$

such that for each *v* ≠ *v*′ in *V* there is *w* ∈ *W* with ⟨*v*, *w*⟩ ≠ ⟨*v*′, *w*⟩, and for each *w* ≠ *w*′ in *W* there is *v* ∈ *V* with ⟨*v*, *w*⟩ ≠ ⟨*v*, *w*′⟩. Then *V* can be given the so-called σ(*V*,*W*)*-topology*, which is the weakest topology making each map *v* ↦ ⟨*v*, *w*⟩ continuous (*w* ∈ *W*), and *W* likewise carries the σ(*W*,*V*)-topology (sometimes also called the σ(*V*,*W*)-topology). In particular, the weak topology on *V* is just the σ(*V*,*V*∗)-topology, whereas the weak∗ topology on *V*∗ is the σ(*V*∗,*V*)-topology.

Theorem B.47. *Let V and W be vector spaces in separating duality. The space of* σ(*V*,*W*)*-continuous linear functionals on V coincides with W, and likewise, the space of* σ(*W*,*V*)*-continuous linear functionals on W coincides with V .*

This follows from elementary topology, and hence we omit the proof. From this point of view, the apparent difference between the two parts of Proposition B.46 originates in the fact that the weak∗ topology on *V*∗ is defined by its separating duality with *V* (or, equivalently, with *V̂*), rather than its separating duality with *V*∗∗.

Next, the *Banach–Alaoglu Theorem* shows an unexpected but important property of the weak∗ topology (at least when *V* is infinite-dimensional). For example, in quantum theory this theorem implies *w*∗-compactness of the state space, and this, in turn (through the Krein–Milman Theorem), leads to an abundance of pure states.

Theorem B.48. *If V is a normed vector space, any d-ball*

$$V_d^* = \{ \varphi \in V^*, \|\varphi\| \le d \} \tag{B.133}$$

*is compact in the weak*∗ *topology. More generally, if U is any neighborhood of* 0 *in V, the set V*∗<sub>*U*</sub> = {ϕ ∈ *V*∗, |ϕ(*v*)| ≤ *d* ∀*v* ∈ *U*} *is w*∗*-compact.*

Clearly, *U* = *V*<sub>1</sub> (the unit ball of *V*) yields (B.133). Omitting the proof, we just note that the first claim is based on the fact that *V*∗<sub>*d*</sub> is a closed subset of the space

$$\prod_{v \in V} \{ z \in \mathbb{C} \mid |z| \le d \|v\| \},$$

which is compact by Tychonoff's Theorem in topology (such reliance on awful nonconstructive results is unfortunately typical of traditional functional analysis).

After this abstract theory, it is high time to turn to some examples; see Table B.1.


Table B.1 Some Banach spaces and their duals, up to isometric isomorphism


$$C_b(X) \cong C(\beta X);\tag{B.134}$$

The compact Hausdorff space β*X* then has the feature that each *f* ∈ *C*<sub>*b*</sub>(*X*) has a unique continuous extension to β*X*. More generally, let *X* be a topological space. Provided it exists, "the" *Čech–Stone compactification* of *X*, denoted by β*X*, is a compact Hausdorff space together with a continuous map β<sub>*X*</sub> : *X* → β*X* such that for each compact Hausdorff space *K* and each continuous function *f* : *X* → *K* there is a *unique* continuous function β*f* : β*X* → *K* such that the following diagram commutes:

$$\begin{array}{ccc} X & \xrightarrow{\ \beta_X\ } & \beta X \\ & {\scriptstyle f}\searrow & \downarrow{\scriptstyle \beta f} \\ & & K \end{array} \tag{B.135}$$

This universal property makes β*X* unique up to homeomorphism (if it exists). If *X* is locally compact Hausdorff, then β*X* exists and β<sub>*X*</sub> is injective, making β<sub>*X*</sub>(*X*) ≅ *X* a dense subspace of β*X*. The above diagram then implies (B.134) through *f* ↦ β*f*; just take *K* = Ran(*f*)<sup>−</sup>, which is compact since *f* is bounded. Specializing this case to arbitrary sets *X* seen as discrete topological spaces, we can give an explicit description of β*X* as the set of all ultrafilters on *X*.

Definition B.49. *Let X be any set (seen as a discrete topological space).*


For discrete *X*, the set of all ultrafilters on *X*, endowed with the topology generated by all sets of the form *U*<sub>*A*</sub> = {*U* ∈ β*X* | *A* ∈ *U*}, where *A* ⊂ *X*, is a realization of the Čech–Stone compactification of *X*, and may therefore be denoted by β*X*. Note that each *U*<sub>*A*</sub> is clopen in β*X*. The embedding β<sub>*X*</sub> maps *x* ∈ *X* to the principal ultrafilter *U*<sub>*x*</sub>, and the continuous extension β*f* of *f* : *X* → *K* is given by


$$\beta f(U) \equiv \lim_{U} f = \bigcap_{A \in U} f(A)^{-}, \tag{B.136}$$

Theorem 4.24 then explains the pairing in no. 2 of Table B.1 (see also no. 5).

3. • This is a special case of no. 1, since ℓ<sub>0</sub>(*X*) = *C*<sub>0</sub>(*X*), given that *X* is discrete (as a topological space). We then use the (Lebesgue–)*Radon–Nikodym Theorem* of measure theory: if (*X*, Σ, μ) is a σ-finite measure space and ν is a complex measure on Σ that is *absolutely continuous* with respect to μ (i.e., μ(*A*) = 0 implies ν(*A*) = 0, *A* ∈ Σ), then there is a function *d*ν/*d*μ ∈ *L*<sup>1</sup>(*X*) such that

$$\int_{X} d\nu \, f = \int_{X} d\mu \, \frac{d\nu}{d\mu} f, \,\, f \in L^{\infty}(X). \tag{B.137}$$

In the case at hand, *X* is countable and μ is the counting measure, with respect to which any measure is absolutely continuous. This yields *M*(*X*) ≅ ℓ<sup>1</sup>(*X*).

• Secondly, this duality is also a special case of Theorem B.27: as in (B.78),

$$\ell\_0(X) = \ell^\infty(X, \mathcal{P}\_f(X)),\tag{B.138}$$

so that bounded hermitian functionals ϕ : ℓ<sub>0</sub>(*X*) → C (which in this case correspond to bounded real-linear functionals ℓ<sub>0</sub>(*X*, R) → R) are given by

$$\varphi(g) = \lim_{n \to \infty} \int_{X} d\mu \, s_n,$$

where *g* ∈ ℓ<sub>0</sub>(*X*), (*s*<sub>*n*</sub>) is a sequence in Step(*X*, P<sub>*f*</sub>(*X*)), which simply consists of functions on *X* with finite support, and μ is a finitely additive bounded signed measure on P<sub>*f*</sub>(*X*), which is given by its values on the singletons {*x*}, *x* ∈ *X*, and hence is just a real-valued function

$$f(x) = \mu(\{x\});\tag{B.139}$$

boundedness of μ gives *f* ∈ ℓ<sup>1</sup>(*X*). Writing *X* = ∪<sub>*n*</sub>*X*<sub>*n*</sub>, where the *X*<sub>*n*</sub> are finite and *X*<sub>*n*</sub> ⊂ *X*<sub>*n*+1</sub> (e.g., for *X* = N one may take *X*<sub>*n*</sub> = {1,...,*n*}, so that ∑<sub>*x*∈*X*<sub>*n*</sub></sub> = ∑<sup>*n*</sup><sub>*x*=1</sub>), we may use *s*<sub>*n*</sub> = *f*|<sub>*X*<sub>*n*</sub></sub> on *X*<sub>*n*</sub> and *s*<sub>*n*</sub>(*x*) = 0 outside *X*<sub>*n*</sub>, which gives

$$\varphi(g) = \lim_{n \to \infty} \sum_{x \in X_n} f(x) g(x) = \sum_{x \in X} f(x) g(x). \tag{B.140}$$

One easily verifies that indeed ‖*f*‖<sub>1</sub> = ‖μ‖, since (B.56) - (B.57) yield

$$\|\mu\| = \sup\{\mu_+(A) + \mu_-(A) \mid A \in \mathcal{P}_f(X)\} = \sup\left\{\sum_{x \in A} |f(x)|, A \in \mathcal{P}_f(X)\right\},$$

whose right-hand side in turn is equal to ‖*f*‖<sub>1</sub>.

• As a third approach, we give a direct proof of the desired duality

$$\ell\_0(X)^\* \cong \ell^1(X). \tag{B.141}$$

To start, for *f* ∈ ℓ<sup>1</sup>(*X*) and *g* ∈ ℓ<sub>0</sub>(*X*), we define an expression ϕ<sub>*f*</sub>(*g*) by

$$\varphi_f(g) = \langle f, g \rangle \equiv \sum_{x \in X} f(x) g(x).\tag{B.142}$$

By the obvious estimate

$$|\varphi_f(g)| \le \|f\|_1 \|g\|_\infty,\tag{B.143}$$

which is Hölder's inequality for *p* = 1 and *q* = ∞, the sum (B.142) is absolutely convergent, and hence defines a linear map ϕ<sub>*f*</sub> : ℓ<sub>0</sub>(*X*) → C, which satisfies ‖ϕ<sub>*f*</sub>‖ ≤ ‖*f*‖<sub>1</sub>. Thus the map *f* ↦ ϕ<sub>*f*</sub> is well defined from ℓ<sup>1</sup>(*X*) to ℓ<sub>0</sub>(*X*)∗. To prove surjectivity of this map, for given ϕ ∈ ℓ<sub>0</sub>(*X*)∗, define *f* : *X* → C by

$$f(x) = \varphi(\delta_x).\tag{B.144}$$

It follows from continuity of ϕ that ϕ = ϕ<sub>*f*</sub>, cf. (B.140), but it remains to be shown that *f* ∈ ℓ<sup>1</sup>(*X*). To do so, for each *n* ∈ N we define ϕ<sub>*n*</sub> : ℓ<sub>0</sub>(*X*) → C by

$$\varphi_n(g) = \sum_{x \in X_n} f(x) g(x).\tag{B.145}$$

This operator is bounded, with

$$\|\varphi_n\| = \|s_n\|_1,\tag{B.146}$$

where *s*<sub>*n*</sub> was defined prior to (B.140). To see this, we have

$$\|\varphi_n\| \le \|s_n\|_1,\tag{B.147}$$

from (B.143), whereas the opposite inequality follows from a trick: define

$$g_n(x) = |f(x)| / f(x) \ (x \in X_n, f(x) \neq 0);\tag{B.148}$$

$$g_n(x) = 0 \ \text{(otherwise)},\tag{B.149}$$

so that, assuming ϕ ≠ 0, we have ‖*g*<sub>*n*</sub>‖<sub>∞</sub> = 1 and ϕ<sub>*n*</sub>(*g*<sub>*n*</sub>) = ‖*s*<sub>*n*</sub>‖<sub>1</sub>, and hence

$$\|\varphi_n\| \geq \|s_n\|_1. \tag{B.150}$$

Since ϕ(*g*) = ϕ<sub>*f*</sub>(*g*) is finite by assumption, as in (B.140) lim<sub>*n*→∞</sub> ϕ<sub>*n*</sub>(*g*) exists for each *g* ∈ ℓ<sub>0</sub>(*X*), so sup<sub>*n*</sub>{|ϕ<sub>*n*</sub>(*g*)|} < ∞. The Principle of Uniform Boundedness (cf. Theorem B.78 below) then gives sup<sub>*n*</sub>{‖ϕ<sub>*n*</sub>‖} < ∞, and this supremum equals sup<sub>*n*</sub>{‖*s*<sub>*n*</sub>‖<sub>1</sub>} = ‖*f*‖<sub>1</sub>.

Comparing the first two approaches, we see that bounded *finitely additive* measures on P<sub>*f*</sub>(*X*) bijectively correspond to bounded σ*-additive* measures on P(*X*), both of which in turn are given by positive functions *f* ∈ ℓ<sup>1</sup>(*X*).

4. This is similar to the third proof of the previous case. For *f* ∈ ℓ<sup>∞</sup>(*X*) and *g* ∈ ℓ<sup>1</sup>(*X*), we define ϕ<sub>*f*</sub>(*g*) by (B.142), and instead of (B.143) we now obtain

$$|\varphi_f(g)| \le \|f\|_{\infty} \|g\|_{1}.\tag{B.151}$$

Thus we have a map *f* ↦ ϕ<sub>*f*</sub> from ℓ<sup>∞</sup>(*X*) to ℓ<sup>1</sup>(*X*)∗, satisfying ‖ϕ<sub>*f*</sub>‖ ≤ ‖*f*‖<sub>∞</sub>. To prove surjectivity, for some ϕ ∈ ℓ<sup>1</sup>(*X*)∗ we once again define *f* : *X* → C by (B.144), so that ϕ = ϕ<sub>*f*</sub> by continuity. Then for any *x* ∈ *X*, we obtain |*f*(*x*)| ≤ ‖ϕ‖ ‖δ<sub>*x*</sub>‖<sub>1</sub> = ‖ϕ‖, so ‖*f*‖<sub>∞</sub> ≤ ‖ϕ‖ and hence ‖ϕ‖ = ‖*f*‖<sub>∞</sub>. In particular, *f* ∈ ℓ<sup>∞</sup>(*X*) and the bijection ϕ<sub>*f*</sub> ↔ *f* gives an isometric isomorphism à la (B.141):

$$\ell^1(X)^* \cong \ell^\infty(X). \tag{B.152}$$

5. Similar to no. 3, this is a special case of two more general dualities, namely

$$\ell^{\infty}(X)^{*} \cong M(\beta X);\tag{B.153}$$

$$\ell^{\infty}(X,\mathbb{R})^{*} \cong \mathrm{ba}(X,\mathcal{P}(X)),\tag{B.154}$$

cf. no. 2, and Theorem B.27, respectively. Thus bounded *finitely additive* measures μ on *X* (with underlying semiring R = P(*X*)) bijectively correspond to bounded σ*-additive* measures μ<sub>β</sub> on β*X* (equipped with the Borel σ-algebra) by

$$
\int_X d\mu \, f = \int_{\beta X} d\mu_{\beta} \, \beta f,\tag{B.155}
$$

for any *f* ∈ ℓ<sup>∞</sup>(*X*). This is not as surprising as it seems, because there is a bijective correspondence between ultrafilters *U* on *X* and finitely additive probability measures μ on *X* that take values in {0,1}. This correspondence is given by:

$$U = \{ A \subset X \mid \mu(A) = 1 \}; \tag{B.156}$$

$$
\mu(A) = 1 \quad \text{iff } A \in U. \tag{B.157}
$$

Principal ultrafilters *U*<sub>*x*</sub> thereby correspond to Dirac measures δ<sub>*x*</sub> on *X*, whereas free ultrafilters *U* correspond to (finitely additive) measures μ<sub>*U*</sub> on *X* that vanish on any finite subset of *X*. For general ultrafilters *U* ∈ β*X* we have, for *f* ∈ ℓ<sup>∞</sup>(*X*),

$$\int_{X} d\mu_U \, f = \bigcap_{A \in U} f(A)^{-},\tag{B.158}$$

where *f*(*A*) = {*f*(*x*) | *x* ∈ *A*} as usual, and *f*(*A*)<sup>−</sup> is the closure of this set in C. Thus (B.158) is equal to the unique *z* ∈ C with the property that for each ε > 0 the set {*x* ∈ *X* : |*f*(*x*) − *z*| < ε} lies in *U*; for *U* = *U*<sub>*x*</sub>, this recovers *z* = *f*(*x*).
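On a finite set the correspondence (B.156) - (B.157) can be checked exhaustively. The following Python sketch enumerates all {0,1}-valued finitely additive probability measures on the power set of a three-point set (an assumed toy example) and confirms that the associated ultrafilters are exactly the principal ones, as they must be, since free ultrafilters exist only on infinite sets. The variable names are introduced here for illustration.

```python
from itertools import combinations, product

X = (0, 1, 2)
subsets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

def is_fa_probability(mu):
    """Check mu(X) = 1 and additivity on disjoint pairs (mu takes values 0 or 1)."""
    if mu[frozenset(X)] != 1:
        return False
    return all(mu[A | B] == mu[A] + mu[B]
               for A in subsets for B in subsets if not (A & B))

measures = []
for values in product((0, 1), repeat=len(subsets)):
    mu = dict(zip(subsets, values))
    if is_fa_probability(mu):
        measures.append(mu)

# (B.156): the ultrafilter attached to each measure.
ultrafilters = [{A for A in subsets if mu[A] == 1} for mu in measures]

# Principal ultrafilters U_x = {A : x in A}, corresponding to Dirac measures.
principal = [{A for A in subsets if x in A} for x in X]

print(len(measures), all(U in principal for U in ultrafilters))
```

The enumeration is exponential in the number of subsets and is only meant to make the bijection tangible for very small *X*.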

6. This is similar to nos. 3 and 4, but is slightly more involved. For *f* ∈ ℓ<sup>*q*</sup>(*X*) and *g* ∈ ℓ<sup>*p*</sup>(*X*), with (B.16) and *p*, *q* ≠ 1, ∞, we again define ϕ<sub>*f*</sub>(*g*) by (B.142), upon which Hölder's inequality yields ‖ϕ<sub>*f*</sub>‖ ≤ ‖*f*‖<sub>*q*</sub>. Conversely, for ϕ ∈ ℓ<sup>*p*</sup>(*X*)∗, once again define *f* by (B.144), so that ϕ = ϕ<sub>*f*</sub>. We now show that ‖*f*‖<sub>*q*</sub> ≤ ‖ϕ‖.

Pick *X*<sub>*n*</sub> ⊂ *X* as defined below (B.139), and define *f*<sub>*n*</sub> : *X* → C by *f*<sub>*n*</sub>(*x*) = *f*(*x*) if *x* ∈ *X*<sub>*n*</sub> and *f*<sub>*n*</sub>(*x*) = 0 if *x* ∉ *X*<sub>*n*</sub>. If ‖*f*‖<sub>*q*</sub> < ∞, then ‖*f*‖<sub>*q*</sub> = sup<sub>*n*</sub> ‖*f*<sub>*n*</sub>‖<sub>*q*</sub>. Now define

$$g_n(x) = |f_n(x)|^q / f_n(x) \ (f_n(x) \neq 0);\tag{B.159}$$

$$g_n(x) = 0 \ (f_n(x) = 0).\tag{B.160}$$

Using (B.142), we obtain

$$\|f_n\|_q \|f_n\|_q^{q-1} = \|f_n\|_q^q = \langle f_n, g_n \rangle = \varphi(g_n) \le \|\varphi\| \|g_n\|_p = \|\varphi\| \|f_n\|_q^{q-1}, \tag{B.161}$$

whence ‖*f*<sub>*n*</sub>‖<sub>*q*</sub> ≤ ‖ϕ‖. Taking sup<sub>*n*</sub> gives ‖*f*‖<sub>*q*</sub> ≤ ‖ϕ‖, and hence

$$\ell^p(X)^* \cong \ell^q(X). \tag{B.162}$$
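The identities behind (B.161) may be checked numerically on a finite set, where *f*<sub>*n*</sub> = *f*. In the Python sketch below, the exponents *p* = 3, *q* = 3/2 and the real-valued function *f* on a four-point set are assumptions chosen for illustration; the helper `norm` is likewise hypothetical.

```python
p, q = 3.0, 1.5                      # conjugate exponents: 1/p + 1/q = 1
f = [2.0, -1.0, 0.5, -3.0]           # assumed f on a four-point set X

def norm(v, r):
    """The l^r-norm of a finite sequence."""
    return sum(abs(x) ** r for x in v) ** (1.0 / r)

# (B.159) - (B.160) with X_n = X, so that f_n = f.
g = [abs(x) ** q / x if x != 0 else 0.0 for x in f]

pairing = sum(a * b for a, b in zip(f, g))   # <f, g> = phi_f(g)

print(pairing, norm(f, q) ** q)              # <f, g> versus ||f||_q^q
print(norm(g, p), norm(f, q) ** (q - 1))     # ||g||_p versus ||f||_q^(q-1)
```

Dividing the pairing by ‖*g*‖<sub>*p*</sub> then gives ‖*f*‖<sub>*q*</sub>, which shows that the bound ‖ϕ<sub>*f*</sub>‖ ≤ ‖*f*‖<sub>*q*</sub> from Hölder's inequality is attained.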


No. 9 follows from no. 8, whilst 10 and 11 are similar to 4 and 6, except for some tricky measure-theoretic details. We only sketch the main idea (where for simplicity we assume μ is finite; using an approximation procedure the result is valid also for the σ-finite case, but not beyond!). Namely, the function *f* representing the functional ϕ ∈ *L<sup>p</sup>*(*X*)<sup>∗</sup> is constructed by first defining a complex measure ν on Σ by ν(*A*) = ϕ(1<sub>*A*</sub>), *A* ∈ Σ. Using (B.85), we see that ν is absolutely continuous with respect to μ, and we put

$$f = d\nu/d\mu. \tag{B.163}$$

Using definition (B.29) of integration, this yields

$$
\varphi(g) = \langle f, g \rangle = \int_X d\mu \, f g, \tag{B.164}
$$

and similar arguments as in the discrete case show that *f* ∈ *L<sup>q</sup>*(*X*).

Nos. 12–13 follow from Theorem B.146 below, and no. 14, which is forward-looking, too, is true by definition of the predual of a von Neumann algebra (whose *existence* is highly nontrivial); see Theorem C.132 in Appendix C.

#### B.10 The Krein–Milman Theorem

Returning to the abstract theory, we now apply the Hahn–Banach Theorem and duality theory to prove one of the most beautiful results in functional analysis.

The *boundary* ∂<sub>*e*</sub>*K* of a convex set *K* consists of all *v* ∈ *K* satisfying:

$$\text{if } v = tw + (1 - t)x \text{ for certain } w, x \in K \text{ and } t \in (0, 1), \text{ then } v = w = x.$$

Hence Carathéodory's Theorem 1.12, which, we recall, states that if *K* is a nonempty *compact* convex subset of R<sup>*n*</sup>, then ∂<sub>*e*</sub>*K* ≠ ∅, and each point of *K* is a convex sum of at most *n*+1 points in ∂<sub>*e*</sub>*K*, implies, in particular, that ∂<sub>*e*</sub>*K* is not empty. This is readily visualized: the simplest example is *K* = [0,1], where ∂<sub>*e*</sub>*K* = {0,1}. One also has triangles in the plane, whose boundaries consist of their vertices (rather than their sides, which are among their *faces*, see below). Furthermore, the *closed* (unit) three-ball *B*<sup>3</sup> in R<sup>3</sup> is convex, with boundary ∂<sub>*e*</sub>*B*<sup>3</sup> = *S*<sup>2</sup>, cf. Proposition 2.9. In these examples the interior of *K*, which is still convex, would have an empty boundary, so that the assumption of compactness in Theorem 1.12 is absolutely essential.
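For the convex hull of finitely many points in the plane, the boundary ∂<sub>*e*</sub>*K* in the above sense can be computed explicitly. The Python sketch below uses Andrew's monotone chain algorithm (our choice of method; the function names are illustrative) and drops collinear and interior points, so that only the vertices survive, as in the triangle example.

```python
def cross(o, a, b):
    """Signed area of the parallelogram spanned by a - o and b - o."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def extreme_points(points):
    """Extreme points of the convex hull of a finite subset of the plane."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # "<= 0" also removes points lying on a side, which are not extreme
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


# Triangle with one point on a side, one on the hypotenuse, one in the interior.
triangle = [(0, 0), (4, 0), (0, 4), (2, 0), (2, 2), (1, 1)]
print(extreme_points(triangle))

# The interval [0, 1], embedded in the plane, with an interior point.
segment = [(0, 0), (0.5, 0), (1, 0)]
print(extreme_points(segment))
```

In both cases only the vertices (respectively the endpoints) are returned; points on the sides of the triangle belong to a face but not to ∂<sub>*e*</sub>*K*.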

Carathéodory's Theorem follows from a straightforward induction argument in the dimension of *K*, and the following *Krein–Milman Theorem*. The *convex hull* co(*X*) of a subset *X* of a vector space is defined as the set of all finite convex sums ∑<sub>*i*</sub> *t<sub>i</sub>x<sub>i</sub>*, where *t<sub>i</sub>* ≥ 0, ∑<sub>*i*</sub> *t<sub>i</sub>* = 1, and *x<sub>i</sub>* ∈ *X*; this is the smallest convex set containing *X*.

Theorem B.50. *Let V be a real normed vector space with dual V*<sup>∗</sup>*, and let K be a convex subset of V*<sup>∗</sup> *that is compact in the w*<sup>∗</sup>*-topology. Then* ∂<sub>*e*</sub>*K* ≠ ∅*, and each point of K lies in the w*<sup>∗</sup>*-closure of the convex hull of* ∂<sub>*e*</sub>*K. In other words,*

$$K = \left(\text{co}(\partial_e K)\right)^{-}. \tag{B.165}$$

Zorn's Lemma will be used twice in the proof: both directly and through Theorem B.43, which relies on the Hahn–Banach Theorem B.40, whose proof uses Zorn. Furthermore, a *face* of a convex set *K* is a nonempty convex subset *F* ⊆ *K* such that:

$$\text{if } z = tx + (1 - t)y \text{ for } z \in F \text{ with } t \in (0, 1) \text{ and } x, y \in K, \text{ then } x, y \in F.$$

In particular, each extreme point *x* ∈ ∂*eK* is a face in its own right; conversely, a face consisting of a single point lies in ∂*eK* (as should be clear from the definitions).

*Proof.* 1. Let ℱ(*K*) be the set of all *closed* faces in *K*, partially ordered by *inverse* inclusion, i.e., *F*<sub>1</sub> ≤ *F*<sub>2</sub> iff *F*<sub>2</sub> ⊆ *F*<sub>1</sub>. The intersection of any finite subset of a totally ordered subset {*F*<sub>λ</sub>} of ℱ(*K*) is obviously nonempty, so that, by compactness of *K*, we also have ∩<sub>λ</sub>*F*<sub>λ</sub> ≠ ∅. (Proof by contradiction: if ∩<sub>λ</sub>*F*<sub>λ</sub> = ∅, then ∪<sub>λ</sub>*F*<sub>λ</sub><sup>*c*</sup> = (∩<sub>λ</sub>*F*<sub>λ</sub>)<sup>*c*</sup> = ∅<sup>*c*</sup> = *K*, so that {*F*<sub>λ</sub><sup>*c*</sup>} is an open cover of *K*, which by definition of compactness has a finite subcover {*F*<sub>λ′</sub><sup>*c*</sup>}. By the same argument, ∩<sub>λ′</sub>*F*<sub>λ′</sub> = ∅, contradicting the nonemptiness of finite intersections.) Hence ∩<sub>λ</sub>*F*<sub>λ</sub> is an upper bound of {*F*<sub>λ</sub>}, so that Zorn gives us a (not necessarily unique) maximal element *F*<sub>0</sub> in ℱ(*K*) (which set-theoretically is *minimal* because of the reverse ordering, i.e., *F*<sub>0</sub> contains no strictly smaller closed face).

2. We now show that *F*<sub>0</sub> must be a singleton (and hence an extreme point of *K*). For any *v* ∈ *V*, the function *v̂* : *V*<sup>∗</sup> → R defined by *v̂*(ϕ) = ϕ(*v*) is *w*<sup>∗</sup>-continuous, see Propositions B.44 and B.46. Since *F*<sub>0</sub> ⊂ *K* is compact, *v̂* assumes a minimum on *F*<sub>0</sub>, say *m*. The set

$$F_m = \{ \varphi \in F_0 \mid \hat{v}(\varphi) = m \} \tag{B.166}$$

is not only closed (by continuity of *v̂*), and hence compact (since *F*<sub>0</sub> is), but it is again a face in *K*: first, if ϕ ∈ *F<sub>m</sub>* takes the form

$$
\varphi = t\varphi_1 + (1 - t)\varphi_2, \tag{B.167}
$$

with ϕ<sub>1</sub>, ϕ<sub>2</sub> ∈ *F*<sub>0</sub>, then

$$
\hat{v}(\varphi) = m = t\hat{v}(\varphi_1) + (1 - t)\hat{v}(\varphi_2), \tag{B.168}
$$

which, given that *v̂*(ϕ<sub>*i*</sub>) ≥ *m*, is only possible if *v̂*(ϕ<sub>1</sub>) = *v̂*(ϕ<sub>2</sub>) = *m*, so that ϕ<sub>*i*</sub> ∈ *F<sub>m</sub>*. Hence *F<sub>m</sub>* is a face in *F*<sub>0</sub>, but this implies that it is equally well a face in *K*. Namely, if (B.167) holds for ϕ ∈ *F<sub>m</sub>* and ϕ<sub>*i*</sub> ∈ *K*, then regarding ϕ as an element of *F*<sub>0</sub> gives ϕ<sub>*i*</sub> ∈ *F*<sub>0</sub>, because *F*<sub>0</sub> is a face in *K*, upon which the previous step, where we regard ϕ as an element of *F<sub>m</sub>*, gives ϕ<sub>*i*</sub> ∈ *F<sub>m</sub>*.

Since *F*<sub>0</sub> is maximal, we must have *F<sub>m</sub>* = *F*<sub>0</sub>, so that each functional *v̂* is constant on *F*<sub>0</sub>. Now we know (even without the Hahn–Banach Theorem) that the functionals *v̂* separate points in *V*<sup>∗</sup>, since the very statement that ϕ<sub>1</sub> ≠ ϕ<sub>2</sub> means that there is some *v* ∈ *V* such that ϕ<sub>1</sub>(*v*) ≠ ϕ<sub>2</sub>(*v*) and hence *v̂*(ϕ<sub>1</sub>) ≠ *v̂*(ϕ<sub>2</sub>). So if *F*<sub>0</sub> contains more than one point, there must be a functional *v̂* that is not constant on *F*<sub>0</sub>. Hence *F*<sub>0</sub> is a singleton, and therefore an element of ∂<sub>*e*</sub>*K*. That is, ∂<sub>*e*</sub>*K* ≠ ∅.
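The mechanism of step 2, namely that the set on which a linear functional attains its minimum over a face is again a face, can be made concrete for the convex hull of finitely many points in R<sup>2</sup>. In the sketch below a closed face is represented by the points of the finite set that generate it, and `minimising_face` is a hypothetical helper name; this illustrates the idea only and is not the Zorn argument of step 1.

```python
def minimising_face(points, v, tol=1e-12):
    """Generators of the face of co(points) on which x -> <v, x> is minimal."""
    values = [sum(vi * xi for vi, xi in zip(v, x)) for x in points]
    m = min(values)
    return [x for x, val in zip(points, values) if val - m <= tol]


# Unit square together with its centre (which is not an extreme point).
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0.5, 0.5)]

edge = minimising_face(square, (1, 0))    # minimise the first coordinate
vertex = minimising_face(edge, (0, 1))    # then minimise the second coordinate

print(edge)    # generators of the left edge
print(vertex)  # a single extreme point
```

Two functionals suffice here because they separate the points of the plane; a face on which every such functional is constant must be a singleton, as in the proof.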


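3. Applying steps 1 and 2 to the closed faces of *K* contained in a given closed face *F* ⊆ *K* (instead of to all closed faces of *K*) shows that every closed face *F* of *K* contains an extreme point of *K*, i.e., *F* ∩ ∂<sub>*e*</sub>*K* ≠ ∅.

4. Finally, let

$$B = (\text{co}(\partial_e K))^{-}, \tag{B.169}$$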

and assume *B* ≠ *K*. First note that co(∂<sub>*e*</sub>*K*) is convex by construction, and that its closure *B* remains convex (because the vector space operations, and *a fortiori* the convex sums, are continuous). Its complement in *V*<sup>∗</sup> is open, and hence any point α ∈ *K* ∖ *B* has an open convex neighbourhood *A* ⊂ *V*<sup>∗</sup> ∖ *B* (see below), which is therefore disjoint from *B*. Hence Theorem B.43 applies (with *V* ↝ *V*<sup>∗</sup> and ϕ ↝ *v̂*), giving us *v* ∈ *V* and *t* ∈ R for which *v̂*(α) < *t* ≤ *v̂*(β) for any β ∈ *B*. Now define *s* = min{*v̂*(ϕ) | ϕ ∈ *K*}, which exists since *K* is *w*<sup>∗</sup>-compact and *v̂* is *w*<sup>∗</sup>-continuous. Since α ∈ *K* ∖ *B* ⊂ *K* and *v̂*(α) < *t*, we have *s* < *t*. Subsequently, define *F<sub>s</sub>* = {ϕ ∈ *K* | *v̂*(ϕ) = *s*}. As in step 2 above, it follows that *F<sub>s</sub>* is a closed face in *K*. According to step 3, there is a point ω ∈ *F<sub>s</sub>* ∩ ∂<sub>*e*</sub>*K*, so that *v̂*(ω) = *s*. This contradicts *s* < *t* ≤ *v̂*(β) for any β ∈ *B*, as ω ∈ ∂<sub>*e*</sub>*K* ⊂ *B*. □

The existence of *A* in step 4 above arises from the fact that open sets of the form

$$\mathcal{O}^{(\varepsilon)}_{v_1,\dots,v_n} = \{ \varphi \in V^* \mid |\varphi(v_i)| < \varepsilon\ (i = 1,\dots,n) \}, \tag{B.170}$$

where ε > 0 and all *v<sub>i</sub>* ∈ *V*, form a basis of *w*<sup>∗</sup>-neighbourhoods of 0 ∈ *V*<sup>∗</sup>, and hence their translates ω + 𝒪<sup>(ε)</sup><sub>*v*<sub>1</sub>,...,*v<sub>n</sub>*</sub> form such a basis for any ω ∈ *V*<sup>∗</sup>; the point is that such sets are convex, because if |ϕ<sub>*i*</sub>(*v*)| < ε for *i* = 1,2 and *t* ∈ (0,1), then

$$|(t\varphi_1 + (1-t)\varphi_2)(v)| \le t|\varphi_1(v)| + (1-t)|\varphi_2(v)| < (t+1-t)\varepsilon = \varepsilon. \tag{B.171}$$

Although the Krein–Milman Theorem is of considerable interest and beauty in itself, our main use of it lies in a few corollaries. Among those is Choquet's Theorem in the next section, but we first turn to the *Stone–Weierstrass Theorem*:

Theorem B.51. *Let X be a compact Hausdorff space. Let B be an involutive subalgebra of C*(*X*) *(regarded as a commutative C\*-algebra) that separates points on X (i.e., if x* ≠ *y there is f* ∈ *B such that f*(*x*) ≠ *f*(*y*)*) and contains the unit function* 1<sub>*X*</sub>*. Then B is dense in C*(*X*) *in the sup-norm. In particular, if B is closed, then B* = *C*(*X*)*.*

In other words, *B* is a linear subspace of *C*(*X*) such that if *f*, *g* ∈ *B*, then *fg* ∈ *B*, and if *f* ∈ *B*, then *f*<sup>∗</sup> ∈ *B*, where *f*<sup>∗</sup>(*x*) is the complex conjugate of *f*(*x*). Furthermore, *C*(*X*) and hence *B* are equipped with the sup-norm. The assumptions could even be weakened: instead of asking that 1<sub>*X*</sub> ∈ *B* and that *B* separate points, for the proof we just need that for each pair of distinct points *x*, *y* ∈ *X* and *s*, *t* ∈ R there is *f* ∈ *B* such that *f*(*x*) = *s* and *f*(*y*) = *t*.

We are going to derive Theorem B.51 from Theorem B.50 and the following:

Lemma B.52. *Let B be a linear subspace of some Banach space V . Then B is dense in V iff the only element* ϕ ∈ *V*<sup>∗</sup> *that satisfies* ϕ(*v*) = 0 *for all v* ∈ *B is* ϕ = 0*.*

*Proof.* The "⇒" direction (which will not be needed) is immediate from the fact that ϕ ∈ *V*<sup>∗</sup> is bounded and therefore, if *v* = lim *v*<sub>λ</sub> for (*v*<sub>λ</sub>) in *B*, then ϕ(*v*) = lim ϕ(*v*<sub>λ</sub>), so that ϕ(*v*) = 0 for all *v* ∈ *B* implies ϕ(*v*) = 0 for all *v* ∈ *V* and hence ϕ = 0.

Conversely, if *B*<sup>−</sup> ≠ *V*, we will exhibit some nonzero ϕ ∈ *V*<sup>∗</sup> with ϕ|<sub>*B*</sub> = 0. Take some *w* ∉ *B*<sup>−</sup> and define *W* ⊂ *V* by *W* = C·*w* + *B*<sup>−</sup>, along with a map ϕ<sub>*W*</sub> : *W* → C given by ϕ<sub>*W*</sub>(λ*w* + *v*) = λ for any λ ∈ C and *v* ∈ *B*<sup>−</sup>. This map is trivially linear, as well as bounded: since *w* ∉ *B*<sup>−</sup> we have ‖*w* − *v*‖ ≥ *d* for some *d* > 0, for each *v* ∈ *B*<sup>−</sup>; since then also −*v*/λ ∈ *B*<sup>−</sup> for λ ≠ 0, we have ‖λ*w* + *v*‖ ≥ |λ|*d*, and therefore

$$|\varphi_W(\lambda w + v)| = |\lambda| \le d^{-1}\|\lambda w + v\|.$$

By Corollary B.41, our ϕ<sub>*W*</sub> extends to some ϕ ∈ *V*<sup>∗</sup>, with ϕ|<sub>*B*</sub> = ϕ<sub>*W*</sub>|<sub>*B*</sub> = 0. □
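The construction in this proof is easily followed in a toy case. As an illustration only (with real instead of complex scalars, and an example of our own choosing), take *V* = R<sup>2</sup> with the Euclidean norm, *B* = *B*<sup>−</sup> = {(*s*, 0) | *s* ∈ R}, and *w* = (0, 1), so that *d* = 1 and *W* = *V*:

$$\varphi_W(\lambda w + (s, 0)) = \lambda, \qquad |\varphi_W(\lambda w + (s,0))| = |\lambda| \le \sqrt{\lambda^2 + s^2} = \|\lambda w + (s, 0)\|.$$

Here no extension is needed: ϕ = ϕ<sub>*W*</sub> is the second coordinate functional, which is nonzero and vanishes on *B*, in accordance with the fact that *B* is not dense in *V*.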

*Proof.* We now prove Theorem B.51. We define a subset *B*<sup>0</sup> ⊂ *M*(*X*) by

$$B^0 = \{ \mu \in M(X) \mid \mu(f^*) = \overline{\mu(f)},\ \|\mu\| \le 1,\ \mu(f) = 0 \ \forall f \in B \}, \tag{B.172}$$

where *f*<sup>∗</sup>(*x*) is the complex conjugate of *f*(*x*), as usual. Our aim is to show that

$$B^0 = \{0\}. \tag{B.173}$$

Since any ϕ ∈ *M*(*X*) is a multiple of some μ in the unit ball ‖μ‖ ≤ 1, eq. (B.173) gives the antecedent of the "⇐" part of Lemma B.52, which gives Theorem B.51.

Noting that the *w*<sup>∗</sup>-topology in *M*(*X*) is just the topology in which μ<sub>λ</sub> → μ iff μ<sub>λ</sub>(*f*) → μ(*f*) for each *f* ∈ *C*(*X*), we see that *B*<sup>0</sup> is closed in the unit ball of *M*(*X*), so that it is *w*<sup>∗</sup>-compact by the Banach–Alaoglu Theorem. Furthermore, *B*<sup>0</sup> is convex, so the Krein–Milman Theorem gives ∂<sub>*e*</sub>*B*<sup>0</sup> ≠ ∅. Any μ ∈ ∂<sub>*e*</sub>*B*<sup>0</sup> has either ‖μ‖ = 0, in which case (B.173) holds and we are done (since *B*<sup>0</sup> = −*B*<sup>0</sup>, the point 0 = ½ν + ½(−ν) cannot be extreme if *B*<sup>0</sup> contains some ν ≠ 0), or, as we assume in what follows,

$$\|\mu\| = 1. \tag{B.174}$$

Indeed, if 0 < ‖μ‖ < 1, then

$$
\mu = t\mu_1 + (1 - t)\mu_2, \tag{B.175}
$$

with *t* = ‖μ‖, μ<sub>1</sub> = μ/‖μ‖, and μ<sub>2</sub> = 0 would give a nontrivial decomposition of μ. For *g* ∈ *C*(*X*), define

$$L_g : M(X) \to M(X); \tag{B.176}$$

$$L_g\mu(f) = \mu(gf), \tag{B.177}$$

or "*L<sub>g</sub>d*μ = *g* · *d*μ". It follows from the assumptions on *B* in Theorem B.51 that if 0 < *g* < 1<sub>*X*</sub> and *g* ∈ *B* (as we will now assume), then *L<sub>g</sub>* maps *B*<sup>0</sup> into itself, and also 0 < 1<sub>*X*</sub> − *g* < 1<sub>*X*</sub>. Hence *L*<sub>1<sub>*X*</sub>−*g*</sub> maps *B*<sup>0</sup> into itself. Given (B.174), we then have

$$\|L_{1_X - g}\mu\| = 1 - \|L_g\mu\|. \tag{B.178}$$

This follows from (B.76): the Hahn–Jordan decomposition (B.55) of μ also gives (*L<sub>g</sub>*μ)<sub>±</sub> = *L<sub>g</sub>*μ<sub>±</sub> and (*L*<sub>1<sub>*X*</sub>−*g*</sub>μ)<sub>±</sub> = *L*<sub>1<sub>*X*</sub>−*g*</sub>μ<sub>±</sub> (since *g* > 0 and 1<sub>*X*</sub> − *g* > 0), so that

$$\|L_{1_X-g}\mu\| = L_{1_X-g}\mu_{+}(X) + L_{1_X-g}\mu_{-}(X) \tag{B.179}$$

$$= \mu_{+}(X) + \mu_{-}(X) - L_g\mu_{+}(X) - L_g\mu_{-}(X) = \|\mu\| - \|L_g\mu\|. \tag{B.180}$$

Because of (B.178), we obtain a convex decomposition (B.175) with *t* = ‖*L<sub>g</sub>*μ‖, μ<sub>1</sub> = *L<sub>g</sub>*μ/‖*L<sub>g</sub>*μ‖, and μ<sub>2</sub> = *L*<sub>1<sub>*X*</sub>−*g*</sub>μ/‖*L*<sub>1<sub>*X*</sub>−*g*</sub>μ‖, which are well defined because of (B.174), which guarantees that the two denominators are nonzero. Since μ is extreme by assumption (i.e., it lies in ∂<sub>*e*</sub>*B*<sup>0</sup>), it must be that

$$\frac{L_g\mu}{\|L_g\mu\|} = \frac{L_{1_X - g}\mu}{\|L_{1_X - g}\mu\|} = \mu. \tag{B.181}$$

Hence *g*(*x*) = ‖*L<sub>g</sub>*μ‖ almost everywhere with respect to μ; in particular, this must hold for each *x* ∈ supp(μ). Suppose there are at least two different points *x*, *y* ∈ supp(μ). Since *B* separates points and contains 1<sub>*X*</sub>, we can easily find *g* ∈ *B* with 0 < *g* < 1<sub>*X*</sub> such that *g*(*x*) ≠ *g*(*y*), contradicting constancy of *g* on supp(μ). So supp(μ) = {*x*}, which, given (B.174), implies that μ = ±δ<sub>*x*</sub>, so that μ(1<sub>*X*</sub>) = ±1. Since 1<sub>*X*</sub> ∈ *B*, this contradicts (B.172). Hence (B.174) leads to a contradiction, and we are left with the other possibility ‖μ‖ = 0. This gives μ = 0, that is, (B.173). □
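The simplest instance of Theorem B.51 is *X* = [0,1] with *B* the algebra of polynomials, which is involutive, separates points, and contains 1<sub>*X*</sub>. The proof above is not constructive; as a complement, the Python sketch below uses Bernstein polynomials (a classical explicit approximation scheme, which is our choice and not the method of the text) to approximate the non-smooth function *f*(*x*) = |*x* − 1/2| and to estimate the sup-norm error on a finite grid. The grid size and the degrees are arbitrary.

```python
from math import comb


def bernstein(f, n, x):
    """Value at x of the n-th Bernstein polynomial of f on [0, 1]."""
    return sum(
        f(k / n) * comb(n, k) * x ** k * (1 - x) ** (n - k) for k in range(n + 1)
    )


def sup_error(f, n, grid_size=201):
    """Maximum of |f - B_n f| over an equidistant grid (a proxy for the sup-norm)."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in grid)


def f(x):
    """Continuous but not differentiable at x = 1/2."""
    return abs(x - 0.5)


errors = [sup_error(f, n) for n in (4, 16, 64)]
print(errors)  # decreasing as the degree grows
```

The error decays only like *n*<sup>−1/2</sup> for this *f*, which shows that density in the sup-norm says nothing about the speed of approximation.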

#### B.11 Choquet's Theorem

*Choquet's Theorem* B.53 beautifully follows up on the Krein–Milman Theorem. To state it, we need the *support* supp(μ) of a measure μ on a space *X*, defined as the *smallest* closed set *F* such that μ(*X* ∖ *F*) = 0, or, equivalently, as the *largest* closed set *F* such that each open neighbourhood *U* of each *x* ∈ *F* has strictly positive measure μ(*U*) > 0, provided such a set exists. This is the case, for example, if *X* is locally compact Hausdorff and μ is (inner) regular. To see this, let {*U*<sub>λ</sub>} be the set of all open *U*<sub>λ</sub> ∈ 𝒪(*X*) such that μ(*U*<sub>λ</sub>) = 0, and let *U* = ∪<sub>λ</sub>*U*<sub>λ</sub>. By inner regularity, μ(*U*) = sup{μ(*K*) | *K* ⊂ *U*, *K* ∈ 𝒦(*X*)}. Since each such *K* is compact, *K* ⊂ ∪<sub>*i*=1</sub><sup>*n*</sup> *U*<sub>λ<sub>*i*</sub></sub> for finitely many λ<sub>1</sub>, ..., λ<sub>*n*</sub>, whence μ(*K*) ≤ ∑<sub>*i*</sub> μ(*U*<sub>λ<sub>*i*</sub></sub>) = 0. Hence μ(*U*) = 0, and supp(μ) = *X* ∖ *U*.

Theorem B.53. *In the notation of Theorem B.50, for each* ϕ ∈ *K there is a probability measure* μ *on K whose support is contained in* ∂<sub>*e*</sub>*K*<sup>−</sup> *such that for each v* ∈ *V,*

$$\varphi(v) = \int_{\partial_e K^{-}} d\mu(\omega) \, \omega(v). \tag{B.182}$$

*Moreover, if K is metrizable, then the support of* μ *may be restricted to* ∂*eK.*

Here ∂*eK*<sup>−</sup> ≡ (∂*eK*)<sup>−</sup> is the closure of ∂*eK*; in many examples (e.g., state spaces of C\*-algebras of infinite quantum systems), ∂*eK* is not closed or even Borel.

Reading (B.182) from right to left, the point ϕ ∈ *K* is called the *barycenter* of μ. Preparing for the proof, we note that if *X* is a compact Hausdorff space, the dual *C*(*X*)<sup>∗</sup> of *C*(*X*) as a Banach space (in the sup-norm) is the space *M*(*X*) of all complete regular complex measures μ on *X*; cf. Theorem B.24. The set *M*<sub>1</sub><sup>+</sup>(*X*) of all complete regular probability measures on *X* is a closed subset of the unit ball of *M*(*X*), since ‖μ‖ = μ(*X*) = 1 if μ ∈ *M*<sub>1</sub><sup>+</sup>(*X*), cf. (B.54), and hence *M*<sub>1</sub><sup>+</sup>(*X*) is *w*<sup>∗</sup>-compact by the Banach–Alaoglu Theorem. We will use these facts with *X* = ∂<sub>*e*</sub>*K*<sup>−</sup>.

We also recall that a (not necessarily continuous) function *f* : *K* → R is *affine* if

$$f(t\varphi_1 + (1-t)\varphi_2) = tf(\varphi_1) + (1-t)f(\varphi_2), \tag{B.183}$$

for *t* ∈ (0,1) and ϕ<sub>1</sub> ≠ ϕ<sub>2</sub> ∈ *K*, *concave* if one has ≥ instead of = in (B.183), *convex* with ≤ instead of =, and *strictly convex* if (B.183) holds with < instead of =.

For example, *f*(*x*) = *x*<sup>2</sup> is strictly convex on [−1,1]. The assumption of metrizability will only be used to prove the existence of a strictly convex continuous function on *K*, so this existence could have been assumed instead of metrizability. Finally, we denote the space of real-valued continuous affine functions on *K* by *A*(*K*).

*Proof.* By Theorem B.50, ϕ = lim ϕ<sub>λ</sub>, where (ϕ<sub>λ</sub>) is some net in co(∂<sub>*e*</sub>*K*), so that ϕ<sub>λ</sub> = ∑<sub>*i*</sub> *p*<sub>*i*</sub><sup>(λ)</sup> ω<sub>*i*</sub><sup>(λ)</sup>, where the sum is finite, *p*<sub>*i*</sub><sup>(λ)</sup> ≥ 0, and ∑<sub>*i*</sub> *p*<sub>*i*</sub><sup>(λ)</sup> = 1. Then μ<sub>λ</sub> = ∑<sub>*i*</sub> *p*<sub>*i*</sub><sup>(λ)</sup> δ<sub>ω<sub>*i*</sub><sup>(λ)</sup></sub> is a probability measure on ∂<sub>*e*</sub>*K* and hence also on its (compact) closure ∂<sub>*e*</sub>*K*<sup>−</sup>. Since *M*<sub>1</sub><sup>+</sup>(∂<sub>*e*</sub>*K*<sup>−</sup>) is *w*<sup>∗</sup>-compact, the previous net has a subnet that *w*<sup>∗</sup>-converges to some μ ∈ *M*<sub>1</sub><sup>+</sup>(∂<sub>*e*</sub>*K*<sup>−</sup>). Noting that ϕ(*v*) = *v̂*(ϕ), where *v̂* ∈ *V*<sup>∗∗</sup> is *w*<sup>∗</sup>-continuous by Proposition B.46, this μ by construction satisfies (B.182).

We now prove the last claim. If *K* is metrizable, then *C*(*K*) is separable, so that its subspace *A*(*K*) is separable, too. Thus we can find some countable dense subset (*f<sub>n</sub>*)<sub>*n*>0</sub> of *A*(*K*), in terms of which we define a function *f*<sub>0</sub> : *K* → R by

$$f_0(\varphi) = \sum_{n=1}^{\infty} 2^{-n} (\|f_n\|_{\infty} + 1)^{-2} |f_n(\varphi)|^2. \tag{B.184}$$

First, continuity of *f*<sub>0</sub> follows from uniform convergence of this series and continuity of each *f<sub>n</sub>*; recall that *A*(*K*) ⊂ *C*(*K*, R). Second, the *x*<sup>2</sup> example just given implies that if *f* ∈ *A*(*K*), then *f*<sup>2</sup> is convex, and *f*<sub>0</sub> is even strictly convex provided that for any ϕ<sub>1</sub> ≠ ϕ<sub>2</sub> there is at least one *n* > 0 for which *f<sub>n</sub>*(ϕ<sub>1</sub>) ≠ *f<sub>n</sub>*(ϕ<sub>2</sub>). To show that this is the case, we note that since *V* ⊂ *V*<sup>∗∗</sup> separates points in *V*<sup>∗</sup> and each *v̂* ∈ *V*<sup>∗∗</sup> defines an element of *A*(*K*) by restriction, *A*(*K*) separates points in *K*. Therefore, by density of the family (*f<sub>n</sub>*), the claim follows, and *f*<sub>0</sub> is strictly convex. This will be crucial.

For each real-valued *f* ∈ *C*(*K*, R), define the *concave envelope* *f̂* by

$$\hat{f}(\varphi) = \inf \{ g(\varphi) \mid g \in A(K), g \ge f \}. \tag{B.185}$$

The terminology comes from the fact that *f* ≤ *f̂* for any *f* ∈ *C*(*K*), with equality if *f* is concave; this is because for any continuous concave function *f* we may write

$$f(\varphi) = \inf \{ g(\varphi) \mid g \in A(K), g \ge f \}. \tag{B.186}$$

In terms of this, for any fixed element ϕ<sub>0</sub> ∈ *K* we define *p* : *C*(*K*, R) → R by

$$p(f) = \hat{f}(\varphi_0). \tag{B.187}$$

Since (*f* + *g*)^ ≤ *f̂* + *ĝ* and (*tf*)^ = *t f̂* for *t* ≥ 0, as is easily verified, it follows that *p* is sublinear (cf. Definition B.38). We define a linear subspace *W* ⊂ *C*(*K*, R) by

$$W = A(K) + \mathbb{R} \cdot f\_0,\tag{B.188}$$

endowed with the 'hatted' evaluation map êv<sub>ϕ<sub>0</sub></sub> : *W* → R defined by

$$
\widehat{\mathrm{ev}}_{\varphi_0}(g + s f_0) = g(\varphi_0) + s\hat{f}_0(\varphi_0); \tag{B.189}
$$

since *g* = *ĝ* for any *g* ∈ *A*(*K*), for *s* ≥ 0 we have êv<sub>ϕ<sub>0</sub></sub>(*g* + *s f*<sub>0</sub>) = ev<sub>ϕ<sub>0</sub></sub>(*ĝ* + *s f̂*<sub>0</sub>).

It is easy to show that *p* dominates êv<sub>ϕ<sub>0</sub></sub>, so that the Hahn–Banach Theorem B.40 yields an extension êv′<sub>ϕ<sub>0</sub></sub> of êv<sub>ϕ<sub>0</sub></sub> to *C*(*K*, R) that satisfies êv′<sub>ϕ<sub>0</sub></sub>(*f*) ≤ *f̂*(ϕ<sub>0</sub>). This implies that êv′<sub>ϕ<sub>0</sub></sub> is positive; to see this, take *f* ≤ 0. Since the zero function is in *A*(*K*) we have *f̂* ≤ 0 also, so that êv′<sub>ϕ<sub>0</sub></sub>(*f*) ≤ 0. Passing to −*f*, we find that êv′<sub>ϕ<sub>0</sub></sub>(*f*) ≥ 0 whenever *f* ≥ 0. Furthermore, since 1<sub>*K*</sub> ∈ *A*(*K*) ⊂ *W*, we have

$$
\widehat{\mathrm{ev}}'_{\varphi_0}(1_K) = \widehat{\mathrm{ev}}_{\varphi_0}(1_K) = 1_K(\varphi_0) = 1.
$$

Therefore, êv′<sub>ϕ<sub>0</sub></sub> is a state on *C*(*K*). Corollary B.17 then turns êv′<sub>ϕ<sub>0</sub></sub> into a probability measure μ on *K*. Taking *f* = *v̂* for some *v* ∈ *V*, we have *f* ∈ *A*(*K*) ⊂ *W*, so that

$$\int_{K} d\mu(\omega) \, \omega(v) \equiv \int_{K} d\mu \, \hat{v} \equiv \mu(\hat{v}) = \widehat{\mathrm{ev}}'_{\varphi_0}(\hat{v}) = \hat{v}(\varphi_0) = \varphi_0(v). \tag{B.190}$$

This is almost (B.182) with ϕ ↝ ϕ<sub>0</sub>; what we still need to prove is the property

$$\text{supp}(\mu) \subseteq \partial_e K. \tag{B.191}$$

This will be proved in two steps. For any *f* ∈ *C*(*K*), we define *K*(*f*) ⊂ *K* by

$$K(f) = \{ \varphi \in K \mid f(\varphi) = \hat{f}(\varphi) \}. \tag{B.192}$$

We will separately show that

$$\text{supp}(\mu) \subseteq K(f_0); \tag{B.193}$$

$$K(f_0) \subseteq \partial_e K. \tag{B.194}$$

Towards (B.193) we start by showing that

$$
\mu(f_0) = \mu(\hat{f}_0),
\tag{B.195}
$$

which is a conjunction of μ(*f*<sub>0</sub>) ≤ μ(*f̂*<sub>0</sub>) and μ(*f*<sub>0</sub>) ≥ μ(*f̂*<sub>0</sub>). The first is true for any *f* ∈ *C*(*K*), since μ is positive and *f* ≤ *f̂* (pointwise). The second is specific to *f*<sub>0</sub>:

$$\begin{split} \mu(f_0) &= \widehat{\mathrm{ev}}'_{\varphi_0}(f_0) = \widehat{\mathrm{ev}}_{\varphi_0}(f_0) = \hat{f}_0(\varphi_0) \\ &= \inf \{ g(\varphi_0) \mid g \in A(K), g \ge f_0 \} \\ &= \inf \{ \mu(g) \mid g \in A(K), g \ge f_0 \}, \end{split} \tag{B.196}$$

since for *g* ∈ *A*(*K*) we have *g*(ϕ<sub>0</sub>) = μ(*g*) because *A*(*K*) ⊂ *W*. If in addition *g* ≥ *f*<sub>0</sub>, we have *g* ≥ *f̂*<sub>0</sub>, which implies μ(*g*) ≥ μ(*f̂*<sub>0</sub>). This inequality survives the infimum in (B.196), so that we finally obtain μ(*f*<sub>0</sub>) ≥ μ(*f̂*<sub>0</sub>), and hence (B.195).

We now prove (B.193) from (B.195). Since *f*<sub>0</sub> ≤ *f̂*<sub>0</sub>, for each *n* > 0 we may define

$$K_n = \{ \varphi \in K \mid \hat{f}_0(\varphi) - f_0(\varphi) \ge 1/n \}. \tag{B.197}$$

Then 0 ≤ μ(*K<sub>n</sub>*) ≤ *n* · ∫<sub>*K*</sub> *d*μ (*f̂*<sub>0</sub> − *f*<sub>0</sub>), which vanishes by (B.195). Hence μ(*K<sub>n</sub>*) = 0 for each *n*, and therefore μ(∪<sub>*n*</sub>*K<sub>n</sub>*) = 0. But ∪<sub>*n*</sub>*K<sub>n</sub>* = *K*(*f*<sub>0</sub>)<sup>*c*</sup>, so (B.193) follows.

Eq. (B.194) is equivalent to the inclusion (∂<sub>*e*</sub>*K*)<sup>*c*</sup> ⊆ *K*(*f*<sub>0</sub>)<sup>*c*</sup>, i.e., the implication:

$$\text{if } \varphi = t\varphi_1 + (1 - t)\varphi_2 \text{ for some } t \in (0, 1) \text{ and } \varphi_1 \neq \varphi_2 \text{, then } \hat{f}_0(\varphi) \neq f_0(\varphi).$$

Indeed, strict convexity of *f*<sub>0</sub> (used at last!) and the familiar property *f*<sub>0</sub> ≤ *f̂*<sub>0</sub> give

$$\begin{aligned} \hat{f}_0(\varphi) &= \inf \{ tg(\varphi_1) + (1-t)g(\varphi_2) \mid g \in A(K), g \ge f_0 \} \\ &\ge t \inf \{ g(\varphi_1) \mid g \in A(K), g \ge f_0 \} + (1-t) \inf \{ g(\varphi_2) \mid g \in A(K), g \ge f_0 \} \\ &= t \hat{f}_0(\varphi_1) + (1-t) \hat{f}_0(\varphi_2) \ge tf_0(\varphi_1) + (1-t)f_0(\varphi_2) > f_0(\varphi). \end{aligned}$$
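The inclusion (B.194) can be checked by hand in the simplest case. As an illustration only, take *V* = R, *K* = [−1,1] ⊂ *V*<sup>∗</sup> = R, and use the strictly convex function *x*<sup>2</sup> in the role of *f*<sub>0</sub> (an assumption made for this example; it is not the series (B.184)). The elements of *A*(*K*) are the functions *g*(*x*) = *ax* + *b*, and since *x*<sup>2</sup> − *ax* − *b* is convex, *g* ≥ *f*<sub>0</sub> on *K* iff this holds at *x* = ±1, i.e., iff *b* ≥ 1 + |*a*|. Hence

$$\hat{f}_0(x) = \inf_{a \in \mathbb{R}} \left( ax + 1 + |a| \right) = 1 \quad (x \in [-1,1]),$$

because *ax* + |*a*| ≥ |*a*|(1 − |*x*|) ≥ 0, with equality at *a* = 0. Consequently,

$$K(f_0) = \{ x \in [-1,1] \mid x^2 = 1 \} = \{-1, 1\} = \partial_e K,$$

whereas, for example, at the midpoint 0 = ½ · (−1) + ½ · 1 one has *f̂*<sub>0</sub>(0) = 1 > 0 = *f*<sub>0</sub>(0).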

In turn, the existence of some measure μ in (B.182) representing an arbitrary point ϕ ∈ *K* implies the Krein–Milman Theorem. We rewrite (B.182) as

$$\hat{v}(\varphi) = \int_{\partial_e K^{-}} d\mu \, \hat{v}, \tag{B.198}$$

where ϕ ∈ *K* is arbitrary and *v̂* ∈ *C*(*K*) is the (affine) continuous function on *K* ⊂ *V*<sup>∗</sup> induced by the functional *v̂* ∈ *V*<sup>∗∗</sup> on *V*<sup>∗</sup> defined by *v* ∈ *V* under the canonical injection *V* ↪ *V*<sup>∗∗</sup>, *v* ↦ *v̂*, see Proposition B.44. From (B.198) and (B.34) we obtain

$$|\hat{v}(\varphi)| \le \|\hat{v}\|_{\infty}^{(\partial_e K^{-})},$$

which, because ∂<sub>*e*</sub>*K*<sup>−</sup> ⊂ (co(∂<sub>*e*</sub>*K*))<sup>−</sup>, also gives the inequality

$$|\hat{v}(\varphi)| \le \|\hat{v}\|_{\infty}^{((\mathrm{co}(\partial_e K))^{-})}.$$

This forces ϕ ∈ (co(∂<sub>*e*</sub>*K*))<sup>−</sup>, for if ϕ ∉ (co(∂<sub>*e*</sub>*K*))<sup>−</sup> we would obtain a contradiction with Theorem B.43 (which is a version of the Hahn–Banach Theorem), or more precisely, with the alternative version thereof stated after its proof, with *A* = {ϕ} closed and *B* = (co(∂<sub>*e*</sub>*K*))<sup>−</sup> compact and convex (and, of course, ϕ ↝ −ϕ). Therefore, *K* ⊆ (co(∂<sub>*e*</sub>*K*))<sup>−</sup>, which implies (B.165).

If only to illustrate Choquet's Theorem, we note that existence of the probability measure μ in the Riesz Representation Theorem B.15 follows from it. To see this, fix some compact Hausdorff space *X*, and take *V* = *C*(*X*, R) (as a real Banach space in the supremum-norm) and *K* = *S*(*C*(*X*, R)) ⊂ *V*<sup>∗</sup>, i.e., the set of positive linear functionals ϕ : *C*(*X*, R) → R that satisfy ϕ(1<sub>*X*</sub>) = 1. By the argument following Definition 1.14, *K* coincides with the state space *S*(*C*(*X*)) of the commutative C\*-algebra *C*(*X*), which is a *complex* Banach space (cf. Appendix C), in that each ϕ ∈ *K* extends uniquely to a state ϕ : *C*(*X*) → C by complex linearity, which extension remains positive in the sense of Definition C.3. From Propositions C.14 and C.19, the map *X* → *V*<sup>∗</sup> given by *x* ↦ ev<sub>*x*</sub>, where ev<sub>*x*</sub>(*f*) = *f*(*x*) is the evaluation map at *x*, takes values in ∂<sub>*e*</sub>*K* and yields a homeomorphism

$$
\partial_e K \cong X. \tag{B.199}
$$

In particular, ∂*eK* is closed in *V*<sup>∗</sup> (and in *K*), so (B.182) comes down to (B.39).

The part of Theorem B.15 that does not follow from Theorem B.53 is the possible uniqueness of the measure μ on ∂*eK*<sup>−</sup> that represents the point ϕ ∈ *K*. Uniqueness of the measure in Choquet's Theorem is settled by the following notion.

Definition B.54. *A* (Choquet) simplex *is a compact convex set K* ⊂ *V*<sup>∗</sup> *whose associated convex cone K̃* = R<sup>+</sup> · *K* ≡ {*t*ω | *t* ≥ 0, ω ∈ *K*} *(cf. Definition C.50) is a lattice in the partial ordering* ≤ *defined by* ρ ≤ σ *iff* σ − ρ ∈ *K̃.*

Here we assume that for any ρ ∈ *K̃* there is a unique *t* ∈ R<sup>+</sup> and ω ∈ *K* such that *t*ω = ρ; this is the case if *K* = *K̃* ∩ *H* for some closed hyperplane *H* in *V*<sup>∗</sup> that does not contain the origin. For example, if *K* = *S*(*A*) is the state space of some unital C\*-algebra *A*, then *H* = {ϕ ∈ *A*<sup>∗</sup> | ϕ(1<sub>*A*</sub>) = 1} and *K̃* = {ϕ ∈ *A*<sup>∗</sup> | ϕ ≥ 0}.

In finite dimension, Choquet simplices are special convex polytopes called *simplices*. Recall that the so-called *regular polyhedra* were classified (up to affine isomorphism) by Schläfli in 1852, who showed that the only possibilities are:


An *n*-dimensional simplex is affinely homeomorphic to the convex hull of *n*+1 *linearly independent* points (or, equivalently, |∂<sub>*e*</sub>*K*| = *n*+1). In particular, the simplex Δ<sub>*n*</sub> is the set Pr(*n*+1) of all probability distributions on a set *X* = *n*+1 of cardinality *n*+1, cf. Definition 1.9. Generalizing this idea, if *X* is a compact Hausdorff space, then the state space *S*(*C*(*X*)) of the associated commutative C\*-algebra *C*(*X*), which as we know consists of all probability measures on *X*, is a Choquet simplex.

In the notation of Theorem B.53, the simplest result (again due to Choquet) is:

Theorem B.55. *Suppose K is metrizable, and assume* supp(μ) ⊆ ∂<sub>*e*</sub>*K in* (B.182)*. Then* μ *is uniquely determined by its barycenter* ϕ *iff K is a Choquet simplex.*

However, we note that without any assumption on *K*, conversely the barycenter ϕ for which (B.182) holds for all *v* ∈ *V* is uniquely determined by μ. This observation gives rise to a map *B* from the compact convex set *M*<sub>1</sub><sup>+</sup>(*K*) of all probability measures on *K* to *K* itself, such that *B*(μ) is the unique point in *K* such that (B.198) with ϕ = *B*(μ) holds for all *v* ∈ *V*. This map *B* is, in fact, affine as well as continuous.
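For finitely supported measures on a convex subset of R<sup>2</sup> the barycenter map is just a weighted average, and both its affinity and the failure of uniqueness outside simplices can be checked directly. In the Python sketch below a measure is represented as a list of (weight, point) pairs; this representation and the names `barycenter` and `mix` are illustrative assumptions.

```python
def barycenter(measure):
    """Barycenter of a finitely supported probability measure [(weight, point), ...]."""
    dim = len(measure[0][1])
    return tuple(sum(w * x[i] for w, x in measure) for i in range(dim))


def mix(t, mu, nu):
    """Convex combination t*mu + (1 - t)*nu of two such measures."""
    return [(t * w, x) for w, x in mu] + [((1 - t) * w, x) for w, x in nu]


# Two different measures supported on the vertices of the unit square ...
mu = [(0.5, (0.0, 0.0)), (0.5, (1.0, 1.0))]
nu = [(0.5, (1.0, 0.0)), (0.5, (0.0, 1.0))]
# ... with the same barycenter, the centre of the square.
print(barycenter(mu), barycenter(nu))

# Affinity: B(t*rho + (1-t)*nu) = t*B(rho) + (1-t)*B(nu).
rho = [(1.0, (0.0, 0.0))]
t = 0.25
lhs = barycenter(mix(t, rho, nu))
rhs = tuple(t * a + (1 - t) * b for a, b in zip(barycenter(rho), barycenter(nu)))
print(lhs, rhs)
```

The square has four extreme points in dimension two and hence is not a simplex, which is consistent with Theorem B.55: the centre is the barycenter of two different measures supported on ∂<sub>*e*</sub>*K*.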

Theorem B.55 covers finite phase spaces in classical mechanics as well as, negatively, finite-dimensional Hilbert spaces in quantum mechanics: in the former case, any state admits a unique decomposition into pure states (cf. Proposition 1.13), whereas in the latter this fails. For example, for *H* = C<sup>2</sup>, the state space *S*(*B*(*H*)) ≅ *B*<sup>3</sup> (see Proposition 2.9) is not a simplex. See also Proposition 2.14.
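The failure of uniqueness for *H* = C<sup>2</sup> is visible in the most elementary example: the maximally mixed state is an equal mixture of the two projections onto the standard basis vectors, but also of the projections onto their normalized sum and difference. The Python sketch below checks this with plain 2×2 real matrices, identifying states with density matrices (an identification assumed here for illustration); the helper names are ours.

```python
def projector(psi):
    """Rank-one projection |psi><psi| for a real unit vector psi in C^2."""
    return [[psi[i] * psi[j] for j in range(2)] for i in range(2)]


def convex_sum(weights, matrices):
    """Entrywise convex combination of 2x2 matrices."""
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(2)]
        for i in range(2)
    ]


s = 2 ** -0.5  # 1/sqrt(2)

# Mixture of the pure states given by the standard basis vectors.
rho_z = convex_sum([0.5, 0.5], [projector((1.0, 0.0)), projector((0.0, 1.0))])
# Mixture of the pure states given by (e1 + e2)/sqrt(2) and (e1 - e2)/sqrt(2).
rho_x = convex_sum([0.5, 0.5], [projector((s, s)), projector((s, -s))])

print(rho_z)
print(rho_x)  # the same matrix up to rounding: half the identity
```

In the picture *S*(*B*(*H*)) ≅ *B*<sup>3</sup> this says that the centre of the ball is the midpoint of more than one pair of antipodal points on *S*<sup>2</sup>.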

To explain the general (i.e., non-metrizable) case, we first define the *Choquet ordering* ≺ on the set of probability measures on *K* by μ ≺ ν iff μ(*f*) ≤ ν(*f*) for any *convex* function *f* ∈ *C*(*K*, R). Noting that *B*(μ) = *B*(ν) whenever μ ≺ ν, the idea is that since the values of convex functions almost by definition increase towards the boundary ∂<sub>*e*</sub>*K*, probability measures on *K* with given barycenter that are maximal with respect to ≺ should be supported on ∂<sub>*e*</sub>*K* (such maximal measures always exist by a Zorn's Lemma argument). This intuition is indeed correct, *provided K is metrizable*, in which case, conversely, the condition supp(μ) ⊆ ∂<sub>*e*</sub>*K* in Theorem B.55 forces μ to be maximal. In general, an alternative way to prove the first part of Theorem B.53 would be to take some maximal μ with given barycenter ϕ.

The key to the generalization of Theorem B.55 to the possibly non-metrizable case, then, is to replace the assumption supp(μ) ⊆ ∂*eK* by maximality of μ. This is achieved by the major *Choquet–Meyer Theorem*, which we state without proof:

Theorem B.56. *Assume the measure* μ *in* (B.182) *is maximal with respect to* ≺*. Then* μ *is uniquely determined by its barycenter* ϕ *iff K is a Choquet simplex.*

#### B.12 A précis of infinite-dimensional Hilbert space

The main difference between infinite-dimensional Hilbert spaces and their finite-dimensional counterparts lies in issues of convergence and completeness. Every linear subspace of a finite-dimensional Hilbert space is automatically complete (cf. Proposition B.5), and all sums one encounters are finite. In infinite dimension, ℓ<sub>*c*</sub>(N) is a linear but incomplete subspace of ℓ<sup>2</sup>(N), and similarly for *C<sub>c</sub>*(R) ⊂ *L*<sup>2</sup>(R); the expansion of some vector in terms of a basis already involves an infinite sum.

Note that in metric spaces a subset is closed iff it is sequentially complete (in that it contains all limits of Cauchy sequences); this can be seen from the fact that the metric topology is generated by ε-balls and hence by (1/*n*)-balls, *n* ∈ N. Consequently, in Banach spaces (and hence in Hilbert spaces) *H*, the property of some subspace *L* ⊂ *H* being (metrically) *complete* (in the sense that every Cauchy sequence in *L* converges to an element of *L*) is the same as *L* being (topologically) closed (in the sense that the set-theoretic complement *L<sup>c</sup>* is open). Following tradition in functional analysis, we will henceforth speak of *closed* subspaces. We denote the (metric or topological) closure of *S* ⊂ *H* in *H* by *S*<sup>−</sup>.

An exhaustive way of guaranteeing that some linear subspace *L* ⊂ *H* is closed is to exhibit it as an *orthogonal complement L* = *S*<sup>⊥</sup>, where *S* ⊂ *H* is *any* subset: we write ψ ⊥ *S* iff ⟨χ,ψ⟩ = 0 for each χ ∈ *S*, and, as in (A.29), put

$$S^{\perp} = \{ \psi \in H \mid \psi \perp S \}. \tag{B.200}$$

We also use the *double orthogonal complement S*⊥⊥ ≡ (*S*⊥)⊥, *et cetera*.

Proposition B.57. *Let H be a Hilbert space.*

*1. If S* ⊂ *H is any subset, S*<sup>⊥</sup> *is a closed linear subspace of H.*

*2. For each* closed *linear subspace L* ⊂ *H, one has*

$$H = L \oplus L^{\perp},\tag{B.201}$$

*in the sense that*

$$L \cap L^{\perp} = \{0\},\tag{\text{B.202}}$$

*and each vector* ψ ∈ *H has a* unique *decomposition*

$$
\psi = \psi^{\parallel} + \psi^{\perp},
\tag{B.203}
$$

*where* ψ<sup>∥</sup> ∈ *L and* ψ<sup>⊥</sup> ∈ *L*<sup>⊥</sup>*.*


$$L^{\perp \perp} = L^-,\tag{\text{B.204}}$$

*and hence L*<sup>−</sup> = *H iff* ⟨ψ,ϕ⟩ = 0 *for each* ϕ ∈ *L implies* ψ = 0*.*


*Proof.* 1. Linearity of *S*<sup>⊥</sup> follows from linearity of the inner product. If ψ<sub>*n*</sub> ∈ *S*<sup>⊥</sup> and ψ<sub>*n*</sub> → ψ, then for χ ∈ *S* and each *n*, we have

$$|\langle \chi, \psi \rangle| = |\langle \chi, \psi - \psi_n \rangle| \le \|\chi\| \|\psi - \psi_n\|. \tag{B.205}$$

Taking *n* → ∞ gives ⟨χ,ψ⟩ = 0 and hence ψ ∈ *S*<sup>⊥</sup>, so that *S*<sup>⊥</sup> is closed.

2. The proof of the infinite-dimensional case (cf. Corollary A.9 for finite dimension) relies on *Riesz Lemma* B.58 below, which explains why *L* needs to be *closed*, and also neatly identifies ψ<sup>∥</sup> as the unique vector in *L* at minimal distance to ψ. Granting this important lemma, for ψ ∈ *H* we take

$$C = \psi + L \equiv \{\psi + \varphi, \varphi \in L\}. \tag{B.206}$$

Lemma B.58 yields a unique vector χ<sub>0</sub> ∈ *C* of minimal norm, from which we define ψ<sup>∥</sup> = ψ − χ<sub>0</sub> and ψ<sup>⊥</sup> = χ<sub>0</sub> (so that ‖ψ − ψ<sup>∥</sup>‖ = ‖χ<sub>0</sub>‖ is minimal). Then ψ<sup>∥</sup> ∈ *L*, and (B.203) holds by construction. To show that χ<sub>0</sub> ∈ *L*<sup>⊥</sup>, we rewrite the inequality ‖χ<sub>0</sub>‖ ≤ ‖ψ + ϕ‖ (for all ϕ ∈ *L*) as ‖χ<sub>0</sub>‖ ≤ ‖χ<sub>0</sub> + ϕ‖, since ψ = χ<sub>0</sub> + ψ<sup>∥</sup> and ψ<sup>∥</sup> ∈ *L*. Putting ϕ = −(⟨ζ,χ<sub>0</sub>⟩/‖ζ‖<sup>2</sup>)ζ, with ζ ∈ *L* arbitrary (but nonzero), the last inequality (squared) reads 0 ≤ −|⟨ζ,χ<sub>0</sub>⟩|<sup>2</sup>/‖ζ‖<sup>2</sup>, whence ⟨ζ,χ<sub>0</sub>⟩ = 0 for all ζ ∈ *L*, so that χ<sub>0</sub> ∈ *L*<sup>⊥</sup>. Uniqueness of the decomposition (B.203) follows as in Corollary A.9.


Lemma B.58. *The norm assumes a unique minimum on any closed convex set C* ⊂ *H (i.e., there is a unique* χ<sub>0</sub> ∈ *C such that* ‖χ<sub>0</sub>‖ < ‖χ‖ *for each* χ ∈ *C,* χ ≠ χ<sub>0</sub>*).*

*Proof.* Let μ = inf{‖χ‖, χ ∈ *C*}, which exists, as ‖χ‖ ≥ 0. Hence there is a minimizing sequence (χ<sub>*n*</sub>) in *C* with ‖χ<sub>*n*</sub>‖ → μ, which we now prove to be Cauchy (in *H*). Since *C* is convex, ½(χ<sub>*n*</sub> + χ<sub>*m*</sub>) ∈ *C*, and therefore ‖χ<sub>*n*</sub> + χ<sub>*m*</sub>‖ ≥ 2μ. Thus

$$0 \le \|\chi_n - \chi_m\|^2 = 2(\|\chi_n\|^2 + \|\chi_m\|^2) - \|\chi_n + \chi_m\|^2 \le 2(\|\chi_n\|^2 + \|\chi_m\|^2) - 4\mu^2,$$

and since 2(‖χ<sub>*n*</sub>‖<sup>2</sup> + ‖χ<sub>*m*</sub>‖<sup>2</sup>) → 4μ<sup>2</sup> as *n*,*m* → ∞, we must have ‖χ<sub>*n*</sub> − χ<sub>*m*</sub>‖ → 0. Since *C* is closed, χ<sub>*n*</sub> → χ<sub>0</sub> for some χ<sub>0</sub> ∈ *C*. To prove uniqueness, let another minimizing sequence (χ′<sub>*n*</sub>) converge to χ′<sub>0</sub> ∈ *C*. Then ½(χ<sub>0</sub> + χ′<sub>0</sub>) ∈ *C*, so we obtain

$$\|\chi_0 + \chi'_0\| \ge 2\mu = \|\chi_0\| + \|\chi'_0\|.$$

The inequality ‖χ<sub>0</sub> + χ′<sub>0</sub>‖ ≤ ‖χ<sub>0</sub>‖ + ‖χ′<sub>0</sub>‖ gives ‖χ<sub>0</sub> + χ′<sub>0</sub>‖ = ‖χ<sub>0</sub>‖ + ‖χ′<sub>0</sub>‖, i.e. Re⟨χ′<sub>0</sub>,χ<sub>0</sub>⟩ = ‖χ′<sub>0</sub>‖‖χ<sub>0</sub>‖. Cauchy–Schwarz gives |⟨χ′<sub>0</sub>,χ<sub>0</sub>⟩| ≤ ‖χ′<sub>0</sub>‖‖χ<sub>0</sub>‖ with equality iff χ′<sub>0</sub> and χ<sub>0</sub> are proportional, so the previous equality can hold only if χ′<sub>0</sub> = *t*χ<sub>0</sub> for some *t* ≥ 0. Since χ′<sub>0</sub> and χ<sub>0</sub> both minimize the norm, we have *t* = 1. □
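In finite dimension the decomposition (B.203) and the minimal-distance property behind Lemma B.58 can be checked directly. The sketch below is only an illustration under stated assumptions: it works in C<sup>3</sup>, takes *L* to be the span of two orthonormal vectors chosen arbitrarily, and uses the inner product that is linear in its second argument; the function names `inner`, `norm`, `combo` and `decompose` are hypothetical helpers.

```python
import math

def inner(phi, psi):
    """Inner product on C^n, linear in the second argument."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

def norm(psi):
    return math.sqrt(inner(psi, psi).real)

def combo(coeffs, vectors):
    """Finite linear combination sum_k c_k v_k."""
    n = len(vectors[0])
    return [sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(n)]

def decompose(psi, onb_of_L):
    """Return (psi_par, psi_perp), with psi_par in L = span(onb_of_L) and psi_perp in L-perp."""
    coeffs = [inner(e, psi) for e in onb_of_L]
    par = combo(coeffs, onb_of_L)
    perp = [p - q for p, q in zip(psi, par)]
    return par, perp

s = 1 / math.sqrt(2)
e1 = [1 + 0j, 0j, 0j]
e2 = [0j, s + 0j, s * 1j]          # orthonormal to e1
psi = [1 + 2j, 3 + 0j, -1j]

par, perp = decompose(psi, [e1, e2])

# psi_perp is orthogonal to L, and psi_par is at least as close to psi
# as another (arbitrarily chosen) element of L.
other = combo([0.3, -1 + 1j], [e1, e2])
dist_par = norm(perp)
dist_other = norm([p - q for p, q in zip(psi, other)])
print(abs(inner(e1, perp)), abs(inner(e2, perp)), dist_par <= dist_other)
```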

We now turn to the important concept of a *basis* of a Hilbert space; as in the previous appendix, *a basis of a Hilbert space always denotes an orthonormal basis*. To define this notion, we first say that some subset {υ<sub>*i*</sub>}<sub>*i*∈*I*</sub> of *H* is *orthonormal* if

$$
\langle \upsilon_i, \upsilon_j \rangle = \delta_{ij}; \tag{B.207}
$$

this condition guarantees that the υ<sub>*i*</sub> are linearly independent (and easy to calculate with!). Second, in finite dimension (where *I* must be finite) we may simply define a basis of *H* as an orthonormal set that is also a basis in the usual (linear algebra) sense. This idea remains valid for general Hilbert spaces, except that we should use Definition B.6 to define infinite sums (and Lemma B.7 to analyze them). Theorem B.61 to come gives an exhaustive account of the situation, but we first need a lemma on general orthonormal sets (that do not necessarily form a basis).

Lemma B.59. *If* {υ<sub>*i*</sub>}<sub>*i*∈*I*</sub> *is an orthonormal set in H and c<sub>i</sub>* ∈ C*, then the sum*

$$\psi = \sum_{i \in I} c_i \upsilon_i \tag{B.208}$$

*converges in H (in the sense of Definition B.6) iff*

$$\sum_{i \in I} |c_i|^2 < \infty. \tag{B.209}$$

*If this is the case, the coefficients c<sub>i</sub>* ∈ C *are given by*

$$c_i = \langle \upsilon_i, \psi \rangle. \tag{B.210}$$

*Proof.* The first claim follows from Proposition B.8 and the elementary computation

$$\Big\|\sum_{i\in G'} c_i \upsilon_i\Big\|^2 = \sum_{i\in G'} \|c_i \upsilon_i\|^2 = \sum_{i\in G'} |c_i|^2 < \varepsilon,\tag{B.211}$$

where *G*′ is finite, so that the sums ∑<sub>*i*∈*I*</sub> *c<sub>i</sub>*υ<sub>*i*</sub> and ∑<sub>*i*∈*I*</sub> |*c<sub>i</sub>*|<sup>2</sup> either both exist (i.e., converge) or both do not exist. When *I* is countable this follows more simply by noting that ∑<sub>*i*∈N</sub> *c<sub>i</sub>*υ<sub>*i*</sub> converges iff (*s<sub>n</sub>*) is a Cauchy sequence, where *s<sub>n</sub>* = ∑<sup>*n*</sup><sub>*i*=1</sub> *c<sub>i</sub>*υ<sub>*i*</sub>, and computing ‖*s<sub>n</sub>* − *s<sub>m</sub>*‖<sup>2</sup> = ∑<sup>*n*</sup><sub>*i*=*m*+1</sub> |*c<sub>i</sub>*|<sup>2</sup>, where *n* > *m*. To prove (B.210) on the assumption that (B.208) exists, by the Cauchy–Schwarz inequality, for any ε > 0 and any sufficiently large finite *G* ⊂ *I* containing *j*,

$$\begin{aligned} |\langle \upsilon_j, \psi \rangle - c_j| &= \Big|\Big\langle \upsilon_j, \psi - \sum_{l \in G} c_l \upsilon_l + \sum_{l \in G} c_l \upsilon_l \Big\rangle - c_j\Big| \\ &= \Big|\Big\langle \upsilon_j, \psi - \sum_{l \in G} c_l \upsilon_l \Big\rangle\Big| \leq \|\upsilon_j\| \Big\|\psi - \sum_{l \in G} c_l \upsilon_l\Big\| < \varepsilon, \end{aligned}$$

where we used Definition B.6 as well as ‖υ<sub>*j*</sub>‖ = 1. Letting ε → 0 yields (B.210). □

Lemma B.60. *Let* {υ<sub>*i*</sub>}<sub>*i*∈*I*</sub> *be an orthonormal set in H. We have* Bessel's Inequality

$$\sum_{i \in I} |\langle \upsilon_i, \psi \rangle|^2 \le \|\psi\|^2 \quad (\psi \in H).\tag{B.212}$$

*Proof.* For any finite *G* ⊂ *I*, a computation based on (A.2) yields

$$\sum_{i \in G} |\langle \upsilon_i, \psi \rangle|^2 = \|\psi\|^2 - \Big\|\psi - \sum_{i \in G} \langle \upsilon_i, \psi \rangle \upsilon_i\Big\|^2 \le \|\psi\|^2. \tag{B.213}$$

It follows that also the supremum of the left-hand side over all finite subsets *G* ⊂ *I* is bounded by ‖ψ‖<sup>2</sup> and hence is finite. By Lemma B.7, this supremum equals ∑<sub>*i*∈*I*</sub> |⟨υ<sub>*i*</sub>,ψ⟩|<sup>2</sup>, which gives (B.212). □
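The identity (B.213), and with it Bessel's Inequality for a finite orthonormal set, can be verified numerically. The following sketch makes its own choices, none of which come from the text: it works in C<sup>8</sup>, uses the discrete Fourier vectors υ<sub>*k*</sub>(*x*) = e<sup>2πi*kx*/*N*</sup>/√*N* as orthonormal set, and picks an arbitrary test vector and an arbitrary finite index set `G`.

```python
import cmath
import math

N = 8  # dimension of the ambient space C^N (an arbitrary choice)

def inner(phi, psi):
    """Inner product on C^N, linear in the second argument."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

def fourier_vector(k):
    """Orthonormal vector with components exp(2*pi*i*k*x/N)/sqrt(N), x = 0..N-1."""
    return [cmath.exp(2j * math.pi * k * x / N) / math.sqrt(N) for x in range(N)]

psi = [complex(x, 1 - x) for x in range(N)]   # arbitrary test vector
G = [0, 1, 3]                                  # finite index set, not a full basis

coeffs = {k: inner(fourier_vector(k), psi) for k in G}
partial = [sum(coeffs[k] * fourier_vector(k)[x] for k in G) for x in range(N)]
residual = [p - q for p, q in zip(psi, partial)]

lhs = sum(abs(c) ** 2 for c in coeffs.values())   # left-hand side of (B.213)
norm_sq = inner(psi, psi).real
rhs = norm_sq - inner(residual, residual).real    # middle expression of (B.213)

print(abs(lhs - rhs) < 1e-9, lhs <= norm_sq)
```

Since `G` omits most of the Fourier vectors, the inequality is expected to be strict here; equality for a full basis is the content of Parseval's equality (B.214).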

Theorem B.61. *Let B* = {υ<sub>*i*</sub>}<sub>*i*∈*I*</sub> *be an orthonormal subset of a Hilbert space H. The following conditions are equivalent (and each defines B to be a* basis *of H):*

*1. Any* ψ ∈ *H can be written (in the sense of Definition B.6) as* ψ = ∑<sub>*i*∈*I*</sub> *c<sub>i</sub>*υ<sub>*i*</sub>*.*

*2. For each* ψ ∈ *H, one has* Parseval's equality

$$\sum_{i \in I} |\langle \upsilon_i, \psi \rangle|^2 = \|\psi\|^2. \tag{B.214}$$

*3. For any* ψ,ϕ ∈ *H one has*

$$
\langle \varphi, \psi \rangle = \sum_{i \in I} \langle \varphi, \upsilon_i \rangle \langle \upsilon_i, \psi \rangle. \tag{B.215}
$$


Note that (B.215) is used in almost every computation in quantum physics, in which one also typically has ‖ψ‖ = 1. In that case, (B.214) at least formally turns the |*c<sub>i</sub>*|<sup>2</sup> = |⟨υ<sub>*i*</sub>,ψ⟩|<sup>2</sup> into (Born) probabilities, as discussed throughout the main text.

*Proof.* Assuming (B.208) and hence (B.210), take ε > 0 and find a finite *F* ⊂ *I* so that ‖ψ − ∑<sub>*i*∈*G*</sub> *c<sub>i</sub>*υ<sub>*i*</sub>‖ < ε for each finite *G* ⊇ *F*. By (B.213), this gives

$$\Big|\sum_{i \in G} |\langle \upsilon_i, \psi \rangle|^2 - \|\psi\|^2\Big| < \varepsilon^2. \tag{B.216}$$

Hence (B.214) holds in the sense of Definition B.6 (with *V* = C). Conversely, assuming (B.214), eq. (B.213) gives (B.208). This proves the equivalence 1 ↔ 2.

Clearly, (B.214) is a special case of (B.215), which in turn follows from (B.208) with (B.210) and continuity of the inner product, whence 3 → 2 and 1 → 3.

Furthermore, 1 → 5 follows by contradiction: given (B.210), any nonzero vector ψ ∈ *B*<sup>⊥</sup> could not possibly be written as (B.208). Conversely, 5 → 1 most easily follows by contradiction, too. For any ψ ∈ *H*, the sum ϕ = ∑<sub>*i*∈*I*</sub> ⟨υ<sub>*i*</sub>,ψ⟩υ<sub>*i*</sub> exists in *H* by Lemma B.59 (using Bessel's Inequality). Continuity of the inner product yields ⟨υ<sub>*j*</sub>,ϕ⟩ = ⟨υ<sub>*j*</sub>,ψ⟩ and hence ⟨υ<sub>*j*</sub>,ϕ − ψ⟩ = 0 for each *j* ∈ *I*, whence ϕ − ψ ∈ *B*<sup>⊥</sup>. If ψ cannot be written in the form (B.208) we have ϕ ≠ ψ, so *B*<sup>⊥</sup> ≠ {0}, which is the desired contradiction.

Finally, 4 ↔ 5 is tautological, 5 ↔ 6 is trivial, and 6 ↔ 7 is a special case of Proposition B.57.6 (hence this proposition is needed only for no. 7). □
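As a small illustration of Parseval's equality (B.214) and of the expansion (B.215) in the simplest quantum-mechanical setting, the sketch below takes *H* = C<sup>2</sup> with a basis and two vectors chosen purely for illustration (they are assumptions, not part of the text), and checks that the Born probabilities of a unit vector add up to one.

```python
import math

def inner(phi, psi):
    """Inner product on C^n, linear in the second argument."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

s = 1 / math.sqrt(2)
basis = [[s, s], [s, -s]]      # an orthonormal basis of C^2
psi = [0.6, 0.8j]              # unit vector: 0.36 + 0.64 = 1
phi = [1j, 0]                  # a second vector, for (B.215)

# Born probabilities |<v_i, psi>|^2; by (B.214) they sum to ||psi||^2 = 1.
probabilities = [abs(inner(v, psi)) ** 2 for v in basis]

# Right-hand side of (B.215), to be compared with <phi, psi>.
expansion = sum(inner(phi, v) * inner(v, psi) for v in basis)

print(sum(probabilities), abs(expansion - inner(phi, psi)) < 1e-12)
```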

For example, if *H* = ℓ<sup>2</sup>(*S*), then one may take *I* = *S*, with υ<sub>*x*</sub> = δ<sub>*x*</sub>. Since *S* is an arbitrary set, this example shows that any cardinality of *I* may, in principle, occur. The existence of a basis has a remarkable consequence, for which we need:

Definition B.62. *Two Hilbert spaces H*<sub>1</sub> *and H*<sub>2</sub> *are called* isomorphic*, written H*<sub>1</sub> ≅ *H*<sub>2</sub>*, if they are isometrically isomorphic, that is, if there is an invertible linear map u* ∈ *B*(*H*<sub>1</sub>,*H*<sub>2</sub>) *such that*

$$\|u\psi\|_{H_2} = \|\psi\|_{H_1} \quad (\psi \in H_1). \tag{B.217}$$

By Theorem A.3, a specific surjective isometry *u* : *H*<sub>1</sub> → *H*<sub>2</sub> implementing an isomorphism is automatically *unitary*, in that it is surjective and satisfies

$$
\langle u\psi, u\varphi \rangle_{H_2} = \langle \psi, \varphi \rangle_{H_1}. \tag{B.218}
$$

Conversely, a unitary map is an isometric isomorphism, so that isometric isomorphism of Hilbert spaces (seen as Banach spaces) is the same as unitary isomorphism. The following theorem (due to von Neumann, who was a specialist in both Hilbert space theory and axiomatic set theory) shows that the classification of Hilbert spaces up to isomorphism reduces to the classification of sets up to bijection.

#### Theorem B.63. *1. Any Hilbert space has a basis.*


Specifically, clause 2 states that if (υ<sub>*i*</sub>)<sub>*i*∈*I*</sub> and (υ′<sub>*j*</sub>)<sub>*j*∈*J*</sub> are both bases of *H*, then *I* ≅ *J* as sets (i.e., there is a bijection *I* → *J*). Similarly, clause 3 states that *H*<sub>1</sub> ≅ *H*<sub>2</sub> iff *H*<sub>1</sub> has a basis (υ<sub>*i*</sub>)<sub>*i*∈*I*</sub> and *H*<sub>2</sub> has a basis (υ′<sub>*j*</sub>)<sub>*j*∈*J*</sub> for which *I* ≅ *J*.


$$|I| = \sum_{i \in I} \|\upsilon_i\|^2 = \sum_{i \in I} \sum_{j \in J} |\langle \upsilon'_j, \upsilon_i \rangle|^2 = \sum_{i \in I} \sum_{j \in J} \langle \upsilon'_j, \upsilon_i \rangle \langle \upsilon_i, \upsilon'_j \rangle = \sum_{j \in J} \|\upsilon'_j\|^2 = |J|.$$

A similar computation excludes the possibility that *I* is countable and *J* is not. The general case relies on some cardinal arithmetic, which we spare the reader.

3. Let {υ<sub>*i*</sub>}<sub>*i*∈*I*</sub> be a basis of *H* and let {υ′<sub>*j*</sub>}<sub>*j*∈*J*</sub> be a basis of *H*′. Assume *I* ≅ *J*, so that there is a bijection *b* : *I* → *J*. Define *u* : *H* → *H*′ and *v* : *H*′ → *H* by linear extension of *u*υ<sub>*i*</sub> = υ′<sub>*b*(*i*)</sub> and *v*υ′<sub>*j*</sub> = υ<sub>*b*<sup>−1</sup>(*j*)</sub>, that is,

$$
u\psi = \sum_{i \in I} \langle \upsilon_i, \psi \rangle \upsilon'_{b(i)} = \sum_{j \in J} \langle \upsilon_{b^{-1}(j)}, \psi \rangle \upsilon'_j; \tag{B.219}
$$

$$v\psi' = \sum_{j \in J} \langle \upsilon'_j, \psi' \rangle' \upsilon_{b^{-1}(j)} = \sum_{i \in I} \langle \upsilon'_{b(i)}, \psi' \rangle' \upsilon_i,\tag{B.220}$$

where in each line the first equality sign is the definition of the map, whilst the second is a useful rewriting. These maps are well defined by Lemma B.59, e.g.,

$$\sum_{j \in J} |\langle \upsilon_{b^{-1}(j)}, \psi \rangle|^2 = \sum_{i \in I} |\langle \upsilon_i, \psi \rangle|^2 = \|\psi\|^2 < \infty,\tag{B.221}$$

so that the sums in (B.219) converge, and likewise for (B.220). Furthermore,

$$
\langle u\psi, u\varphi \rangle' = \sum_{i_1, i_2} \langle \psi, \upsilon_{i_1} \rangle \langle \upsilon_{i_2}, \varphi \rangle \langle \upsilon'_{b(i_1)}, \upsilon'_{b(i_2)} \rangle' = \sum_i \langle \psi, \upsilon_i \rangle \langle \upsilon_i, \varphi \rangle = \langle \psi, \varphi \rangle,
$$

where we used (B.207) for the primed basis, and (B.215). Similar computations establish ⟨*v*ψ′, *v*ϕ′⟩ = ⟨ψ′,ϕ′⟩′, so that (in view of their obvious surjectivity) *u* and *v* are both unitary, as well as *uv* = 1<sub>*H*′</sub> and *vu* = 1<sub>*H*</sub>. Thus *H* ≅ *H*′.

Conversely, if *H* (with basis {υ<sub>*i*</sub>}<sub>*i*∈*I*</sub>) and *H*′ are isomorphic, so that there is a unitary *u* : *H* → *H*′, then {*u*υ<sub>*i*</sub>}<sub>*i*∈*I*</sub> is a basis of *H*′, hence *J* even equals *I*. □

Corollary B.64. *If* {υ<sub>*i*</sub>}<sub>*i*∈*I*</sub> *is a basis of H, then H* ≅ ℓ<sup>2</sup>(*I*)*.*

*Proof.* Define *u* : *H* → ℓ<sup>2</sup>(*I*) by linear extension of *u*υ<sub>*i*</sub> = δ<sub>*i*</sub>, where *i* ∈ *I*. □

Corollary B.65. *A Hilbert space is (topologically) separable iff it either has a countable basis, or is finite-dimensional.*

*Proof.* One direction of the proof is the Gram–Schmidt procedure (since the given countable dense set contains a basis). Conversely, if {υ<sub>*i*</sub>} is a countable (or finite) basis of *H*, then the complex rational linear span of this set, i.e., the set of all finite linear combinations ∑<sub>*i*</sub> *c<sub>i</sub>*υ<sub>*i*</sub> with *c<sub>i</sub>* ∈ Q+*i*Q, is countable as well as dense in *H*. □
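The Gram–Schmidt procedure invoked in this proof can be sketched as follows for finitely many vectors in C<sup>*n*</sup>. This is a minimal illustration under stated assumptions rather than a definitive implementation: the tolerance `tol`, used to discard vectors that are (numerically) linearly dependent on earlier ones, and the sample vectors are arbitrary choices.

```python
import math

def inner(phi, psi):
    """Inner product on C^n, linear in the second argument."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize a finite list of vectors, skipping dependent ones."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = inner(e, w)                      # component of w along e
            w = [wk - c * ek for wk, ek in zip(w, e)]
        length = math.sqrt(inner(w, w).real)
        if length > tol:                         # keep only genuinely new directions
            basis.append([wk / length for wk in w])
    return basis

# The second vector is a multiple of the first and is therefore discarded.
vectors = [[1, 1, 0], [2, 2, 0], [1, 0, 1j], [0, 1, 1]]
onb = gram_schmidt(vectors)
gram = [[inner(a, b) for b in onb] for a in onb]
print(len(onb))
```

Applied successively to an enumeration of a countable dense subset, the same steps produce a countable orthonormal set whose span is dense, which is the direction of Corollary B.65 referred to above.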

In particular, any finite-dimensional Hilbert space is isomorphic to C<sup>*n*</sup> with standard inner product, and any separable Hilbert space is isomorphic to ℓ<sup>2</sup>(N); when speaking of a separable Hilbert space we actually tend to think of the infinite-dimensional case. Although at first sight separability appears to be a rather restrictive condition, in fact the non-separable case only appears in some weird proofs in the theory of operator algebras (as well as in the theory of almost periodic functions in the sense of H. Bohr). Indeed, every Hilbert space naturally occurring in applications to mathematical physics (or to partial differential equations) is separable.

#### B.13 Operators on infinite-dimensional Hilbert space

The fact that all (infinite-dimensional) separable Hilbert spaces are isomorphic suggests that the riches of the theory are not to be found in the spaces themselves, but in the operators that act on them (whose explicit form typically depends on some concrete realization of *H*, like ℓ<sup>2</sup>(N), or *L*<sup>2</sup>(R<sup>*d*</sup>), etc.). The simplest operators are functionals, i.e., linear maps *f* : *H* → C, and the main new feature compared to the finite-dimensional case is that *f* is no longer *necessarily* bounded, see §B.9. The nature of bounded linear functionals, i.e., elements of the dual *H*<sup>∗</sup>, is totally settled by the *Riesz–Fréchet Theorem* (which we already know; cf. Proposition A.5 and nos. 6 and 7 in Table B.1 in §B.9), showing that little is gained by looking at them.

Theorem B.66. *Let H be a Hilbert space. The map* ψ ↦ *f*<sub>ψ</sub> *from H to H*<sup>∗</sup>*, where*

$$f_{\psi}(\varphi) = \langle \psi, \varphi \rangle,\tag{B.222}$$

*is an isometric anti-linear isomorphism H* → *H*<sup>∗</sup>*.*

*Proof.* For convenience we rewrite (B.124) for the case at hand as

$$\|f\| = \sup\{ |f(\psi)|, \psi \in H, \|\psi\|_{H} \le 1 \}. \tag{B.223}$$

Since |*f*<sub>ψ</sub>(ϕ)| = |⟨ψ,ϕ⟩| ≤ ‖ψ‖‖ϕ‖ by Cauchy–Schwarz, it follows that *f*<sub>ψ</sub> ∈ *H*<sup>∗</sup> for any ψ ∈ *H*, with ‖*f*<sub>ψ</sub>‖ ≤ ‖ψ‖. We may sharpen this to equality, i.e.,

$$\|f_{\psi}\| = \|\psi\|,\tag{B.224}$$

by taking *f* = *f*<sub>ψ</sub> in (B.223) and evaluating it at the unit vector ψ/‖ψ‖ (for ψ ≠ 0). Hence ψ ↦ *f*<sub>ψ</sub> is isometric and therefore also injective. To prove surjectivity, we find a vector ψ for which some given functional *f* equals *f*<sub>ψ</sub>. Assume *f* ≠ 0 (otherwise, ψ = 0 does the job). Then ker(*f*)<sup>⊥</sup> ≠ {0}: namely, ker(*f*) is closed by continuity of *f* and is linear by linearity of *f*, whence ker(*f*)<sup>⊥⊥</sup> = ker(*f*) by Proposition B.57.3, so that (arguing by contradiction) ker(*f*)<sup>⊥</sup> = {0} would imply ker(*f*)<sup>⊥⊥</sup> = *H* and hence ker(*f*) = *H*, or *f* = 0.

The remainder of the proof is the same as for Proposition A.5. □

This allows one to make the weak topology on *H* (or, equivalently, the weak<sup>∗</sup> topology on *H*<sup>∗</sup>) explicit (cf. §B.9): we have ψ<sub>*n*</sub> → ψ weakly iff ⟨ϕ,ψ<sub>*n*</sub> − ψ⟩ → 0 for each ϕ ∈ *H* (and similarly for nets). From the general theory, or directly from Cauchy–Schwarz, it is immediate that (at least for infinite-dimensional *H*) the weak topology on *H* is indeed weaker than the strong one (that is, strong convergence implies weak convergence), but not the other way round. A simple example is provided by any ordered countable basis (υ<sub>*n*</sub>)<sub>*n*∈N</sub> of a separable Hilbert space, where υ<sub>*n*</sub> → 0 weakly but not strongly as *n* → ∞ (more generally, for any infinite-dimensional Hilbert space and any basis {υ<sub>*i*</sub>} we have υ<sub>*i*</sub> → 0 weakly but not strongly in the sense of convergence of nets). Nonetheless, as a corollary of Proposition B.46:

Corollary B.67. *The functional f*<sub>ψ</sub> *defined by* (B.222) *is weakly continuous.*
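The weak but not strong convergence of basis vectors mentioned above can be made concrete in ℓ<sup>2</sup>(N) with its standard basis (*e<sub>n</sub>*). The sketch below is illustrative only: it tests weak convergence against a single fixed vector ϕ with components ϕ<sub>*k*</sub> = 1/*k* (square-summable, an arbitrary choice), and stores finitely supported sequences as dictionaries; the helper names are hypothetical.

```python
import math

def e(n):
    """n-th standard basis vector of l^2(N), stored sparsely as {index: value}."""
    return {n: 1.0}

def inner_with(phi, sparse_vec):
    """<phi, psi> for phi given as a function k -> phi_k and psi finitely supported."""
    return sum(phi(k).conjugate() * val for k, val in sparse_vec.items())

def norm(sparse_vec):
    return math.sqrt(sum(abs(val) ** 2 for val in sparse_vec.values()))

def diff(u, v):
    """Difference u - v of two finitely supported sequences."""
    keys = set(u) | set(v)
    return {k: u.get(k, 0.0) - v.get(k, 0.0) for k in keys}

def phi(k):
    return 1.0 / k   # components of a fixed vector in l^2(N)

# The pairings <phi, e_n> = 1/n tend to zero ...
pairings = [inner_with(phi, e(n)) for n in (1, 10, 100, 1000)]
# ... but the norms stay equal to 1, and distinct basis vectors remain sqrt(2) apart,
# so (e_n) is not even a Cauchy sequence in norm.
norms = [norm(e(n)) for n in (1, 10, 100, 1000)]
gap = norm(diff(e(5), e(6)))

print(pairings, norms, gap)
```

For a general ϕ ∈ ℓ<sup>2</sup>(N) the pairings ⟨ϕ,*e<sub>n</sub>*⟩ are the complex conjugates of the components of ϕ, which tend to zero by square-summability; the code only samples one such ϕ.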

We now move from functionals as special operators from *H* to C to operators in the usual sense, i.e., linear maps from *H* to itself. Once again, the main new feature compared to the finite-dimensional case is that a linear map *a* : *H* → *H* is no longer necessarily bounded, where (cf. Definition B.32) we recall that *a* is *bounded* if it satisfies one (and hence both) of the following equivalent conditions:

$$\|a\psi\| \le C\|\psi\| \quad (\psi \in H);\tag{B.225}$$

$$\sup\{\|a\psi\|, \psi\in H, \|\psi\|\le 1\} < \infty.\tag{B.226}$$

In that case, the (finite) supremum is called the *norm* ‖*a*‖ of *a*, exactly as in (A.18). Using Theorem B.66 and (B.130), we therefore have

$$\|a\| = \sup\{\|a\psi\|, \psi \in H, \|\psi\| = 1\}\tag{B.227}$$

$$\phantom{\|a\|} = \sup\{ |\langle \varphi, a\psi \rangle|, \psi, \varphi \in H, \|\psi\| = \|\varphi\| = 1 \},\tag{B.228}$$

and we have the inequalities (A.20) and (A.21), as in the finite-dimensional case.

It is clear from (A.20) and (B.225) that bounded operators *a* are continuous, in that if ψ<sub>*n*</sub> → ψ, then *a*ψ<sub>*n*</sub> → *a*ψ. On the other hand, *unbounded operators* are discontinuous in this sense: for each *n* ∈ N there is ψ<sub>*n*</sub> ∈ *H* with ‖ψ<sub>*n*</sub>‖ = 1 and ‖*a*ψ<sub>*n*</sub>‖ ≥ *n*. The sequence (ψ̃<sub>*n*</sub> = ψ<sub>*n*</sub>/*n*) then converges to zero, but since ‖*a*ψ̃<sub>*n*</sub>‖ ≥ 1, the sequence (*a*ψ̃<sub>*n*</sub>) does not converge to *a* · 0 = 0. Thus on infinite-dimensional Hilbert spaces a sharp distinction emerges between *bounded* and *unbounded* operators.
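The rescaling argument can be traced on a concrete example. As an assumption made purely for illustration, take the operator (*a*ψ)(*n*) = *n*ψ(*n*) acting on finitely supported sequences in ℓ<sup>2</sup>(N); this operator is defined only on a dense subspace rather than on all of *H*, but it shows the same mechanism with ψ<sub>*n*</sub> = *e<sub>n</sub>* the standard basis vectors. The helper names below are hypothetical.

```python
import math

def number_operator(vec):
    """Apply (a psi)(n) = n * psi(n) to a finitely supported sequence {n: value}."""
    return {n: n * val for n, val in vec.items()}

def norm(vec):
    return math.sqrt(sum(abs(val) ** 2 for val in vec.values()))

def scaled_basis_vector(n):
    """The vector e_n / n, where e_n is the n-th standard basis vector."""
    return {n: 1.0 / n}

# The inputs e_n / n shrink to zero in norm, while their images keep norm 1,
# so the images do not converge to a(0) = 0.
for n in (1, 10, 100, 1000):
    v = scaled_basis_vector(n)
    print(n, norm(v), norm(number_operator(v)))
```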

Among the former, we will distinguish between *compact* operators and the rest, whilst among the latter, one has the *closed* operators (i.e., those with a closed graph), which are still reasonably well-behaved, and the (non-closed) rest. Yet cutting through the bounded-unbounded divide is the notion of *self-adjointness*. For any linear (not necessarily bounded) map *a* : *H* → *H*, we say that *a* is *self-adjoint* if

$$
\langle a\varphi, \psi \rangle = \langle \varphi, a\psi \rangle \quad (\varphi, \psi \in H).\tag{B.229}
$$

The remarkable *Hellinger–Toeplitz Theorem* then states that such maps are bounded:

Theorem B.68. *If a linear map a* : *H* → *H satisfies* (B.229)*, then it is bounded.*

*Proof.* The proof is based on the Closed Graph Theorem B.37. If the sequence (ψ<sub>*n*</sub>,*a*ψ<sub>*n*</sub>) in *G*(*a*) ⊂ *H* ⊕ *H* converges, say to (ψ,ϕ) ∈ *H* ⊕ *H*, then ψ<sub>*n*</sub> → ψ and *a*ψ<sub>*n*</sub> → ϕ. Using (B.229) and continuity of the inner product, for χ ∈ *H* we have

$$\langle \chi, \varphi \rangle = \lim_{n} \langle \chi, a\psi_{n} \rangle = \lim_{n} \langle a\chi, \psi_{n} \rangle = \langle a\chi, \psi \rangle = \langle \chi, a\psi \rangle.$$

For χ = ϕ − *a*ψ, this yields ϕ = *a*ψ, and hence (ψ,ϕ) ∈ *G*(*a*). This means that *G*(*a*) is closed, upon which the Closed Graph Theorem states that *a* is bounded. □

More generally, if *V* and *W* are Banach spaces, with dual spaces *V*<sup>∗</sup> and *W*<sup>∗</sup>, respectively, and two linear (but not *a priori* bounded) maps *a* : *V* → *W* and *b* : *W*<sup>∗</sup> → *V*<sup>∗</sup> satisfy ϕ(*av*) = (*b*ϕ)(*v*) for each *v* ∈ *V* and ϕ ∈ *W*<sup>∗</sup>, then *a* and *b* are bounded, with *b* = *a*<sup>∗</sup>, as defined in (B.125). The proof is similar.

This generalization of Theorem B.68 also places the familiar adjoint *a*<sup>∗</sup> from Hilbert space in broader perspective: making the identification *f*<sub>ψ</sub> ↔ ψ of *H*<sup>∗</sup> with *H* described by the Riesz–Fréchet Theorem B.66, the Banach space definition (B.125) of the adjoint *a*<sup>∗</sup> : *H*<sup>∗</sup> → *H*<sup>∗</sup> of a bounded linear map *a* : *H* → *H* reproduces the definition (A.15) of the Hilbert space adjoint *a*<sup>∗</sup> : *H* → *H*. Thus we also infer that (B.128) is valid for arbitrary Hilbert spaces. Note that in the Hilbert space case, boundedness of *a*<sup>∗</sup> may be proved more simply, as follows.

Proposition B.69. *Let a* ∈ *B*(*H*) *and let a*<sup>∗</sup> : *H* → *H be its adjoint, that is,*

$$
\langle a^* \psi, \varphi \rangle = \langle \psi, a\varphi \rangle \quad (\psi, \varphi \in H).\tag{B.230}
$$

*Then a*<sup>∗</sup> *is bounded, with* ‖*a*<sup>∗</sup>‖ = ‖*a*‖*.*

*Proof.* Eq. (B.230) gives |⟨*a*<sup>∗</sup>ψ,ϕ⟩| ≤ ‖*a*‖‖ψ‖‖ϕ‖. Taking ϕ = *a*<sup>∗</sup>ψ yields ‖*a*<sup>∗</sup>ψ‖ ≤ ‖*a*‖‖ψ‖, and hence ‖*a*<sup>∗</sup>‖ ≤ ‖*a*‖. Replacing *a* by *a*<sup>∗</sup> gives the last claim. □

Since unbounded self-adjoint operators *a* : *H* → *H* do not exist, von Neumann defined such operators on some (proper) linear subspace *D*(*a*) ⊆ *H* (*always assumed to be dense in H*), called the *domain* of *a*. This affects the definition of the adjoint:

Definition B.70. *1. The adjoint a*<sup>∗</sup> *of an operator a* : *D*(*a*) → *H has domain D*(*a*<sup>∗</sup>) ⊂ *H consisting of all* ψ ∈ *H for which the functional f*<sup>*a*</sup><sub>ψ</sub> : *D*(*a*) → C*, defined by*

$$f^a_\psi(\varphi) = \langle \psi, a\varphi \rangle \quad (\varphi \in D(a)), \tag{B.231}$$

*is bounded, i.e., there is C* > 0 *such that* |*f*<sup>*a*</sup><sub>ψ</sub>(ϕ)| ≤ *C*‖ϕ‖ *for all* ϕ ∈ *D*(*a*)*.*


$$
\langle a^* \psi, \varphi \rangle = \langle \psi, a\varphi \rangle, \quad \psi \in D(a^*),\ \varphi \in D(a). \tag{B.232}
$$

*Note that, on our assumption that D*(*a*) *be dense in H, i.e., D*(*a*)<sup>−</sup> = *H, eq.* (B.232) *indeed uniquely specifies a*<sup>∗</sup>ψ *because of Proposition B.57.4.*

*4. An operator a* : *D*(*a*) → *H is called* self-adjoint *when D*(*a*∗) = *D*(*a*) *and a*<sup>∗</sup> = *a.*

If *D*(*a*) = *H*, and *a* is bounded, then also *D*(*a*<sup>∗</sup>) = *H*, since |*f*<sup>*a*</sup><sub>ψ</sub>(ϕ)| ≤ ‖*a*‖‖ψ‖‖ϕ‖, so that *f*<sup>*a*</sup><sub>ψ</sub> is bounded for any ψ ∈ *H*. Accordingly, for *a* ∈ *B*(*H*), Definition B.70 reduces to the usual definition (A.15). Furthermore, even if *D*(*a*) is merely dense in *H*, if *a* : *D*(*a*) → *H* is bounded in the sense of (B.225) - (B.226), but now with ψ ∈ *D*(*a*) instead of ψ ∈ *H*, then *a* has a unique extension to a bounded operator *a* : *H* → *H*, whose adjoint *a*<sup>∗</sup> may be either defined through Definition B.70 as the adjoint of *a* : *D*(*a*) → *H*, or, equivalently, as the adjoint of the extension *a* : *H* → *H*.

Here, as well as in Definition B.70.2, a general Banach space principle is at work:

Proposition B.71. *Let V and W be Banach spaces, and let V*′ *be a dense subspace of V. Any bounded linear map a*′ : *V*′ → *W (in the sense of Definition B.32) has a unique* bounded *linear extension a* : *V* → *W, with* ‖*a*‖ = ‖*a*′‖*.*

*Proof.* For *v* ∈ *V* there is a sequence (*v<sub>n</sub>*) in *V*′ with *v<sub>n</sub>* → *v*. Since *a*′ : *V*′ → *W* is bounded and (*v<sub>n</sub>*) is convergent in *V* and hence Cauchy in *V*′, also the sequence (*a*′*v<sub>n</sub>*) in *W* is Cauchy. Since *W* is assumed complete, we may define *av* = lim<sub>*n*</sub> *a*′*v<sub>n</sub>*. This limit is easily seen to be independent of the approximating sequence to *v*, and the ensuing map *a* : *V* → *W* is clearly linear. Furthermore, since by (B.5) we have ‖*v*‖ = lim<sub>*n*</sub> ‖*v<sub>n</sub>*‖, if we assume ‖*v*‖ = 1 we can take *v<sub>n</sub>* to have unit norm also.

Once again from (B.5), we also have ‖*av*‖ = lim<sub>*n*</sub> ‖*a*′*v<sub>n</sub>*‖ ≤ sup<sub>*n*</sub> ‖*a*′*v<sub>n</sub>*‖, whence ‖*a*‖ ≤ ‖*a*′‖. But for *v* ∈ *V*′, taking *v<sub>n</sub>* = *v* we have *a*′*v* = *av*, and hence the bound ‖*a*′*v*‖ ≤ ‖*a*‖‖*v*‖, from which ‖*a*′‖ ≤ ‖*a*‖, so that finally ‖*a*‖ = ‖*a*′‖. □

To complete these basic definitions, we say that an (unbounded) operator *a* : *D*(*a*) → *H* is *closed* if its graph *G*(*a*) = {(ψ,*a*ψ),ψ ∈ *D*(*a*)} is a closed subspace of *H* ⊕*H*, cf. (B.108). Note that in the Hilbert space case it is more appropriate to replace the norm (B.107) on *H* ⊕*H* by the equivalent norm

$$\|(v, w)\| = \sqrt{\|v\|^2 + \|w\|^2},\tag{B.233}$$

since this alternative norm comes from the canonical inner product on *H* ⊕*H*, viz.

$$
\langle (\mathbf{v}, \mathbf{w}), (\mathbf{v}', \mathbf{w}') \rangle\_{H \oplus H} = \langle \mathbf{v}, \mathbf{v}' \rangle\_H + \langle \mathbf{w}, \mathbf{w}' \rangle\_H. \tag{\mathbf{B.234}}
$$

We now prove an important property of self-adjoint operators:

Proposition B.72. *The adjoint a*<sup>∗</sup> *of any operator a* : *D*(*a*) → *H is closed. In particular, self-adjoint operators are closed.*

*Proof.* The proof can be elegantly given in terms of the graph *G*(*a*). Defining

$$u: H \oplus H \to H \oplus H;\tag{B.235}$$

$$
u(\psi_1, \psi_2) = (-\psi_2, \psi_1), \tag{B.236}
$$

it is easy to verify that *u* is a unitary operator, and that

$$G(a^*) = u(G(a)^\perp) = (uG(a))^\perp. \tag{B.237}$$

Hence *G*(*a*<sup>∗</sup>) is closed by Proposition B.57.1, and the claim follows. □

In the context of spectral theory, we will see later what the real importance of self-adjointness (and, more generally, closedness) is. It is time for some examples.

Proposition B.73. *Let H* = ℓ<sup>2</sup>(*X*)*, with X countable for simplicity, and for f* ∈ ℓ<sup>∞</sup>(*X*) *define the* multiplication operator *m<sub>f</sub>* : *H* → *H by*

$$m_f \psi = f \psi,\tag{B.238}$$

*i.e., m<sub>f</sub>*ψ(*x*) = *f*(*x*)ψ(*x*)*. Then m<sub>f</sub> is bounded, with norm, cf.* (A.107)*,*

$$\|m_f\| = \|f\|_\infty. \tag{B.239}$$

*More generally, let H* = *L*<sup>2</sup>(*X*) *for some* σ-finite *Borel space* (*X*,Σ,μ)*, and for f* ∈ *L*<sup>∞</sup>(*X*)*, define m<sub>f</sub> in the same way. Then m<sub>f</sub> is again bounded, with norm*

$$\|m_f\| = \|f\|_\infty^{\mathrm{ess}}. \tag{B.240}$$

*Finally, let f* : *X* → R *be measurable (but not necessarily essentially bounded). Then*

$$D(m_f) = \{ \psi \in L^2(X) \mid f\psi \in L^2(X) \} \tag{B.241}$$

*is dense in L*<sup>2</sup>(*X*)*, and if f*<sup>∗</sup> = *f, the operator m<sub>f</sub>* : *D*(*m<sub>f</sub>*) → *L*<sup>2</sup>(*X*) *is self-adjoint.*

*Proof.* On -<sup>2</sup>(*X*) we have *<sup>f</sup>*ψ<sup>2</sup> ≤ *<sup>f</sup>* ∞ψ2, and hence *mf* ≤ *<sup>f</sup>* ∞. Assume *f* = 0. Then *f* <sup>∞</sup> > 0, and for any 0 < *t* < *f* <sup>∞</sup> there is *xt* ∈ *X* such that | *f*(*xt*)| ≥ *t*, so that ψ*<sup>t</sup>* = 1{*xt*} ∈ -<sup>2</sup>(*X*) satisfies *mf*ψ*t*<sup>2</sup> <sup>=</sup> <sup>|</sup> *<sup>f</sup>*(*xt*)| ≥ *<sup>t</sup>*, whence *mf* ≥ *<sup>t</sup>*. This holds for all 0 < *t* < *f* ∞, hence *mf* ≥ *f* ∞, which yields (B.239).

To prove (B.240), again assume *<sup>f</sup>* ess <sup>∞</sup> <sup>&</sup>gt; 0 and 0 <sup>&</sup>lt; *<sup>t</sup>* <sup>&</sup>lt; *<sup>f</sup>* ess <sup>∞</sup> . Then the set *Xt* = {*x* ∈ *X*,| *f*(*x*)| ≥ *t*} is measurable, with μ(*Xt*) > 0. Since (*X*,Σ,μ) is σ-finite, there is *X <sup>t</sup>* ⊂ *Xt* with 0 < μ(*X <sup>t</sup>*) < ∞. Take ψ = 1*<sup>X</sup> t* , so that *f*ψ<sup>2</sup> ≥ *t*ψ2, etc.

To prove the density of *<sup>D</sup>*(*mf*), for *<sup>n</sup>* <sup>∈</sup> <sup>N</sup> define *<sup>X</sup>*˜*<sup>n</sup>* <sup>=</sup> {*<sup>x</sup>* <sup>∈</sup> *<sup>X</sup>* | | *<sup>f</sup>*(*x*)| ≤ *<sup>n</sup>*}, so that *<sup>X</sup>* <sup>=</sup> <sup>∪</sup>*nX*˜*n*. For each <sup>ψ</sup> <sup>∈</sup> *<sup>L</sup>*2(*X*) we then have 1*X*˜*n*<sup>ψ</sup> <sup>∈</sup> *<sup>D</sup>*(*mf*). Writing <sup>ϕ</sup>*<sup>n</sup>* <sup>=</sup> <sup>1</sup>*X*˜*n*ψ, we have ψ,ϕ*n* <sup>=</sup> *<sup>X</sup>*˜*<sup>n</sup> d*μ |ψ| 2, hence ψ,ϕ*n* <sup>=</sup> 0 iff <sup>ψ</sup> <sup>=</sup> <sup>0</sup> <sup>μ</sup>-a.e. on *<sup>X</sup>*˜*n*. This is true for each *n* ∈ N iff ψ = 0, so the required density follows from Proposition B.57.4

In the last claim (where $f^*(x) = \overline{f(x)}$), the domain $D(m_f^*)$ consists of all $\psi \in L^2(X)$ for which the map $\varphi \mapsto \int_X d\mu\, \overline{\psi} f \varphi$ is bounded; by Theorem B.66 this is the case iff $f\psi \in L^2(X)$, so that $D(m_f^*) = D(m_f)$. Moreover, (B.232) obviously holds for $a^* = m_f$ (if $f$ takes complex values, then $m_f^* = m_{f^*}$, still on $D(m_f^*) = D(m_f)$). □

For quantum mechanics, a key example is $H = L^2(\mathbb{R})$ with $f(x) = x$, i.e., the position operator. It then follows from Proposition B.73 that $x$ is self-adjoint on the domain

$$D(m_x) = \{ \psi \in L^2(\mathbb{R}) \mid \int_{\mathbb{R}} dx\, x^2 |\psi(x)|^2 < \infty \}. \tag{B.242}$$

See also §5.11. It happens often that a given operator on some domain is not closed as it stands, but can be made so by slightly enlarging its domain. Thus an operator $a : D(a) \to H$ is *closable* if the closure of the graph $G(a)$ in $H \oplus H$ is the graph of a closed operator $a^-$, called the *closure* of $a$, i.e., $G(a)^- = G(a^-)$. The following easy lemma is very useful in proving closability (the proof is a definition chase).

Lemma B.74. *Each of the following conditions is equivalent to closability of a:*


*The domain* $D(a^-)$ *of the closure* $a^-$ *of a closable operator* $a$ *consists of all* $\psi \in H$ *for which there exists a sequence* $(\psi_n)$ *in* $D(a)$ *such that* $\psi_n \to \psi$ *and* $a\psi_n$ *converges, so that* $a^-\psi = \lim_n a\psi_n$*. Finally, if* $a$ *is closable, then* $a^- = a^{**}$ *and* $(a^-)^* = a^*$*.*

An equality *a* = *b* between unbounded operators always stands for *D*(*a*) = *D*(*b*) and *a* = *b*. Furthermore, *a* ⊂ *b* means *D*(*a*) ⊆ *D*(*b*) and *b* = *a* on *D*(*a*).

Definition B.75. *Let a* : *D*(*a*) → *H (where D*(*a*) *is dense) be an operator.*


It follows from Lemma B.74 that a symmetric operator is closable (because $D(a^*)$, containing $D(a)$, is dense). For a symmetric operator one has $a \subseteq a^- = a^{**} \subseteq a^*$, with equality at the first position when $a$ is closed, and equality at the second position when $a$ is essentially self-adjoint; when both equalities hold, $a$ is self-adjoint. Conversely, an essentially self-adjoint operator is symmetric. A symmetric operator may or may not be essentially self-adjoint; we will not discuss this problem here.

As in the finite-dimensional case, the notion of the adjoint allows one to define a *projection* as an operator $e : H \to H$ that satisfies $e^2 = e^* = e$. However, Proposition A.8 should be slightly adapted in order to cover the infinite-dimensional case:

Proposition B.76. *There is a bijective correspondence e* ↔ *L between:*


*still given by* (A.27) *-* (A.28)*, where now* $\{\upsilon_i\}_{i\in I}$ *is a basis of* $L$*, and the latter sum must be applied to fixed* $\psi \in H$ *according to Definition B.6 with* $V = H$*, i.e.,*

$$e\psi = \sum_{i \in I} \langle \upsilon_i, \psi \rangle \upsilon_i, \quad \psi \in H. \tag{B.243}$$

Alternatively, without invoking the concept of a basis, one may use the decomposition (B.203) as proved via Lemma B.58, to define $e$ directly by $e\psi = \psi^{\parallel}$.

*Proof.* The linear subspace *L* = *eH* is closed, since *e* is bounded by Theorem B.68.

Conversely, note that since *L* is closed, it is a Hilbert space, so that it has a basis by Theorem B.63. The sum in (B.243) then converges by Lemma B.59, and since

$$
\begin{split}
\langle \varphi, e\psi \rangle &= \sum_{i\in I} \langle \upsilon_i, \psi \rangle \langle \varphi, \upsilon_i \rangle = \sum_{i\in I} \overline{\langle \upsilon_i, \varphi \rangle \langle \psi, \upsilon_i \rangle} = \overline{\langle \psi, e\varphi \rangle} = \langle e\varphi, \psi \rangle; \\
e^{2}\psi &= \sum_{i\in I} \langle \upsilon_i, \psi \rangle e\upsilon_i = \sum_{i,j\in I} \langle \upsilon_i, \psi \rangle \langle \upsilon_j, \upsilon_i \rangle \upsilon_j = \sum_{i\in I} \langle \upsilon_i, \psi \rangle \upsilon_i = e\psi, \\
\end{split}
$$

the operator $e$ is a projection (in the second computation we used boundedness of $e$ to pull it through the sum). Next, (B.243) is independent of the choice of a basis of $L$, since if $\{\upsilon'_{i'}\}_{i' \in I'}$ is another basis of $L$, for arbitrary $\varphi \in L$ we may compute:


$$
\begin{split}
\Big\langle \varphi, \sum_{i \in I} \langle \upsilon_i, \psi \rangle \upsilon_i - \sum_{i' \in I'} \langle \upsilon'_{i'}, \psi \rangle \upsilon'_{i'} \Big\rangle &= \sum_{i \in I} \langle \varphi, \upsilon_i \rangle \langle \upsilon_i, \psi \rangle - \sum_{i' \in I'} \langle \varphi, \upsilon'_{i'} \rangle \langle \upsilon'_{i'}, \psi \rangle \\ &= \langle \varphi, \psi \rangle - \langle \varphi, \psi \rangle = 0,\end{split} \tag{B.244}
$$

where we twice used (B.215), applied to the Hilbert space *L*. Hence

$$\sum_{i \in I} \langle \varphi, \upsilon_i \rangle \langle \upsilon_i, \psi \rangle = \sum_{i' \in I'} \langle \varphi, \upsilon'_{i'} \rangle \langle \upsilon'_{i'}, \psi \rangle. \tag{B.245}$$

Finally, we prove bijectivity of the correspondence *L* ↔ *e*:


$$
\begin{split}
\langle \varphi^{\parallel}, e\psi \rangle - \sum_{i} \langle \varphi^{\parallel}, \upsilon_i \rangle \langle \upsilon_i, \psi \rangle &= \langle \varphi^{\parallel}, \psi \rangle - \langle \varphi^{\parallel}, \psi \rangle = 0; \\
\langle \varphi^{\perp}, e\psi \rangle - \sum_{i} \langle \varphi^{\perp}, \upsilon_i \rangle \langle \upsilon_i, \psi \rangle &= \langle \varphi, (1 - e) e\psi \rangle - \sum_{i} \langle \varphi, (1 - e) \upsilon_i \rangle \langle \upsilon_i, \psi \rangle = 0, \end{split}
$$

where in the first line we used (B.215), applied to the Hilbert space $H$. □

It is easy to see why the sum (B.243) cannot, in general, converge in norm without the $\psi$, i.e., in the original (finite-dimensional) form (A.28); it suffices to take $e = 1$ (for $H = \ell^2(\mathbb{N})$, for simplicity). Writing $e_n = \sum_{i=1}^{n} |\upsilon_i\rangle\langle\upsilon_i|$, where, for example, $\upsilon_i = \delta_i$, for any unit vector $\psi$ and $m > n$, from (A.18) we have

$$\left\|e_{m} - e_{n}\right\|^{2} \ge \left\|(e_{m} - e_{n})\psi\right\|^{2} = \sum_{i=n+1}^{m} \left|\left\langle \upsilon_{i}, \psi \right\rangle\right|^{2}. \tag{B.246}$$

Taking $\psi = \upsilon_j$ for any $n + 1 \le j \le m$ shows that $\|e_m - e_n\|^2 \ge 1$ for all $m > n$, so that $(e_n)$ cannot be a Cauchy sequence in $B(H)$. This argument applies to any infinite-dimensional subspace $L$. Therefore, if $H$ is infinite-dimensional we should work with at least two notions of convergence within the Banach space $B(H)$ (cf. Theorem B.33), which for simplicity we state for sequences (more generally, one should *define* the corresponding topologies in terms of convergence of nets):


The strong topology on *B*(*H*) is also called the *strong operator topology*, in order to distinguish it from the strong topology on *H* itself (which, confusingly, is another name for the norm topology) in terms of which it is defined. Similarly, the weak topology on *H* (cf. §B.12) defines a *weak operator topology* on *B*(*H*), as follows:

• $a_n \to a$ *weakly* on $B(H)$ iff $\langle \varphi, (a_n - a)\psi \rangle \to 0$, for each $\varphi, \psi \in H$.

In decreasing strength we have 'norm - strong - weak', and we show that this trio is distinguishable on $H = \ell^2(\mathbb{N})$ (and hence on any infinite-dimensional Hilbert space):

• Let $a_n\psi(x) = 0$ for $x = 1,\ldots,n$ whilst $a_n\psi(x) = \psi(x)$ for $x > n$. In other words, if $\psi = (\psi_1,\psi_2,\ldots)$, then $a_n\psi = (0,\ldots,0,\psi_{n+1},\psi_{n+2},\ldots)$ with $n$ zeros. Hence

$$||a\_n\Psi||^2 = \sum\_{\mathbf{x}=n+1}^{\infty} |\Psi(\mathbf{x})|^2,$$

so that $\|a_n\psi\| \to 0$ as $n \to \infty$, because $\psi \in \ell^2(\mathbb{N})$. Thus $a_n \to 0$ strongly (and hence also weakly). If $(a_n)$ were to have a norm limit, it therefore would have to be zero, too, but since $\|a_n\| \ge \|a_n\psi\|$ for any unit vector $\psi$, taking e.g. $\psi = \delta_{n+1}$, we have $\|a_n\| \ge 1$ for any $n$ and hence $(a_n)$ cannot converge in norm.

• A slight variation on this example is $a_n\psi(x) = 0$ for $x = 1,\ldots,n$ (once again), but now $a_n\psi(x) = \psi(x - n)$ for $x > n$, or, equivalently, $a_n\psi = (0,\ldots,0,\psi_1,\psi_2,\ldots)$ with $n$ zeros. This time, we have $\|a_n\psi\| = \|\psi\|$, so to begin with, $a_n \to 0$ strongly is excluded. However, $\langle\varphi, a_n\psi\rangle = \sum_{x=1}^{\infty} \overline{\varphi(x+n)}\psi(x)$, so $\lim_{n\to\infty}\langle\varphi, a_n\psi\rangle = 0$: to see this, take $N < \infty$ fixed and use Cauchy–Schwarz to estimate

$$\begin{split} |\langle \varphi, a_{n} \psi \rangle| &\leq \left| \sum_{x=1}^{N} \overline{\varphi(x+n)} \psi(x) + \sum_{x=N+1}^{\infty} \overline{\varphi(x+n)} \psi(x) \right| \\ &\leq \|\psi\| \left( \sum_{x=n+1}^{\infty} |\varphi(x)|^{2} \right)^{1/2} + \|\varphi\| \left( \sum_{x=N+1}^{\infty} |\psi(x)|^{2} \right)^{1/2} . \end{split} \tag{B.247}$$

Letting $N \to \infty$ and then $n \to \infty$ yields $|\langle\varphi, a_n\psi\rangle| \to 0$, so that $a_n \to 0$ weakly. But $(a_n)$ has no strong limit (for if it existed, it would have to be zero, too).
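The two examples above can be checked numerically on a finite truncation of the sequence space. The following Python sketch is only an illustration under stated assumptions: sequences are cut off after `M` entries, the test vector `psi` (entries $1/x$ on its first `K` slots, zero afterwards) is our own choice, and the helper names `tail_projection` and `right_shift` are hypothetical labels for the two operators $a_n$ discussed in the text.

```python
import math

M = 4000   # truncation length (assumption: stands in for the infinite sequence space)
K = 1000   # psi is supported on the first K entries, so shifts by n <= M - K lose nothing

def norm(v):
    """l^2 norm of a finite sequence."""
    return math.sqrt(sum(abs(t) ** 2 for t in v))

def inner(phi, psi):
    """Inner product, conjugate-linear in the first entry."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

def delta(j):
    """Unit vector delta_j (1-based index)."""
    e = [0.0] * M
    e[j - 1] = 1.0
    return e

def tail_projection(n, psi):
    """First example: a_n psi = (0, ..., 0, psi_{n+1}, psi_{n+2}, ...)."""
    return [0.0] * n + list(psi[n:])

def right_shift(n, psi):
    """Second example: a_n psi = (0, ..., 0, psi_1, psi_2, ...), cut off after M entries."""
    return ([0.0] * n + list(psi))[:M]

psi = [1.0 / x for x in range(1, K + 1)] + [0.0] * (M - K)

for n in (10, 100, 500):
    strong = norm(tail_projection(n, psi))           # decreases towards 0 (strong convergence)
    lower = norm(tail_projection(n, delta(n + 1)))   # stays equal to 1 (no norm convergence)
    shifted = norm(right_shift(n, psi))              # stays equal to ||psi|| (no strong limit)
    weak = abs(inner(psi, right_shift(n, psi)))      # decreases towards 0 (weak convergence)
    print(n, strong, lower, shifted, weak)
```

A finite truncation can of course only suggest the limiting behaviour; the proofs in the text are what establish it.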

It is clear from Theorem B.33 that *B*(*H*) is *sequentially complete* in its norm topology. This is true also in the weak and strong operator topologies:

Proposition B.77. *Let* $(a_n)$ *be a sequence in* $B(H)$*.*


It is instructive to prove this, using two results of independent interest.

Theorem B.78. *Suppose* $V$ *is a Banach space,* $W$ *is a normed space (not necessarily complete),* $X$ *is an arbitrary set, and* $\{a_x\}_{x\in X}$ *is some family of operators in* $B(V,W)$ *indexed by* $X$*. If the family is* pointwise *bounded in that*

$$\sup\{\|a_x v\|, x \in X\} < \infty \ (v \in V),\tag{B.248}$$

*then the family is* uniformly *bounded in that*

$$\sup\{||a\_{\boldsymbol{x}}||, \boldsymbol{x} \in X\} < \infty. \tag{B.249}$$

This is the *Principle of Uniform Boundedness* or *Banach–Steinhaus Theorem*.

*Proof.* If $W$ is not complete, use its completion in what follows. Define $\ell^\infty(X,W)$ to be the set of all bounded functions $f : X \to W$, i.e., those functions such that $\sup\{\|f(x)\|, x \in X\} < \infty$, with pointwise operations. This is easily checked to be a Banach space itself in the natural norm $\|f\|_\infty = \sup\{\|f(x)\|, x \in X\}$ (using the auxiliary functions $\tilde{f} : X \to \mathbb{C}$ defined by each $f \in \ell^\infty(X,W)$ as $\tilde{f}(x) = \|f(x)\|$, so that $\|f\|_\infty = \|\tilde{f}\|_\infty$, one may largely reduce the proof to the ordinary $\ell^\infty(X)$ case).

For fixed $v \in V$, define $f_v : X \to W$ by $f_v(x) = a_x(v)$. By assumption, $f_v \in \ell^\infty(X,W)$, so we may define an operator $F : V \to \ell^\infty(X,W)$ by $F(v) = f_v$. We now show that the graph $G(F)$ is closed: if $v_n \to v$ in $V$ and $Fv_n \to g$ in $\ell^\infty(X,W)$, then since uniform convergence implies pointwise convergence, for each $x \in X$ we have

$$g(x) = \lim_{n} (Fv_n)(x) = \lim_{n} f_{v_n}(x) = \lim_{n} a_x v_n = a_x \lim_{n} v_n = a_x v = f_v(x) = (Fv)(x).$$

Thus $g = Fv$, and hence $G(F)$ is closed. By Theorem B.37, $F$ is bounded, so that:

$$\begin{aligned} \|F\| &= \sup\{ \|f_v\|_{\infty}, v \in V, \|v\| = 1 \} = \sup\{ \|a_x v\|, v \in V, \|v\| = 1, x \in X \} \\ &= \sup\{ \|a_x\|, x \in X \} < \infty. \end{aligned}$$

This gives part 1 of Proposition B.77: since $\lim_n a_n\psi$ exists, $\sup_n\{\|a_n\psi\|\} < \infty$ for each $\psi$, hence $\sup_n\{\|a_n\|\} < \infty$. Since $a_n\psi \to a\psi$ implies $\|a_n\psi\| \to \|a\psi\|$, cf. (B.5),

$$||a\Psi|| = \lim\_{n} ||a\_{n}\Psi|| \le \lim\_{n} ||a\_{n}|| ||\Psi|| \le \sup\_{n} \{||a\_{n}||\} ||\Psi||,\tag{B.250}$$

so taking the supremum over all unit vectors $\psi$ gives $\|a\| < \infty$.

As to the second part, suppose $a_n \to a$ weakly. Since $(\langle\varphi, a_n\psi\rangle)$ converges for $\varphi, \psi \in H$, we have $\sup_n\{|\langle\varphi, a_n\psi\rangle|\} < \infty$. Using (B.222), this is the same as $\sup_n\{|f_{a_n\psi}(\varphi)|\} < \infty$ for each $\varphi \in H$, so using Banach–Steinhaus with $V = H^*$, $X = \mathbb{N}$, and $a_x = f_{a_n\psi}$, we find $\sup_n\{\|f_{a_n\psi}\|_{H^*}\} < \infty$. By Theorem B.66, this gives $\sup_n\{\|a_n\psi\|\} < \infty$, and hence, via a second application of Theorem B.78, $\sup_n\{\|a_n\|\} < \infty$, or $\|a_n\| < C < \infty$ for all $n$, as in the case of strong limits.

This time we have to do a little more work to construct the limit operator $a$. This requires a second lemma, which generalizes Proposition A.23 to general Hilbert spaces. To this effect, we say that a sesquilinear form $B : H \times H \to \mathbb{C}$ is *bounded* if there is a finite constant $C$ such that $|B(\varphi,\psi)| \le C\|\varphi\|\|\psi\|$ for all $\varphi,\psi \in H$.

Proposition B.79. *The relation* $B(\varphi,\psi) = \langle\varphi, a\psi\rangle$ *provides a bijective correspondence between* bounded *(hermitian/positive) sesquilinear forms and* bounded *(self-adjoint/positive) operators* $a \in B(H)$*, cf. Proposition A.22.1.*

Like Proposition A.23, this is a trivial consequence of Theorem B.66.

To finish the proof of Proposition B.77.2, define $B(\varphi,\psi) = \lim_n \langle\varphi, a_n\psi\rangle$, so

$$|B(\varphi, \psi)| \le \limsup_{n} \|a_n\| \|\varphi\| \|\psi\| \le \sup_n \|a_n\| \|\varphi\| \|\psi\| \le C \|\varphi\| \|\psi\|. \tag{B.251}$$

Hence $B$ is bounded, and Proposition B.79 gives the weak limit $a \in B(H)$. □

#### B.14 Basic spectral theory

In linear algebra, which in our context means the theory of operators on finite-dimensional Hilbert spaces $H$, the spectrum $\sigma(a)$ of an operator (i.e., a linear map) $a : H \to H$ was defined as the set of eigenvalues of $a$. This led to the Spectral Theorems A.10 and A.15. However, as soon as $\dim(H) = \infty$, simple examples show that even bounded operators may have no eigenvectors (and hence no eigenvalues) at all. For example, take $H = L^2(0,1)$ and $f(x) = x$, with associated (bounded) multiplication operator $a = m_f \equiv m_x$, cf. (B.238); this is just a bounded version of the position operator of quantum mechanics. Then the eigenvalue equation $a_x\psi = \lambda\psi$ implies $\int_0^1 dx\, |x - \lambda|^2 |\psi(x)|^2 = 0$, which holds iff $|x - \lambda||\psi(x)| = 0$ a.e. Since $|x - \lambda|$ is nonzero a.e. for any $\lambda \in \mathbb{C}$, this implies $\psi(x) = 0$ a.e. and hence $\psi = 0$ in $L^2(0,1)$. More generally, taking $H = L^2(\mathbb{R}^d)$ and $f \in C_b(\mathbb{R}^d)$, a similar argument shows that the multiplication operator $m_f$ has eigenvalue $\lambda \in \mathbb{C}$ if and only if the equality $f(x) = \lambda$ holds on a set of positive (Lebesgue) measure. Therefore, if $f$ varies sufficiently, then $m_f$ has no eigenvalues at all (e.g., in $d = 1$, $f \in C^{(1)}([0,1])$ with $f'(x) \neq 0$ a.e.).

Even amidst his magnificent *oeuvre*, covering most of mathematics, it was one of Hilbert's most prophetic insights that finite-dimensional spectral theory could not merely be rescued, but also greatly enriched, by defining the spectrum as follows:

Definition B.80. *Let H be a Hilbert space. The* spectrum σ(*a*) *of a* ∈ *B*(*H*) *consists of all* λ ∈ C *for which the operator a*−λ : *H* → *H is* not *bijective. The complement*

$$\rho(a) = \mathbb{C} \backslash \sigma(a) \tag{B.252}$$

*of the spectrum in* C *is called the* resolvent *of a, i.e., z* ∈ ρ(*a*) *iff a*−*z is invertible.*

Here $a - \lambda \equiv a - \lambda \cdot 1_H$, where $1_H$ is the unit operator on $H$, and by 'bijective' and 'invertible' we *a priori* mean: injective and surjective. This set-theoretic notion of invertibility is considerably strengthened by Corollary B.35, according to which the set-theoretic inverse of $a - \lambda : H \to H$, if it exists for $a \in B(H)$, is automatically in $B(H)$. Consequently, we may equivalently say that $\lambda \in \sigma(a)$ if $a - \lambda$ is not invertible *in* $B(H)$. This means that if $z \in \rho(a)$, then the equation $(a - z)\psi = \varphi$ for $\psi \in H$:


Thus Definition B.80 becomes a special case of the following purely algebraic idea:

Definition B.81. *Let A be a (complex) algebra with unit. The* spectrum σ(*a*) *of a* ∈ *A consists of all* λ ∈ C *for which the operator a*−λ *is* not *invertible in A.*

The notation (B.252) also extends to this case. This generalization is especially powerful when $A$ is a Banach algebra and, in particular, a C\*-algebra, cf. Definition C.1. The latter case actually incorporates Definition B.80:

Proposition B.82. *For any Hilbert space* $H$*, the set* $B(H)$ *of all bounded operators on* $H$ *is a C\*-algebra with unit in the operator norm* (A.18)*.*

The proof of Proposition A.7 goes through unchanged. In a different direction:

Proposition B.83. *Let A* = *C*(*X*)*, where X is a compact Hausdorff space. Then*

$$
\sigma(f) = \text{ran}(f). \tag{\text{B.253}}
$$

*Proof.* Since multiplication in $C(X)$ is pointwise, if $f - \lambda \cdot 1_X$ has an inverse, it must be $1/(f - \lambda \cdot 1_X)$. This function exists (and is continuous) iff $\lambda \notin \mathrm{ran}(f)$. □

Theorem B.84. *Let A* = *B*(*H*) *or, more generally, a unital C\*-algebra, or, even more generally, a Banach algebra with unit* 1*<sup>A</sup> (cf. Definition C.1). Then the spectrum* σ(*a*) *of any a* ∈ *A is a nonempty compact subset of* C*.*

*Furthermore, defining the* spectral radius *of a* ∈ *A by*

$$r(a) = \sup\{ |\lambda|, \lambda \in \sigma(a) \},\tag{B.254}$$

*for general unital Banach algebras we have*

$$r(a) \le \|a\|,\tag{B.255}$$

*as well as Gelfand's* spectral radius formula

$$r(a) = \lim\_{n \to \infty} \left\| a^n \right\|^{1/n}. \tag{B.256}$$

*If* $a \in A_{\mathrm{sa}}$ *is a self-adjoint element of a unital C\*-algebra, such as* $A = B(H)$*, then*

$$r(a) = ||a|| \ \ (a^\* = a). \tag{B.257}$$

*Proof.* The claim about the spectrum obviously follows from the following facts:


Eq. (B.255) is equivalent to the implication $|\lambda| > \|a\| \Rightarrow \lambda \in \rho(a)$. For $\lambda \neq 0$ we have $(a - \lambda) = \lambda((a/\lambda) - 1)$, so, rescaling $a$ if necessary, we only need to show that if $\|a\| < 1$, then $1 \in \rho(a)$. Indeed, in that case the geometric series $\sum_k a^k$ for $a$ converges absolutely and hence ($A$ being a Banach space) converges, with

$$\sum_{k=0}^{\infty} a^k = (1 - a)^{-1};\tag{B.258}$$

the proof is virtually the same as for complex numbers. Thus 1 ∈ ρ(*a*).
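As a quick numerical sanity check of (B.258), the sketch below sums the geometric series for a small matrix and multiplies the result by $1 - a$. It is only an illustration: the $2 \times 2$ matrix `a`, the number of terms, and the use of the maximum row sum as a submultiplicative norm on matrices are our own assumptions, not part of the text.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def identity(n):
    """The n x n unit matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_partial_sum(a, terms):
    """Return the partial sum 1 + a + a^2 + ... + a^terms."""
    n = len(a)
    total = identity(n)
    power = identity(n)
    for _ in range(terms):
        power = matmul(power, a)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

def max_row_sum(a):
    """Operator norm for the sup norm on C^n: submultiplicative, with norm(1) = 1."""
    return max(sum(abs(x) for x in row) for row in a)

a = [[0.2, 0.5], [0.1, 0.3]]   # illustrative matrix whose max_row_sum is below 1
one_minus_a = [[(1.0 if i == j else 0.0) - a[i][j] for j in range(2)] for i in range(2)]
approx_inverse = neumann_partial_sum(a, 60)
residual = matmul(one_minus_a, approx_inverse)   # should be close to the unit matrix
print(max_row_sum(a), residual)
```

The residual equals $1 - a^{61}$ up to rounding, so its distance to the unit matrix is bounded by the 61st power of the norm of `a`.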

Fact 2 is equivalent to the set $A^*$ of invertible elements in $A$ being open in $A$. Indeed, for given $a \in A^*$, take a $b \in A$ for which $\|b\| < \|a^{-1}\|^{-1}$. This implies

$$\|a^{-1}b\| \le \|a^{-1}\|\,\|b\| < 1. \tag{B.259}$$

Hence by (B.258), applied to $-a^{-1}b$ (whose norm is less than 1), the operator $a + b = a(1 + a^{-1}b)$ has an inverse, namely $(1 + a^{-1}b)^{-1}a^{-1}$. Taking $\varepsilon \le \|a^{-1}\|^{-1}$, it follows that all $c \in A$ for which $\|a - c\| < \varepsilon$ lie in $A^*$ (which is therefore an open subset of the metric space $A$).

For the third claim, take *a* ∈ *A* and define *f* : C → *A* by *f*(*z*) = *z*−*a*. Since

$$\|f(z+\delta) - f(z)\| = |\delta|,$$

we see that $f$ is continuous (take $\delta = \varepsilon$ in the definition of continuity). By part 2 of the proof, $f^{-1}(A^*)$ is open in $\mathbb{C}$. But $f^{-1}(A^*)$ is the set of all $z \in \mathbb{C}$ where $z - a$ has an inverse, so that $f^{-1}(A^*) = \rho(a)$. This set being open, its complement $\sigma(a)$ is closed. Now define

$$g : \rho(a) \to A;\tag{B.260}$$

$$z \mapsto (z - a)^{-1}. \tag{B.261}$$

For fixed $z_0 \in \rho(a)$, choose $z \in \mathbb{C}$ such that $|z - z_0| < \|(a - z_0)^{-1}\|^{-1}$. From part 2 of the proof, with $a$ replaced by $a - z_0$ and $c$ replaced by $a - z$, we see that $z \in \rho(a)$, as $\|a - z_0 - (a - z)\| = |z - z_0|$. Moreover, because

$$\|(z_0 - z)(z_0 - a)^{-1}\| = |z_0 - z|\, \|(z_0 - a)^{-1}\| < 1,\tag{B.262}$$

the power series

$$\frac{1}{z\_0 - a} \sum\_{k=0}^{n} \left( \frac{z\_0 - z}{z\_0 - a} \right)^k \tag{B.263}$$

is absolutely convergent and hence convergent for *n* → ∞. By (B.258), the limit *n* → ∞ of this power series is

$$\frac{1}{z_0 - a} \sum_{k=0}^{\infty} \left( \frac{z_0 - z}{z_0 - a} \right)^k = \frac{1}{z_0 - a} \left( 1 - \left( \frac{z_0 - z}{z_0 - a} \right) \right)^{-1} = \frac{1}{z - a} = g(z). \tag{B.264}$$

Hence

$$g(z) = \sum_{k=0}^{\infty} (z_0 - z)^k (z_0 - a)^{-k-1} \tag{B.265}$$

is a norm-convergent power series. For $z \neq 0$ we write $\|g(z)\| = |z|^{-1}\|(1_A - a/z)^{-1}\|$ and observe that $\lim_{z\to\infty} (1_A - a/z) = 1_A$, since $\lim_{z\to\infty} \|a/z\| = 0$. Hence we obtain $\lim_{z\to\infty}(1_A - a/z)^{-1} = 1_A$, and

$$\lim\_{z \to \infty} \|g(z)\| = 0.\tag{B.266}$$

Let $\varphi \in A^*$ (the dual of $A$); since $\varphi$ is bounded, eq. (B.265) implies that the function $g_\varphi : z \mapsto \varphi(g(z))$ is given by a convergent power series (i.e. is analytic), and (B.266) implies

$$\lim_{z \to \infty} g_{\varphi}(z) = 0.\tag{B.267}$$

Now suppose that $\sigma(a) = \emptyset$, so that $\rho(a) = \mathbb{C}$. The function $g$, and hence $g_\varphi$, is then defined on $\mathbb{C}$, where it is analytic and vanishes at infinity. In particular, $g_\varphi$ is bounded, so that by Liouville's Theorem of elementary complex analysis it must be constant. By (B.267) this constant is zero, so that $g = 0$ by Corollary B.45. This is absurd, so that $\rho(a) \neq \mathbb{C}$, and hence $\sigma(a) \neq \emptyset$.

We now prove the spectral radius formula (B.256). For $|z| > \|a\|$ the function $g$, defined in (B.260) - (B.261), has a norm-convergent power series

$$g(z) = \frac{1}{z} \sum_{k=0}^{\infty} \left(\frac{a}{z}\right)^k. \tag{B.268}$$

On the other hand, we have seen that for any $z \in \rho(a)$ one may find a $z_0 \in \rho(a)$ such that the power series (B.265) converges (i.e. in norm). If $|z| > r(a)$ then $z \in \rho(a)$, so (B.265) converges for $|z| > r(a)$, uniformly in $z$. Therefore (by the theory of analytic functions taking values in Banach spaces), eq. (B.268) is norm-convergent for $|z| > r(a)$, too, which in turn implies that $\|a^n\|/|z|^n < 1$ for large enough $n$ (proof by contradiction). Since this is true for all $z$ for which $|z| > r(a)$, we must have

$$\limsup\_{n \to \infty} \|a^n\|^{1/n} \le r(a). \tag{B.269}$$

To derive a second inequality towards (B.256), we use the *spectral mapping property* for polynomials, which states that for any (complex) polynomial *p* on C,

$$\sigma(p(a)) = p(\sigma(a)) \equiv \{ p(\lambda) \mid \lambda \in \sigma(a) \}. \tag{B.270}$$

Given some polynomial *p* of degree *n* (in a variable *z*) and some fixed λ ∈ C, let

$$q(z) = p(z) - \lambda = c\_0 \prod\_{k=1}^{n} (z - c\_k), \tag{B.271}$$

for some $c_0, \ldots, c_n \in \mathbb{C}$. Hence by (A.53) - (A.55), we have $q(a) = c_0 \prod_{k=1}^{n} (a - c_k)$. Now a product $b = b_0 \cdots b_n$ of commuting operators is invertible iff each factor $b_k$ is invertible (in which case $b^{-1} = b_n^{-1} \cdots b_0^{-1}$), so $\lambda \in \sigma(p(a))$ iff some $c_k \in \sigma(a)$ (where $k > 0$, as $c_0 \neq 0$). Since the $c_k$ are precisely the solutions of $q(c_k) = 0$, i.e., of $\lambda = p(c_k)$, this proves (B.270).
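In finite dimension the spectrum is the set of eigenvalues, so the spectral mapping property (B.270) can be checked by hand or by machine. The following sketch does so for one $2 \times 2$ matrix and one quadratic polynomial; both are arbitrary illustrative choices, and the helper names are hypothetical.

```python
import cmath

def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def eigenvalues_2x2(m):
    """Both roots of the characteristic polynomial z^2 - tr(m) z + det(m)."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + root) / 2, (tr - root) / 2]

def poly_of_matrix(coeffs, m):
    """Evaluate p(m) = coeffs[0] + coeffs[1] m + ... by Horner's scheme."""
    result = [[0.0, 0.0], [0.0, 0.0]]
    for c in reversed(coeffs):
        result = matmul(result, m)
        result[0][0] += c
        result[1][1] += c
    return result

def poly_of_number(coeffs, z):
    """Evaluate p(z) for a complex number z by Horner's scheme."""
    value = 0
    for c in reversed(coeffs):
        value = value * z + c
    return value

coeffs = [2.0, -3.0, 1.0]          # p(z) = z^2 - 3 z + 2 (illustrative choice)
a = [[1.0, 2.0], [3.0, 4.0]]       # illustrative matrix
key = lambda z: (round(z.real, 9), round(z.imag, 9))
lhs = sorted(eigenvalues_2x2(poly_of_matrix(coeffs, a)), key=key)                # sigma(p(a))
rhs = sorted((poly_of_number(coeffs, z) for z in eigenvalues_2x2(a)), key=key)   # p(sigma(a))
print(lhs, rhs)
```

The two sorted lists agree up to rounding, as (B.270) predicts.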

To conclude the proof of (B.256), we note that since $\sigma(a)$ is closed and bounded, there is $\lambda \in \sigma(a)$ for which $|\lambda| = r(a)$. Since $\lambda^m \in \sigma(a^m)$ by (B.270), one has $|\lambda^m| \le \|a^m\|$ by (B.255). Hence $\|a^m\|^{1/m} \ge |\lambda| = r(a)$. Combining this with (B.269) yields

$$\limsup\_{n \to \infty} \|a^n\|^{1/n} \le r(a) \le \|a^m\|^{1/m} \text{ ( $m \in \mathbb{N}$ )}.\tag{B.272}$$

Hence the limit must exist, and $\lim_{n\to\infty} \|a^n\|^{1/n} = \inf_m \|a^m\|^{1/m} = r(a)$, i.e., (B.256).

Finally, given axiom (C.2) for C\*-algebras (which include $B(H)$ by Proposition A.7 and Theorem B.33), eq. (B.257) follows from (B.256): for self-adjoint $a$, eq. (C.2) reads $\|a^2\| = \|a\|^2$, and hence $\|a^{2^k}\| = \|a\|^{2^k}$ by induction, so if we take the limit in (B.256) along the subsequence $n = 2^k$ (as we are entitled to, given convergence), we obtain (B.257). □
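The statements (B.255) - (B.257) can be illustrated with real $2 \times 2$ matrices, for which both the operator norm and the eigenvalues have closed forms. The sketch below is an illustration only: the Jordan-type matrix `jordan` and the symmetric matrix `sym` are our own examples, and `gelfand` is a hypothetical helper that evaluates $\|a^n\|^{1/n}$.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def op_norm(a):
    """Operator norm of a real 2x2 matrix: square root of the largest eigenvalue of a^T a."""
    t = sum(x * x for row in a for x in row)              # trace of a^T a
    d = (a[0][0] * a[1][1] - a[0][1] * a[1][0]) ** 2      # det of a^T a, i.e. det(a)^2
    return math.sqrt((t + math.sqrt(max(t * t - 4 * d, 0.0))) / 2)

def spectral_radius(a):
    """Largest modulus of the two eigenvalues of a 2x2 matrix."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return max(abs((tr + root) / 2), abs((tr - root) / 2))

def gelfand(a, n):
    """Return ||a^n||^(1/n) for n >= 1."""
    power = a
    for _ in range(n - 1):
        power = matmul(power, a)
    return op_norm(power) ** (1.0 / n)

jordan = [[0.5, 1.0], [0.0, 0.5]]   # not normal: r(a) = 0.5 but ||a|| is larger
sym = [[2.0, 1.0], [1.0, -1.0]]     # self-adjoint: r(a) = ||a||

print(spectral_radius(jordan), op_norm(jordan))
print([gelfand(jordan, n) for n in (1, 20, 200)])   # slowly decreasing towards r(a)
print(spectral_radius(sym), op_norm(sym))
```

For the non-normal example the inequality (B.255) is strict, and the sequence $\|a^n\|^{1/n}$ approaches $r(a)$ from above, in line with (B.272).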

We may also generalize Definition B.80 in a different direction, where we allow $a : D(a) \to H$ to be unbounded. In that case, there is room for some ambiguity, as a possible set-theoretic inverse of $a - z$, if it exists as a (necessarily linear) map $(a - z)^{-1} : H \to D(a)$, is no longer guaranteed to be bounded. By the argument preceding Definition B.81 this would, of course, be desirable, which motivates:

Definition B.85. *Let H be a Hilbert space, and let a* : *D*(*a*) → *H be a possibly unbounded operator (always by definition with dense domain).*


This provides further motivation for requiring an unbounded operator to be closed:

Proposition B.86. *Let a* : *D*(*a*) → *H be a possibly unbounded operator.*


*Proof.* The graph $G(a^{-1})$ in $H \oplus H$ is the image of $G(a)$ under the linear homeomorphism $(\psi_1,\psi_2) \mapsto (\psi_2,\psi_1)$, hence if $a$ is closed, then $a^{-1}$ is closed and hence bounded (cf. Theorem B.37). Similarly, if $G(a)$ is not closed, then $G(a^{-1})$ cannot be closed either, and hence $a^{-1}$ cannot be bounded. Likewise with $a$ replaced by $a - z$. □

Thus spectral theory always deals with *closed* operators *a*, like self-adjoint ones.

We now show that Definition B.80 is compatible with our earlier §A.4.

Proposition B.87. *Let V be a finite-dimensional vector space and let a* : *V* → *V be a linear map. Then a is injective iff it is surjective.*

*Proof.* This follows from the elementary fact that for any linear map $a : V \to W$ one has $\mathrm{ran}(a) \cong V/\ker(a)$. Now if $V = W$ is finite-dimensional one has $V \cong \mathbb{C}^n$ (on choice of a basis), and one may simply count dimensions to infer that

dim(ran(*a*)) = *n*−dim(ker(*a*)).

Surjectivity of $a$ then yields injectivity and *vice versa*: we have $\dim(\mathrm{ran}(a)) = n$ iff $\dim(\ker(a)) = 0$ iff $\ker(a) = 0$. □

Note that this proposition yields the very simplest case of the *Atiyah–Singer index theorem*, for which these mathematicians received the Abel Prize for 2004. We define the index of a linear map $a : V \to W$ as

$$\text{index}(a) = \text{dim}(\text{ker}(a)) - \text{dim}(\text{coker}(a)), \tag{B.273}$$

where $\mathrm{coker}(a) = W/\mathrm{ran}(a)$, *provided both quantities are finite*. If $V$ and $W$ are finite-dimensional, Proposition B.87 yields the *baby index theorem*

$$\text{index}(a) = \dim(V) - \dim(W). \tag{B.274}$$

In particular, if *V* = *W*, then index(*a*) = 0 for any linear map *a* (in general, the index theorem expresses the index of an operator in terms of topological data; in this simple situation the only such data are the dimensions of *V* and *W*).
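The baby index theorem (B.274) says that the index does not depend on the map at all, only on the two dimensions. A minimal sketch of this, assuming that a linear map is represented by a matrix whose columns correspond to a basis of $V$ and whose rows correspond to a basis of $W$, computes kernel and cokernel dimensions from the rank; the function names `rank` and `index` are our own.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix (list of rows) by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def index(matrix):
    """dim ker(a) - dim coker(a) for the map a : V -> W given by the matrix."""
    dim_w = len(matrix)        # number of rows
    dim_v = len(matrix[0])     # number of columns
    rk = rank(matrix)
    dim_ker = dim_v - rk       # rank-nullity
    dim_coker = dim_w - rk     # W / ran(a)
    return dim_ker - dim_coker

print(index([[1, 2, 3, 4], [2, 4, 6, 8]]))   # dim V = 4, dim W = 2
print(index([[1, 0], [0, 1], [1, 1]]))       # dim V = 2, dim W = 3
print(index([[0, 0], [0, 0]]))               # V = W
```

Changing the entries changes the rank, and with it both the kernel and the cokernel, but their difference stays fixed at $\dim(V) - \dim(W)$.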

Corollary B.88. *If a is an operator on a finite-dimensional Hilbert space, then the spectrum* σ(*a*) *of a is the set of its eigenvalues.*

*Proof.* It immediately follows from Proposition B.87 that $a - z$ is invertible iff $z$ is *not* an eigenvalue of $a$. □

Returning to Definition B.80, we see that if $\lambda$ is an eigenvalue of $a$ (in that, as in finite dimension, there exists a nonzero vector $\psi \in H$ for which $a\psi = \lambda\psi$), then $\lambda \in \sigma(a)$ (for $a - \lambda$ is not even injective, let alone invertible). Thus we may define:


$$
\sigma\_c(a) = \sigma(a) \backslash \sigma\_p(a). \tag{B.275}
$$

If $\sigma(a) = \sigma_p(a)$, we call $\sigma(a)$ *discrete*. The example at the beginning of this section shows the opposite case, viz. $\sigma_p(a_x) = \emptyset$ and $\sigma_c(a_x) = [0,1]$. This follows from:

Proposition B.89. *Let H* = *L*2(*X*,Σ,μ) *for some* σ*-finite Borel space* (*X*,Σ,μ) *such that* μ(*A*) > 0 *for each open A* ⊂ *X, and let f* ∈ *C*(*X*)*. Then*

$$
\sigma(m\_f) = \text{ran}(f)^-.\tag{B.276}
$$

*Cf. Proposition B.73. More generally, let f* : *X* → C *be (Borel) measurable. Then*

$$\sigma(m\_f) = \text{ess-ran}(f),\tag{B.277}$$

*where the* essential range $\text{ess-ran}(f)$ *of* $f$ *consists of all* $z \in \mathbb{C}$ *such that*

$$\forall \varepsilon > 0: \mu(\{x \in X : |f(x) - z| < \varepsilon\}) > 0. \tag{B.278}$$

*Proof.* The second claim implies the first, for ess-ran(*f*) = ran(*f*)<sup>−</sup> if *f* ∈ *C*(*X*).

To prove the second claim, we use the functions $\varphi_n = 1_{\tilde{X}_n}\psi$ from the proof of Proposition B.73, where $\psi \in H$ is arbitrary. If $0 \notin \sigma(m_f)$, then $m_f$ is invertible, so there is $b \in B(H)$ such that $f b\varphi_n = \varphi_n$. This implies that $f(x) \neq 0$ a.e. on $\tilde{X}_n$, with $b\varphi_n = m_{1/f}\varphi_n$. Because $n \in \mathbb{N}$ is also arbitrary and $X = \cup_n \tilde{X}_n$, this gives $f(x) \neq 0$ a.e. on $X$, and since the linear span of the $\varphi_n$ is dense in $H$, we obtain $b = m_{1/f}$, provided $b = m_f^{-1}$ exists (which should not surprise us, for $m_f m_g = m_{fg}$). From (B.240), with $f$ replaced by $1/f$, we then obtain $\|1/f\|_\infty^{\mathrm{ess}} = \|m_{1/f}\| < \infty$ (from $0 \in \rho(m_f)$).

The point is that $\|1/f\|_\infty^{\mathrm{ess}} < \infty$ iff there is $\varepsilon > 0$ such that $|f(x)| \geq \varepsilon$ almost everywhere, i.e., $\mu(\{x \in X : |f(x)| < \varepsilon\}) = 0$. The negation of this condition states that $\forall \varepsilon > 0 : \mu(\{x \in X : |f(x)| < \varepsilon\}) > 0$, that is, $0 \in \text{ess-ran}(f)$. Therefore, we have shown that $0 \in \sigma(m_f)$ iff $0 \in \text{ess-ran}(f)$; if $f \in C(X)$, this is the same as $0 \in \mathrm{ran}(f)^-$.

To finish, note that $m_f - \lambda \cdot 1_H = m_{f-\lambda}$, where $f - \lambda$ is the function $x \mapsto f(x) - \lambda$. This gives $\lambda \in \sigma(m_f)$ iff $0 \in \sigma(m_{f-\lambda})$, which is true iff $\lambda \in \text{ess-ran}(f)$. □

Corollary B.90. *If* $\mu(\{x \in X : f(x) = \lambda\}) = 0$ *for all* $\lambda \in \mathbb{C}$*, then* $\sigma_p(m_f) = \emptyset$*.*
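
To see how the essential range of (B.278) can differ from the closure of the range when *f* is not continuous, a small worked example may help. The choice of *X*, μ and *f* below is an assumption made purely for illustration: take *X* = [0,1] with Lebesgue measure μ, and let *f* agree with the identity except at a single point.

$$
f(x) = \begin{cases} x & (x \neq 1/2); \\ 5 & (x = 1/2). \end{cases}
$$

$$
\mu(\{x : |f(x) - 5| < \varepsilon\}) = \mu(\{1/2\}) = 0 \quad (0 < \varepsilon \le 4); \qquad \mu(\{x : |f(x) - z| < \varepsilon\}) > 0 \quad (z \in [0,1],\ \varepsilon > 0).
$$

Hence 5 lies in ran(*f*) but not in ess-ran(*f*), and one finds ess-ran(*f*) = [0,1] = σ(*mf*). This is as it should be, since *f* equals the identity function almost everywhere, so that *mf* is just the position operator on *L*2(0,1).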

Thus the combination σ*p*(*a*) = ∅ and σ*c*(*a*) ≠ ∅, which is the opposite of the finite-dimensional situation, is very well possible. To shed further light on the still somewhat mysterious idea of a continuous spectrum, we now present Weyl's theory of the spectrum. We say that a possibly unbounded operator *a* : *D*(*a*) → *H* is *normal* when *D*(*a*∗) = *D*(*a*) and ‖*a*∗ψ‖ = ‖*a*ψ‖ for each ψ ∈ *D*(*a*); if *a* is bounded, this is equivalent to the familiar definition *a*∗*a* = *aa*∗. Self-adjoint operators are normal.

Theorem B.91. *Let a* : *D*(*a*) → *H be normal. Then* λ ∈ σ(*a*) *iff there exists a sequence* (ψ*n*) *of unit vectors in D*(*a*) *such that*

$$\lim_{n \to \infty} \| (a - \lambda) \psi_n \| = 0. \tag{B.279}$$

Of course, this is useful only as a new characterization of λ ∈ σ*c*(*a*); if λ ∈ σ*p*(*a*) one may simply take ψ*<sup>n</sup>* = ψ for all *n*, where *a*ψ = λψ. For a simple example, take

$$H = L^2(\mathbb{R});\tag{B.280}$$

$$a = m_f \ (f \in C(\mathbb{R}));\tag{B.281}$$

$$\lambda = f(x_0) \ (x_0 \in \mathbb{R}), \tag{B.282}$$

so that λ ∈ ran(*f*) ⊂ σ*c*(*mf*) = σ(*mf*), and

$$\psi_n(x) = (n/\pi)^{1/4} e^{-n(x-x_0)^2/2}. \tag{B.283}$$

Then ‖ψ*n*‖ = 1 and lim*n* ‖(*mf* − λ)ψ*n*‖ = 0, although (ψ*n*) has no limit in *L*2(R).
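
As a numerical sanity check of this example, the following sketch evaluates both norms by midpoint quadrature. The choice *f*(*x*) = *x* with *x*0 = 0, the helper name `weyl_norms`, and its truncation and step parameters are assumptions made here for illustration. For this particular *f* the residual equals (2*n*)^(−1/2) exactly, because |ψ*n*|² is a Gaussian density with variance 1/(2*n*).

```python
import math

def weyl_norms(n, x0=0.0, f=lambda x: x, half_width=12.0, steps=20000):
    """Approximate (||psi_n||, ||(m_f - f(x0)) psi_n||) in L^2(R) for psi_n of (B.283)."""
    lam = f(x0)
    # |psi_n|^2 is negligible beyond a few multiples of 1/sqrt(n) from x0,
    # so both integrals are truncated to [x0 - radius, x0 + radius].
    radius = half_width / math.sqrt(n)
    h = 2.0 * radius / steps
    norm_sq = 0.0
    resid_sq = 0.0
    for i in range(steps):
        x = x0 - radius + (i + 0.5) * h          # midpoint of the i-th cell
        psi_sq = math.sqrt(n / math.pi) * math.exp(-n * (x - x0) ** 2)
        norm_sq += psi_sq * h
        resid_sq += (f(x) - lam) ** 2 * psi_sq * h
    return math.sqrt(norm_sq), math.sqrt(resid_sq)

# Columns: n, ||psi_n||, ||(m_f - lambda) psi_n||, and the closed form 1/sqrt(2n).
for n in (1, 10, 100, 1000):
    norm, resid = weyl_norms(n)
    print(n, round(norm, 6), round(resid, 6), round(1 / math.sqrt(2 * n), 6))
```

The norm stays equal to 1 while the residual shrinks like 1/√*n*, which is exactly the behaviour required in (B.279).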

*Proof.* One direction is easy by *reductio ad absurdum*: if the given sequence $(\psi_n)$ exists yet $\lambda \in \rho(a)$, then, since $(a - \lambda)^{-1}$ would exist and would be bounded, for any sequence $(\varphi_n)$ in $H$, $\varphi_n \to 0$ implies $(a-\lambda)^{-1}\varphi_n \to 0$, so taking $\varphi_n = (a-\lambda)\psi_n$, we find that $(a - \lambda)\psi_n \to 0$ implies $\psi_n \to 0$. Therefore, the assumption $\|\psi_n\| = 1$ cannot be true, and hence $\lambda \notin \rho(a)$, which is to say that $\lambda \in \sigma(a)$.

The converse direction requires two instructive lemmas of independent interest.

Lemma B.92. *Let a* ∈ *B*(*H*) *(or, more generally, let a* : *D*(*a*) → *H be closed). Then*

$$\text{ran}(a)^{-} = \text{ker}(a^\*)^{\perp};\tag{\text{B.284}}$$

$$\text{ran}(a)^\perp = \text{ker}(a^\*).\tag{\text{B.285}}$$

*In particular, we have* ran(*a*)<sup>−</sup> = *H iff* ker(*a*∗) = {0}*.*

*Furthermore, we say that a is* norm-positive *(a neologism!) if there exists* α > 0 *such that* ‖*a*ψ‖ ≥ α‖ψ‖ *for each* ψ ∈ *H (or each* ψ ∈ *D*(*a*)*). Then:*

1. *If a is norm-positive, then* ran(*a*) *is closed.*
2. *a is invertible iff a is norm-positive and* ker(*a*∗) = {0}*.*
3. *If a is normal, then a is invertible iff a is norm-positive.*


The last point provides the remainder of the proof of Theorem B.91, for if λ ∈ σ(*a*), then *a* − λ is *not* invertible and hence not norm-positive, so for each ε = 1/*n* there is a unit vector ψ*n* ∈ *H* (or ψ*n* ∈ *D*(*a*)) such that ‖(*a* − λ)ψ*n*‖ < 1/*n*, and hence we have our sequence (ψ*n*). □

It remains to prove Lemma B.92. Eqs. (B.284) - (B.285) are easy exercises, using (B.204). For clause 1, if $(\varphi_n)$ is a Cauchy sequence in $\mathrm{ran}(a)$ converging to $\varphi \in H$, then $\varphi_n = a\psi_n$ for some $\psi_n \in D(a)$. Since $\|\psi_m - \psi_n\| \leq \alpha^{-1}\|\varphi_n - \varphi_m\|$, the sequence $(\psi_n)$ is Cauchy, too, and if $\psi_n \to \psi$, then $\varphi_n \to a\psi = \varphi$, so $\varphi \in \mathrm{ran}(a)$; in the unbounded case this is because $a$ is closed. For clause 2, if $a$ is invertible, then for $\psi \in D(a)$, we have $\|\psi\| = \|a^{-1}a\psi\| \leq \|a^{-1}\|\,\|a\psi\|$, since $a^{-1}$ is bounded, and therefore $a$ is norm-positive with (for example) $\alpha = \|a^{-1}\|^{-1}$. Moreover, invertibility implies surjectivity, i.e., $\mathrm{ran}(a) = H$, and hence $\ker(a^*) = \{0\}$ by (B.284).

Conversely, if $a$ is norm-positive, then it is trivially injective, and if $\ker(a^*) = \{0\}$, then $\mathrm{ran}(a)^- = H$, again by (B.284). But since $a$ is also norm-positive, $\mathrm{ran}(a)^- = \mathrm{ran}(a)$ by clause 1, so $\mathrm{ran}(a) = H$ and $a$ is surjective, too. Clause 3 now also follows, since for normal operators $a$ we have $\ker(a) = \ker(a^*)$, so $a$ being norm-positive, which implies $\ker(a) = \{0\}$ in any case, now also implies $\ker(a^*) = \{0\}$. □

The same lemma yields crucial information on spectra of self-adjoint operators.

Theorem B.93. *If a* : *D*(*a*) → *H is self-adjoint, then* σ(*a*) ⊆ R*, and if two eigenvalues* λ, λ′ ∈ σ*p*(*a*) *are different, then corresponding eigenvectors are orthogonal. Furthermore, for each z* ∈ C *exactly one of the following possibilities applies:*

1. *z* ∈ ρ(*a*) *iff* ran(*a* − *z*) = *H;*
2. *z* ∈ σ*c*(*a*) *iff* ran(*a* − *z*)<sup>−</sup> = *H but* ran(*a* − *z*) ≠ *H;*
3. *z* ∈ σ*p*(*a*) *iff* ran(*a* − *z*)<sup>−</sup> ≠ *H.*


*Proof.* If $a^* = a$ then $\langle \psi, a\psi \rangle$ is real, so $|\langle \psi, (a-z)\psi \rangle| \geq |\mathrm{Im}(z)|\,\|\psi\|^2$ for any $z \in \mathbb{C}$. Combined with Cauchy–Schwarz, this gives the inequality

$$\|(a-z)\psi\| \ge |\text{Im}(z)| \, \|\psi\|. \tag{B.286}$$

Therefore, for $z \in \mathbb{C} \setminus \mathbb{R}$ the normal operator $a - z$ is norm-positive, and hence invertible by Lemma B.92.3, so that $\sigma(a) \subseteq \mathbb{R}$. Next, if $a\psi = \lambda\psi$ and $a\psi' = \lambda'\psi'$, then

$$\langle \psi, \psi' \rangle = \frac{1}{\lambda - \lambda'} (\langle \lambda \psi, \psi' \rangle - \langle \psi, \lambda' \psi' \rangle) = \frac{1}{\lambda - \lambda'} \langle \psi, (a^* - a) \psi' \rangle = 0, \tag{B.287}$$

given that $\lambda, \lambda' \in \mathbb{R}$ and assuming $\lambda \neq \lambda'$ and $a^* = a$.

Furthermore, for $z \in \mathbb{C} \setminus \mathbb{R}$, we have $z \in \rho(a)$ and hence trivially $\mathrm{ran}(a - z) = H$; conversely, the latter property states surjectivity of $a - z$, whilst (B.286) yields injectivity, so jointly, $z \in \rho(a)$. For $z \in \mathbb{R}$, assuming $\mathrm{ran}(a - z) = H$, eq. (B.285) yields $\ker(a^* - \bar{z}) = \{0\}$, but since $a^* = a$ and $\bar{z} = z$, this is just injectivity of $a - z$, whence once more $z \in \rho(a)$. Similarly, if $z \in \mathbb{R}$, then $\mathrm{ran}(a-z)^- = H$ iff $\ker(a-z) = \{0\}$, which yields the third case $z \in \sigma_p(a)$. The middle case is all that remains. □

This result reconfirms Corollary B.88 to the effect that continuous spectrum cannot occur if dim(*H*) < ∞, since in that case (where linear subspaces are automatically closed) the second scenario in Theorem B.93 is impossible.

#### B.15 The spectral theorem

Although he did not live to see it, on Hilbert's visionary Definition B.80 of the spectrum, part 1 of Theorem A.15 still holds *verbatim* even if *H* is infinite-dimensional:

Theorem B.94. *Let H be a Hilbert space, suppose a* ∈ *B*(*H*) *is self-adjoint, and let C*∗(*a*) *be the C\*-algebra generated within B*(*H*) *by a and* 1*<sup>H</sup> (that is, the intersection of all C\*-algebras containing a and* 1*H). Then C*∗(*a*) *is commutative, and there is a (necessarily isometric) isomorphism of (commutative) C\*-algebras*

$$\mathcal{C}(\sigma(a)) \xrightarrow{\cong} \mathcal{C}^\*(a), \; f \mapsto f(a), \tag{B.288}$$

*which is unique if it is subject to the following conditions:*


The map *f* → *f*(*a*) is called the *continuous functional calculus*. In particular,

$$(tf+g)(a) = tf(a) + g(a);\tag{B.289}$$

$$(fg)(a) = f(a)g(a);\tag{B.290}$$

$$f(a)^{\*} = f^{\*}(a). \tag{\mathbb{B}.291}$$

It is worth mentioning that by Theorem C.62 (cf. Appendix C) an isomorphism of C\*-algebras is automatically isometric, but in this case the equality

$$\|f(a)\| = \|f\|\_{\infty},\tag{\text{B.292}}$$

acts as a lemma in the proof that (B.288) is an isomorphism, so we need to prove it explicitly; cf. (B.225) for the left-hand side, and (1.24) for the right-hand side.
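
In finite dimension the continuous functional calculus reduces to applying *f* to the eigenvalues, which makes (B.290) and (B.292) easy to check by hand or by machine. The sketch below does so for a real symmetric 2×2 matrix; the helper names (`eigenvalues`, `functional_calculus`, `op_norm`, `matmul`) and the sample matrix are assumptions introduced here for illustration only, and the code covers just this toy case, not general operators.

```python
import math

def eigenvalues(a):
    """Eigenvalues (smallest first) of a real symmetric 2x2 matrix [[p, q], [q, r]]."""
    (p, q), (_, r) = a
    mean = (p + r) / 2
    radius = math.hypot((p - r) / 2, q)
    return mean - radius, mean + radius

def matmul(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def functional_calculus(f, a):
    """f(a) = f(lo) e_lo + f(hi) e_hi, with e_lo, e_hi the spectral projections of a."""
    lo, hi = eigenvalues(a)
    if hi == lo:                                   # a is a multiple of the identity
        return [[f(lo), 0.0], [0.0, f(lo)]]
    e_hi = [[(a[i][j] - lo * (i == j)) / (hi - lo) for j in range(2)] for i in range(2)]
    e_lo = [[(i == j) - e_hi[i][j] for j in range(2)] for i in range(2)]
    return [[f(lo) * e_lo[i][j] + f(hi) * e_hi[i][j] for j in range(2)] for i in range(2)]

def op_norm(a):
    """Operator norm of a real symmetric 2x2 matrix: the largest |eigenvalue|."""
    lo, hi = eigenvalues(a)
    return max(abs(lo), abs(hi))

a = [[2.0, 1.0], [1.0, 2.0]]                       # sigma(a) = {1, 3}
fa = functional_calculus(math.exp, a)
# ||f(a)|| should agree with the sup-norm of f on sigma(a), here exp(3).
print(op_norm(fa), max(math.exp(lam) for lam in eigenvalues(a)))
```

Here the spectrum is the two-point set {1, 3}, so the sup-norm on the right-hand side of (B.292) is simply the larger of |*f*(1)| and |*f*(3)|.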

Note that Theorem B.94 is even true for the larger class of *normal* bounded operators *a* (which might even be *defined* by the property that *C*∗(*a*) is commutative), but for applications to quantum mechanics it is sufficient to deal with the self-adjoint case (which even mathematically is not a restriction, as it implies the normal case).

*Proof.* We repeat (A.52) and (A.53) - (A.55), obtaining a map *f* → *f*(*a*) defined for *polynomials f* on R, restricted to σ(*a*) ⊂ R. The <sup>∗</sup>-algebra *P*∗(*a*) of all polynomials in *a* is dense in *C*∗(*a*) by definition of the latter, since one cannot have a smaller C\*-algebra in *B*(*H*) containing *a* and 1*<sup>H</sup>* than the norm-closure of *P*∗(*a*). In order to take advantage of this, we need the following lemma.

Lemma B.95. *For any a* ∈ *B*(*H*) *and any polynomial p on* C*, we have*

$$\sigma(p(a)) = p(\sigma(a)) \equiv \{ p(\lambda) \mid \lambda \in \sigma(a) \};\tag{B.293}$$

$$\|a\| = \sqrt{r(a^\*a)},\tag{\text{B.294}}$$

*see* (B.254)*. In particular, if a*<sup>∗</sup> = *a, then* ‖*a*‖ = *r*(*a*)*, cf.* (B.257)*.*

This is part of Theorem B.84, but we now give a direct proof of the second part. We first note that if $a^* = a$, then either $\|a\|$ or $-\|a\|$ (or both) are in $\sigma(a)$. To show this, take a sequence $(\psi_n)$ of unit vectors in $H$ such that $\lim_n \|a\psi_n\| = \|a\|$. Then

$$\begin{split} \left\|(a^2 - \|a\|^2)\psi_n\right\|^2 &= \left\langle (a^2 - \|a\|^2)\psi_n, (a^2 - \|a\|^2)\psi_n \right\rangle \\ &= \|a^2\psi_n\|^2 + \|a\|^4 - 2\|a\|^2\|a\psi_n\|^2 \\ &\le 2\|a\|^4 - 2\|a\|^2\|a\psi_n\|^2, \end{split} \tag{B.295}$$

so that $\lim_n \|(a^2 - \|a\|^2)\psi_n\|^2 = 0$, and hence $\|a\|^2 \in \sigma(a^2)$ by Theorem B.91. But part 1 of the lemma gives $\sigma(a^2) = \{\lambda^2 \mid \lambda \in \sigma(a)\}$, so that $\|a\| \in \sigma(a)$ or $-\|a\| \in \sigma(a)$.

The second observation is that, for general $a \in B(H)$, if some $z \in \mathbb{C}$ has $|z| > \|a\|$, then $z \in \rho(a)$. This follows from (part 1 of) the proof of Theorem B.84. Thus we firstly have $r(a) \geq \|a\|$ (if $a^* = a$), and secondly (for all $a$), $r(a) \leq \|a\|$.

Using Lemma B.95, we now prove that (B.292) holds for *real* polynomials *f* = *p*:

$$\begin{split} \|p(a)\| = r(p(a)) &= \sup\{ |\lambda|, \lambda \in \sigma(p(a)) \} = \sup\{ |\lambda|, \lambda \in p(\sigma(a)) \} \\ &= \sup\{ |p(\lambda)|, \lambda \in \sigma(a) \} = \|p\|_{\infty}. \end{split} \tag{B.296}$$

The case of *complex* polynomials *p* follows from this, since, using (B.289) - (B.291),

$$\|p(a)\|^2 = \|p(a)^* p(a)\| = \left\| |p|^2(a) \right\| = \left\| |p|^2 \right\|_{\infty} = \|p\|_{\infty}^2. \tag{B.297}$$

Thus we have proved the isometric <sup>∗</sup>-algebra isomorphism *P*(σ(*a*)) ≅ *P*∗(*a*), where *P*(σ(*a*)) and *P*∗(*a*) are the canonically normed vector spaces of all finite polynomials in *t* ∈ σ(*a*) and in *a* ∈ *B*(*H*), respectively. Neither is complete (when *H* is infinite-dimensional and *a* ≠ 0), but given isometricity, it is easy to pass to their completions, which by Weierstrass and by definition are *C*(σ(*a*)) and *C*∗(*a*), respectively. Thus for *f* ∈ *C*(σ(*a*)) we find a sequence (*pn*) in *P*(σ(*a*)) such that *pn* → *f* uniformly (from which it follows that (*pn*) is Cauchy in *C*(σ(*a*))), and define

$$f(a) = \lim\_{n} p\_n(a);\tag{B.298}$$

this limit exists because $\|p_n(a) - p_m(a)\| = \|p_n - p_m\|_\infty$, so that $(p_n(a))$ is Cauchy in the Banach space $C^*(a)$. Furthermore, if $p'_n \to f$, and $f'(a) = \lim_n p'_n(a)$, then

$$\|f(a) - f'(a)\| = \lim_{n} \|p_n(a) - p'_n(a)\| = \lim_{n} \|p_n - p'_n\|_{\infty} = 0, \tag{B.299}$$

so $f'(a) = f(a)$. From (B.296) - (B.298) and continuity of the norm (i.e., $\|f(a)\| = \lim_n \|p_n(a)\|$, which gives (B.292)), the map $f \mapsto f(a)$ is isometric and hence injective on $C(\sigma(a))$, and the above construction trivially makes it surjective.

Finally, the properties (B.289) - (B.291) follow from (A.53) - (A.55) by continuity. These properties also imply the uniqueness of the map *f* → *f*(*a*) given the conditions stated in the theorem, because these conditions and (A.53) - (A.55) define the map on *P*(σ(*a*)) and hence, by continuity, also on *C*(σ(*a*)). □

For a nice reformulation of Theorem B.94 in terms of the Gelfand spectrum, cf. §C.4. For later use (cf. Proposition B.98 below) we add a related result.

Lemma B.96. *If a* ∈ *B*(*H*) *is self-adjoint, then*

$$\|a\| = \sup\{ |\langle \psi, a\psi \rangle|, \psi \in H, \|\psi\| = 1 \}.\tag{B.300}$$

*In particular, if a*,*b* ∈ *B*(*H*) *are both positive and a* ≤ *b, then* ‖*a*‖ ≤ ‖*b*‖*.*

*Proof.* Define the *numerical range* ν(*a*) of an arbitrary *a* ∈ *B*(*H*) as

$$\nu(a) = \{ \langle \psi, a\psi \rangle, \psi \in H, \|\psi\| = 1 \}. \tag{B.301}$$

Clearly, if λ ∈ σ*p*(*a*), then λ ∈ ν(*a*). If λ ∈ σ*c*(*a*), then, in the notation of Theorem B.91, by Cauchy–Schwarz and normalization of ψ*n* we have

$$|\langle \psi_n, (a - \lambda)\psi_n \rangle| \le \|(a - \lambda)\psi_n\|. \tag{B.302}$$

Hence in view of (B.279) we have

$$\lim_{n \to \infty} \langle \psi_n, a \psi_n \rangle = \lambda. \tag{B.303}$$

So λ ∈ ν(*a*)<sup>−</sup>, whence σ(*a*) ⊆ ν(*a*)<sup>−</sup>, and hence *r*(*a*) ≤ sup{|λ|, λ ∈ ν(*a*)}. From Cauchy–Schwarz, in (B.301) we have |⟨ψ, *a*ψ⟩| ≤ ‖*a*‖. If also *a*<sup>∗</sup> = *a*, then ‖*a*‖ = *r*(*a*) by Lemma B.95, so that

$$\|a\| = r(a) \le \sup\{ |\lambda|, \lambda \in \nu(a) \} \le \|a\|.$$

Hence we have equalities everywhere, and (B.300) follows. □
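
The role of self-adjointness in (B.300) can be seen numerically in the smallest nontrivial case. The sketch below scans real unit vectors ψ = (cos *t*, sin *t*) in a two-dimensional space, which suffices for a real symmetric matrix because its eigenvectors may be chosen real. The function names and the two sample matrices are assumptions made for this illustration.

```python
import math

def quadratic_form(a, t):
    """<psi, a psi> for the real unit vector psi = (cos t, sin t)."""
    c, s = math.cos(t), math.sin(t)
    return a[0][0] * c * c + (a[0][1] + a[1][0]) * c * s + a[1][1] * s * s

def sup_numerical_range(a, samples=100000):
    """Largest |<psi, a psi>| over a grid of real unit vectors."""
    return max(abs(quadratic_form(a, math.pi * k / samples)) for k in range(samples))

def norm_symmetric(a):
    """Operator norm of a real symmetric 2x2 matrix (largest |eigenvalue|)."""
    mean = (a[0][0] + a[1][1]) / 2
    radius = math.hypot((a[0][0] - a[1][1]) / 2, a[0][1])
    return max(abs(mean - radius), abs(mean + radius))

a = [[2.0, 1.0], [1.0, -3.0]]          # self-adjoint: both numbers agree
print(sup_numerical_range(a), norm_symmetric(a))

b = [[0.0, 1.0], [0.0, 0.0]]           # not self-adjoint: operator norm 1
print(sup_numerical_range(b))          # stays near 1/2, so (B.300) fails for b
```

For the nilpotent matrix `b` one has |⟨ψ, *b*ψ⟩| = |ψ1||ψ2| ≤ 1/2 for every unit vector, complex ones included, although ‖*b*‖ = 1; this is why the lemma is restricted to self-adjoint operators.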

Generalizing parts 2 and 3 of Theorem A.15 to the infinite-dimensional case requires some motivation. To this effect, note that the continuous functional calculus $f \mapsto f(a)$ is *positive*, i.e., if $f \geq 0$ pointwise, then $f(a) \geq 0$ in that $\langle \psi, f(a)\psi \rangle \geq 0$ for each $\psi \in H$. Indeed, we have $f \geq 0$ iff $f = g^* g$ for some $g \in C(\sigma(a))$, with $g^*(x) = \overline{g(x)}$ as usual, and hence, by (B.290) - (B.291), $f(a) = g(a)^* g(a)$ and therefore $\langle \psi, f(a)\psi \rangle = \|g(a)\psi\|^2 \geq 0$. By Corollary B.17, if $\psi \in H$ is a *unit* vector, there is a *probability* measure $\mu_\psi$ on $\sigma(a)$ such that for each $f \in C(\sigma(a))$,

$$
\langle \psi, f(a) \psi \rangle = \int_{\sigma(a)} d\mu_{\psi}\, f. \tag{B.304}
$$

The key to the envisaged generalization of Theorem A.15 is that the integral on the right may actually be defined for a far larger class of functions than *C*(σ(*a*)); cf. (B.29). This suggests that the expression *f*(*a*) on the left-hand side should similarly be generalized to a larger class of functions *f*. However, the *L<sup>p</sup>* spaces considered in §B.6 are defined on the basis of some measure μ; since μψ in (B.304) varies with ψ and *f*(*a*) should be independent of ψ, it is appropriate to use the space B(σ(*a*)) of *bounded* functions *f* : σ(*a*) → C that are *measurable* with respect to the Borel σ-algebra on σ(*a*) (which consists of the Borel sets on R intersected with σ(*a*)).


Since both boundedness and measurability are preserved under uniform limits (measurability even being preserved under pointwise limits), B(σ(*a*)) is complete in the sup-norm, which makes it a commutative C\*-algebra (under pointwise operations). Among all functions in B(σ(*a*)), we will be particularly interested in the characteristic functions 1*A*, where *A* ⊂ σ(*a*) is measurable. The expressions

$$e\_A = 1\_A(a), \ (A \subset \sigma(a));\tag{B.305}$$

$$e\_A = e\_{A \cap \sigma(a)}, \ (A \subset \mathbb{R});\tag{B.306}$$

$$e\_{\lambda} \equiv 1\_{\{\lambda\}}(a), \ (\lambda \in \sigma\_p(a)), \tag{B.307}$$

to be defined below, where *A* is a Borel set (and *e*∅ = 0 by convention), are the *spectral projections* of *a* (which are of fundamental importance to quantum mechanics).

Lemma B.97. *Any* positive *function f* ∈ B(σ(*a*)) *is a* pointwise *limit of some monotone increasing bounded sequence* (*fn*) *in C*(σ(*a*))*, written fn* ↗ *f . That is,*

$$0 \le f_1(x) \le \dots \le f_n(x) \le f_{n+1}(x) \le \dots \le c \cdot 1_{\sigma(a)};\tag{B.308}$$

$$f(x) = \lim_{n \to \infty} f_n(x), \ x \in \sigma(a). \tag{B.309}$$

*Proof.* We start with *f* = 1*K*, where *K* ⊆ σ(*a*) is compact. Then *K* = ∩*nUn* for certain open sets *Un* (this is true for any second countable space), and taking "Urysohn" functions *fn* for each *Un* (i.e., *fn* ∈ *Cc*(*Un*), 0 ≤ *fn*(*x*) ≤ 1 for *x* ∈ σ(*a*), and *fn*(*x*) = 1 for *x* ∈ *K*), we obviously have *fn* → 1*K*. Next, if *U* ⊂ σ(*a*) is open, we have *U* = ∪*nKn* for suitable compact *Kn* (since R and hence σ(*a*) is σ-compact), so 1*Kn* → 1*<sup>U</sup>* . This also gives 1*<sup>C</sup>* for closed sets *C* = σ(*a*)\*U*, since 1*<sup>C</sup>* = 1σ(*a*) − 1*<sup>U</sup>* . Using the so-called Borel hierarchy, it can be shown that any Borel set *A* ⊂ σ(*a*) can be constructed from open and closed sets in at most a countable number of steps, at each of which a countable union or intersection of sets from the previous steps is used. This gives 1*<sup>A</sup>* for any Borel set, and hence also yields the simple functions *s* = ∑*<sup>k</sup> ck*1*Ak* with *ck* ≥ 0. For arbitrary measurable *f* ≥ 0 (not necessarily bounded and not even necessarily finite) it is a standard result in measure theory that there is a sequence (*sn*) of simple functions such that *sn* ↗ *f* : to wit, define

$$A_{n,k} = \{ x \in \sigma(a) \mid 2^{-n}k < f(x) \le 2^{-n}(k+1) \};\tag{B.310}$$

$$A_n = \{ x \in \sigma(a) \mid n < f(x) < \infty \};\tag{B.311}$$

$$s_n = n \cdot 1_{A_n} + 2^{-n} \sum_{k=1}^{2^n n - 1} k 1_{A_{n,k}}.\tag{B.312}$$

Relabeling the (at most) countable number of sequences thus obtained as a single sequence then gives a positive sequence (*hn*) in *C*(σ(*a*)) such that *hn* → *f* pointwise.

A final trick turns (*hn*) into a monotone increasing bounded sequence (*fn*): for *m* > *n*, define *fn*,*<sup>m</sup>* = min{*hn*,...,*hm*}, which is monotone *de*creasing in *m* and positive, and hence has a (pointwise) limit *fn* = lim*m*→<sup>∞</sup> *fn*,*m*. The ensuing sequence (*fn*) is monotone *in*creasing and still converges to *f* . If *f* is bounded (as we assume by definition of B(σ(*a*))), then (*fn*) must also be bounded eventually. □
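
The dyadic construction (B.310) - (B.312) used in this proof is easy to evaluate pointwise: *sn*(*x*) depends on *x* only through the value *y* = *f*(*x*). The following sketch computes it for finite *y* ≥ 0; the function name `simple_approximation` and the sample value are assumptions chosen for illustration.

```python
import math

def simple_approximation(n, y):
    """Value s_n(x) of (B.312) at a point x where f(x) = y, with 0 <= y < infinity."""
    if y > n:                       # x lies in A_n, where s_n equals n
        return float(n)
    if y <= 0:                      # x lies in no A_{n,k}, so s_n(x) = 0
        return 0.0
    # x lies in A_{n,k} for the unique integer k with 2^-n k < y <= 2^-n (k+1);
    # the case k = 0 contributes nothing to the sum in (B.312).
    k = math.ceil(y * 2 ** n) - 1
    return k / 2 ** n

y = 0.3
# The printed values increase with n and approach y from below.
print([simple_approximation(n, y) for n in range(1, 8)])
```

Once *n* ≥ *y*, the value *sn*(*x*) lies within 2^(−*n*) of *f*(*x*), which gives pointwise convergence, and refining the dyadic grid can only raise the value, which gives monotonicity.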

If *f* ∈ B(σ(*a*)) and *fn* ↗ *f* with *fn* ∈ *C*(σ(*a*)), we would like to define *f*(*a*) as lim*<sup>n</sup> fn*(*a*), just as in the case where *f* ∈ *C*(σ(*a*)) and *fn* ∈ *P*(σ(*a*)). However, in the former case convergence *fn* → *f* is merely pointwise, whereas in the latter case it was uniform, translated into norm convergence *fn*(*a*) → *f*(*a*). Pointwise convergence of functions, then, becomes *strong* convergence of operators:

Proposition B.98. *If* (*an*) *is a sequence of positive operators on H for which*

$$0 \le a_1 \le \cdots \le a_n \le a_{n+1} \le \cdots \le c \cdot 1_H,\tag{B.313}$$

*where ai* ≤ *aj means that* ⟨ψ, *ai*ψ⟩ ≤ ⟨ψ, *aj*ψ⟩ *for each* ψ ∈ *H, then there exists a unique positive operator a such that an* ↗ *a strongly, i.e., for each* ψ ∈ *H,*

$$a\psi = \lim_{n \to \infty} a_n \psi. \tag{B.314}$$

*Furthermore, a* = sup*n an with respect to the partial ordering* ≤ *on the set of positive bounded operators (that is, an* ≤ *a for each n, and if an* ≤ *b for each n, then a* ≤ *b).*

*Proof.* Recalling Proposition A.4, define a sequence of bounded quadratic forms $Q_n : H \to \mathbb{R}$ by $Q_n(\psi) = \langle \psi, a_n\psi \rangle$. Then $(Q_n(\psi))$ is a monotone increasing bounded sequence for each $\psi \in H$, so that $Q(\psi) = \lim_{n\to\infty} Q_n(\psi)$ exists. Like each $Q_n$, also $Q$ satisfies (A.8) - (A.9). Since $|Q_n(\psi)| \leq c\|\psi\|^2$ and hence $|Q(\psi)| \leq c\|\psi\|^2$, it remains bounded. Hence (A.10) defines a bounded hermitian form $B$, upon which Proposition B.79 yields a bounded operator $a$, satisfying $B(\varphi,\psi) = \langle \varphi, a\psi \rangle$. Since

$$
\langle \psi, a\psi \rangle = \lim_{n \to \infty} \langle \psi, a_n \psi \rangle,\tag{B.315}
$$

we have $a \geq 0$. To prove (B.314), note that (B.315) gives $\langle \psi, (a - a_n)\psi \rangle \to 0$, but (B.313) implies $a - a_n \geq 0$, so that $a - a_n$ has a self-adjoint square root $\sqrt{a - a_n}$, defined by Theorem B.94 (see also Proposition B.99 below). Hence

$$\langle \psi, (a - a_n)\psi \rangle = \langle \sqrt{a - a_n}\,\psi, \sqrt{a - a_n}\,\psi \rangle = \|\sqrt{a - a_n}\,\psi\|^2 \to 0. \tag{B.316}$$

Now if a sequence of operators $(b_n)$ is such that $\|b_n\| \leq C$ for all $n$, and $b_n\psi \to 0$, then also $b_n^2\psi \to 0$, for $\|b_n^2\psi\| \leq \|b_n\|\,\|b_n\psi\| \leq C\|b_n\psi\| \to 0$. This applies here with $b_n = \sqrt{a - a_n}$, since $a_m \leq a_n$ for $m \leq n$, and hence $a - a_n \leq a - a_m$, from which $\|a - a_n\| \leq \|a - a_m\|$ (see Lemma B.96). Fixing $m$, this gives $\|b_n\|^2 = \|a - a_n\| \leq \|a - a_m\|$ for all $n \geq m$, so that we may take $C = \|a - a_m\|^{1/2}$. So (B.316) implies $(a - a_n)\psi \to 0$, which is (B.314).

As to the final claim, eq. (B.315) is the same as $\langle \psi, a\psi \rangle = \sup_n \{\langle \psi, a_n\psi \rangle\}$. □
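
That the convergence in (B.314) is in general only strong, not in norm, is already visible for an increasing sequence of projections. The sketch below takes *an* to be the projection onto the first *n* coordinates of ℓ², for which 0 ≤ *a*1 ≤ *a*2 ≤ ··· ≤ 1 and the supremum is the identity. The truncation length `N`, the test vector and the helper name `tail_norm` are assumptions made for the purpose of illustration; the finite lists merely model vectors in ℓ².

```python
import math

def tail_norm(psi, n):
    """||(1 - a_n) psi||, where a_n projects onto the first n coordinates."""
    return math.sqrt(sum(x * x for x in psi[n:]))

N = 5000                                        # truncation of l^2 (assumption)
psi = [1.0 / k for k in range(1, N + 1)]        # a fixed square-summable vector

for n in (10, 100, 1000):
    e_next = [0.0] * N
    e_next[n] = 1.0                             # the basis vector e_{n+1}
    # First number: shrinks with n, as strong convergence a_n psi -> psi requires.
    # Second number: stays 1, so ||1 - a_n|| = 1 and there is no norm convergence.
    print(n, tail_norm(psi, n), tail_norm(e_next, n))
```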

In this proof, we used the following generalization of Proposition A.22:

Proposition B.99. *The following conditions on a* ∈ *B*(*H*) *are equivalent:*

1. $\langle \psi, a\psi \rangle \geq 0$ *for arbitrary* $\psi \in H$*;*
2. $a^* = a$ *and* $\sigma(a) \subset \mathbb{R}^+$*;*
3. $a = c^2$ *for some bounded self-adjoint operator* $c \in B(H)$*;*
4. $a = b^* b$ *for some bounded operator* $b \in B(H)$*.*

*Proof.* The proof is the same as in the finite-dimensional case, except that:


Given some positive *f* ∈ B(σ(*a*)), we now use Lemma B.97 to find a monotone increasing bounded sequence (*fn*) in *C*(σ(*a*)) such that *fn* ↗ *f* pointwise, and subsequently use Proposition B.98 to define *f*(*a*) as the *strong* limit

$$f(a)\psi = \lim_{n \to \infty} f_n(a)\psi \quad (\psi \in H).\tag{B.317}$$

Arbitrary functions *f* are then dealt with using (B.30) and performing the above construction term-wise. This, then, yields *f*(*a*) for any *a*<sup>∗</sup> = *a* ∈ *B*(*H*) and *f* ∈ B(σ(*a*)).

It is natural to ask which corner of *B*(*H*) the operators *f*(*a*) land in when *f* ∈ B(σ(*a*)), much as we have shown that *f*(*a*) ∈ *C*∗(*a*) for *f* ∈ *C*(σ(*a*)). A safe choice would be *C*∗(*a*)−, i.e., the strong closure of *C*∗(*a*), which by definition contains all limits of all strongly convergent nets in *C*∗(*a*) (so that it certainly contains all limits (B.317)), and which is automatically a strongly closed unital ∗-algebra. This may seem too large, but if *H* is separable, it turns out to be the right choice, because these more general limits add nothing to (B.317). For a more explicit description of *C*∗(*a*)<sup>−</sup> we need the *commutant* *S*′ of any *S* ⊂ *B*(*H*), which is defined by

$$S' = \{ a \in B(H) \mid ab = ba \forall b \in S \};\tag{B.318}$$

the *bicommutant* of *S* is *S*″ = (*S*′)′. If *S*<sup>∗</sup> = *S*, in that *a* ∈ *S* iff *a*<sup>∗</sup> ∈ *S*, then *S*′ is easily seen to be a unital <sup>∗</sup>-algebra within *B*(*H*). Furthermore, it is obvious that *S* ⊂ *S*″, so that the passage *S* → *S*″ is some sort of a closure operation within *B*(*H*), comparable to the closure operation *S* → *S*⊥⊥ within *H* itself. Indeed, there is a striking analogue of (B.204) at the operator level, due to von Neumann (see Theorem C.127):

Theorem B.100. *If A is a unital* ∗*-algebra in B*(*H*)*, then*

$$A'' = A^-,\tag{B.319}$$

*where A*− *is the strong closure of A in B*(*H*) *(which is automatically a* ∗*-algebra).*

Corollary B.101. *Denoting the strong closure C*∗(*a*)− *of C*∗(*a*) *by W*∗(*a*)*, we have*

$$W^\*(a) = C^\*(a)''.\tag{\mathbb{B}.320}$$

Though not obvious from (B.320), the alternative description through (B.319) shows that *W*∗(*a*) inherits the commutativity of *C*∗(*a*); in fact *W*∗(*a*) is a commutative C\*-algebra, too. Moreover, by construction it is also a *von Neumann algebra* in that *W*∗(*a*)″ = *W*∗(*a*), cf. Appendix C. Such unital ∗-algebras in *B*(*H*) are not merely norm closed, but are also closed in at least three other natural topologies on *B*(*H*), including the strong one. The situation may be summarized in the *spectral theorem*:

Theorem B.102. *Let a*<sup>∗</sup> = *a* ∈ *B*(*H*)*. The isomorphism C*(σ(*a*)) → *C*∗(*a*) *of Theorem B.94 has a unique extension to a homomorphism*

$$\mathcal{B}(\sigma(a)) \to W^*(a), \ f \mapsto f(a), \tag{B.321}$$

*for which* (B.289) *-* (B.291) *continue to hold. In particular, the operator eA in* (B.305) *is a projection. Also, eq.* (B.304) *remains valid, and for each f* ∈ B(σ(*a*))*, one has*

$$\|f(a)\| \le \|f\|\_{\infty} \tag{\mathbb{B}.322}$$

*Proof.* The map $f \mapsto f(a)$ is given by (B.317) and preceding discussion. Eqs. (B.289) and (B.291) easily follow by limiting arguments. Using the same trick as in the proof of Proposition B.98 it can be shown that $f(a)^2 = f^2(a)$, whence, using the identity $fg = \frac{1}{2}((f + g)^2 - f^2 - g^2)$, eq. (B.290) follows. This implies $e_A^2 = 1_A^2(a) = 1_A(a) = e_A$, whilst (B.291) gives $e_A^* = 1_A^*(a) = 1_A(a) = e_A$.

We prove (B.322) for $f \geq 0$; this implies the general case by (B.30) and the triangle inequality. Writing $H_1$ for the set of unit vectors in $H$, approximating $f_n \nearrow f$, repeatedly using (B.300), the property $f(a) = \sup_n f_n(a)$ established at the end of Proposition B.98, and finally using (B.292) for each $f_n \in C(\sigma(a))$, we may estimate:

$$\begin{aligned} \|f(a)\| &= \sup_{\psi \in H_1} \{ |\langle \psi, f(a)\psi \rangle| \} \\ &= \sup_{\psi \in H_1} \sup_{n \in \mathbb{N}} \{ |\langle \psi, f_n(a)\psi \rangle| \} \\ &= \sup_{n \in \mathbb{N}} \sup_{\psi \in H_1} \{ |\langle \psi, f_n(a)\psi \rangle| \} \\ &= \sup_{n \in \mathbb{N}} \|f_n(a)\| = \sup_{n \in \mathbb{N}} \|f_n\|_{\infty} \\ &\le \|f\|_{\infty}, \end{aligned} \tag{B.323}$$

where the last inequality is a trivial consequence of the specific limit $f_n \nearrow f$.

Finally, our motivating identity (B.304) follows from the same equality for each $f_n \in C(\sigma(a))$, upon which Lebesgue's Monotone Convergence Theorem yields the right-hand side, whereas (B.315) gives the left-hand side. □

Of course, in finite dimension, Theorem B.102 coincides with Theorems A.15 and B.94. Theorem A.15 implies Theorem A.10 through (A.58) - (A.59), and, as we will now explain, in infinite dimension Theorem B.102 similarly implies a certain approximate version of Theorem A.10, namely Corollary B.104.

Lemma B.103. *If K* ⊂ R *is compact, any f* ∈ *C*(*K*) *may be uniformly approximated by simple functions. More precisely, for each* ε > 0 *there is a decomposition* $K = \bigsqcup_{i=1}^{n} A_i$ *of K as a disjoint union of n* < ∞ *Borel sets Ai, such that for any xi* ∈ *Ai,*

$$\left\| f - \sum_{i=1}^{n} f(x_{i}) 1_{A_{i}} \right\|_{\infty} < \varepsilon. \tag{B.324}$$

*Proof.* Since *K* is compact, *f* is actually uniformly continuous on *K*. This means that for ε > 0 there is δ > 0 such that | *f*(*x*)− *f*(*y*)| < ε whenever |*x*−*y*| < δ. Since (B.324) just states that | *f*(*x*)− *f*(*xi*)| < ε for each *i* = 1,...,*n* and each *x* ∈ *Ai*, any partition for which 0 < |*Ai*| < δ will do (where |*A*| = sup{|*x*−*y*|, *x*, *y* ∈ *A*}). □

From (B.305), Lemma B.103, and Theorem B.102, we then immediately have:

Corollary B.104. *Let a*<sup>∗</sup> = *a* ∈ *B*(*H*)*. For any f* ∈ *C*(σ(*a*)) *and any* ε > 0*, there is a partition* $\sigma(a) = \bigsqcup_{i=1}^{n} A_i$ *of* σ(*a*) *as a disjoint union of n* < ∞ *Borel sets Ai, such that for arbitrary* λ*<sup>i</sup>* ∈ *Ai, one has*

$$\left\| f(a) - \sum\_{i=1}^{n} f(\lambda\_i) e\_{A\_i} \right\| < \varepsilon. \tag{B.325}$$

*In particular, for f*(*x*) = *x and f*(*x*) = 1 *we have*

$$\left\| a - \sum\_{i=1}^{n} \lambda\_i e\_{A\_i} \right\| < \varepsilon;\tag{B.326}$$

$$\left\| 1\_H - \sum\_{i=1}^n e\_{A\_i} \right\| < \varepsilon. \tag{B.327}$$

*If a has discrete spectrum (i.e.,* σ(*a*) = σ*p*(*a*)*), then* (B.326) *-* (B.327) *reduce to* (A.37) *-* (A.38)*, where e*<sup>λ</sup> *is defined by* (B.307)*, and the sums converge in norm.*

Hence in this version of the spectral theorem, one approximates *a* by linear combinations of projections in a way that reflects the approximation of the identity function *x* → *x* on σ(*a*) by simple functions. Eq. (B.326) is often symbolically written as

$$a = \int\_{\sigma(a)} de\_{\lambda} \,\lambda \,, \tag{B.328}$$

which may also be given some direct meaning as an operator-valued Stieltjes integral, but even so, this neat expression eventually boils down to (B.326) itself.
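
For a diagonal operator the approximation (B.326) can be carried out explicitly, since the spectral projection *eA* simply selects the diagonal entries lying in *A*. The sketch below partitions the spectrum into cells narrower than ε, picks one spectral value per cell, and reports the resulting operator-norm error. It is an illustration under stated assumptions: the operator is the finite diagonal matrix diag(1, 1/2, ..., 1/50), and the names `bin_index` and `spectral_approximation` are hypothetical.

```python
def bin_index(lam, lo, width, n_bins):
    """Index i of the cell A_i containing the spectral value lam."""
    if width == 0.0:
        return 0
    return min(int((lam - lo) / width), n_bins - 1)

def spectral_approximation(eigenvalues, eps):
    """Return (approx, error): the diagonal of sum_i lambda_i e_{A_i} and the
    operator norm of its difference with a = diag(eigenvalues)."""
    lo, hi = min(eigenvalues), max(eigenvalues)
    n_bins = int((hi - lo) / eps) + 1            # makes every cell narrower than eps
    width = (hi - lo) / n_bins
    # lambda_i: one representative in each non-empty cell (here its smallest member)
    sample = {}
    for lam in eigenvalues:
        i = bin_index(lam, lo, width, n_bins)
        sample[i] = min(lam, sample.get(i, lam))
    approx = [sample[bin_index(lam, lo, width, n_bins)] for lam in eigenvalues]
    # for diagonal operators the norm is the largest absolute diagonal entry
    error = max(abs(lam - mu) for lam, mu in zip(eigenvalues, approx))
    return approx, error

spectrum = [1.0 / k for k in range(1, 51)]       # sigma(a) for a = diag(1, ..., 1/50)
approx, error = spectral_approximation(spectrum, eps=0.1)
# Number of projections actually used, and the norm error (below eps).
print(len(set(approx)), error)
```

Shrinking ε refines the partition and increases the number of projections in the sum, which is the finite-dimensional shadow of the Stieltjes-type expression (B.328).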

Corollary B.105. *Let* P(*A*) = {*e* ∈ *A* | *e*<sup>2</sup> = *e*<sup>∗</sup> = *e*}*, where A is a von Neumann algebra. Then A is the norm-closure of the linear span of* P(*A*)*, and*

$$A = \mathcal{P}(A)''. \tag{\text{B.329}}$$

*Proof.* The first claim follows from Corollary B.104. This implies (B.329), which may also be proved directly: since P(*A*) ⊂ *A*, the inclusion P(*A*)″ ⊆ *A*″ = *A* is obvious. Conversely, let *a* ∈ *A* and assume *a*<sup>∗</sup> = *a* (if not, decompose *a* = *a*′ + *ia*″ with *a*′ and *a*″ self-adjoint). Then *W*∗(*a*) ⊂ *A*, so that *A* contains all spectral projections of *a*, cf. Theorem B.102. Moreover, by Corollary B.104, *a* lies in the norm-closure of the linear span of P(*A*), which by Theorem B.100 in turn is contained in P(*A*)″. □

#### B.16 Abelian ∗-algebras in *B*(*H*)

Compared with Theorem B.94, it seems a weakness of Theorem B.102 that the map *f* → *f*(*a*) fails to be an isomorphism from B(σ(*a*)) to *W*∗(*a*). The reason is that although the map is surjective (at least when *H* is separable), it fails to be injective: for real-valued *f* one has *f*(*a*) = 0 iff ⟨ψ, *f*(*a*)ψ⟩ = 0 for all ψ ∈ *H*, which by (B.304) is the case iff $\int_{\sigma(a)} d\mu_\psi\, f = 0$ for all unit vectors ψ ∈ *H*, which in turn is the case iff *f* = 0 a.e. with respect to μψ, in other words, iff *f* = 0 in *L*∞(σ(*a*),μψ).

Thus the right kind of algebra to be isomorphic to *W*∗(*a*) is *L*∞(σ(*a*),μ) rather than B(σ(*a*)), where μ is some (probability) measure on σ(*a*) such that μ(*A*) = 0 iff μψ(*A*) = 0 for all unit vectors ψ ∈ *H*. Indeed, in that case, since by construction

$$L^{\infty}(\sigma(a),\mu) \cong \mathcal{B}(\sigma(a)) / \{ f \mid f = 0 \ \mu\text{-a.e.} \} = \mathcal{B}(\sigma(a)) / \ker(f \mapsto f(a)), \tag{B.330}$$

our map B(σ(*a*)) →*W*∗(*a*) descends to an isomorphism of von Neumann algebras:

$$L^{\infty}(\sigma(a),\mu) \stackrel{\cong}{\to} W^\*(a). \tag{\mathbb{B}.331}$$

This is quite nontrivial; let us first present a case study where everything is clear.

Proposition B.106. *Let H* = *L*2(0,1) = *L*2([0,1]) *(with Lebesgue measure), and let a* = *m*id ∈ *B*(*H*) *(where* id(*x*) = *x) be the self-adjoint position operator*

$$a\Psi(\mathbf{x}) = \mathbf{x}\Psi(\mathbf{x}).\tag{\mathbf{B.332}}$$

*Then the map f* → *f*(*a*) *in both Theorems B.94 and B.102 is given by*

$$f(a) = m\_f,\tag{\text{B.333}}$$

*cf. Proposition B.73. The two* ∗*-algebras in B*(*H*) *defined by a are given by*

$$C^\*(a) = C([0, 1]);\tag{B.334}$$

$$W^\*(a) = L^\infty(0, 1),\tag{B.335}$$

*both realized as multiplication operators (i.e., identifying f with m<sub>f</sub>). Furthermore,*

$$L^{\infty}(0,1)^{\prime} = L^{\infty}(0,1). \tag{B.336}$$

*More generally, let K* ⊂ ℝ *be compact, let* μ *be a regular probability measure on K with support K, take H* = *L*<sup>2</sup>(*K*,μ) *and define a as in* (B.332)*. Then:*

$$
\sigma(a) = K;\tag{B.337}
$$

$$C^\*(a) = C(K);\tag{B.338}$$

$$W^\*(a) = L^\infty(K, \mu);\tag{B.339}$$

$$f(a) = m\_f;\tag{B.340}$$

$$L^{\infty}(K,\mu)^{\prime} = L^{\infty}(K,\mu). \tag{B.341}$$

*Proof.* We just prove the case *K* = [0,1] with *d*μ(*x*) = *dx*; the general case is similar.

Eq. (B.333) is obvious for polynomials *f* , and otherwise follows from easy limiting arguments. Consequently, eq. (B.334) is an instance of Theorem B.94. Everything else then follows if we can prove that

$$C([0,1])' = L^{\infty}(0,1). \tag{B.342}$$

Namely, assuming (B.342), since *C*([0,1]) ⊂ *L*<sup>∞</sup>(0,1) (and *A* ⊆ *B* implies *B*′ ⊆ *A*′), we automatically have *L*<sup>∞</sup>(0,1)′ ⊆ *C*([0,1])′, so (B.342) implies *L*<sup>∞</sup>(0,1)′ ⊆ *L*<sup>∞</sup>(0,1), and since the converse inclusion is trivial from commutativity of *L*<sup>∞</sup>(0,1), eq. (B.342) implies (B.336). Furthermore, since *W*∗(*a*) = *C*([0,1])′′, taking the commutant of (B.342) and applying (B.336) yields (B.335).

So let us prove (B.342). The inclusion *L*<sup>∞</sup>(0,1) ⊆ *C*([0,1])′ is obvious, since *m*<sub>*f*</sub>*m*<sub>*g*</sub> = *m*<sub>*fg*</sub> = *m*<sub>*gf*</sub> = *m*<sub>*g*</sub>*m*<sub>*f*</sub>, so we need to prove the converse. Take *b* ∈ *C*([0,1])′ and define *f* = *b*1<sub>[0,1]</sub> ∈ *L*<sup>2</sup>(0,1). For ψ ∈ *C*([0,1]) ⊂ *L*<sup>2</sup>(0,1), we have

$$b\psi = b m\_{\psi} 1\_{[0,1]} = m\_{\psi} b 1\_{[0,1]} = m\_{\psi} f = m\_{\psi} m\_f 1\_{[0,1]} = m\_f m\_{\psi} 1\_{[0,1]} = m\_f \psi,\tag{B.343}$$

so *b* = *m*<sub>*f*</sub> on the dense domain *C*([0,1]) ⊂ *L*<sup>2</sup>(0,1), with *f* ∈ *L*<sup>2</sup>(0,1). Now *b* is bounded by definition of the commutant *C*([0,1])′ and hence ‖*m*<sub>*f*</sub>‖ < ∞. If *f* ∉ *L*<sup>∞</sup>(0,1), the proof of Proposition B.73 gives that *X*<sub>*t*</sub> has positive measure for each *t* > 0, whence ‖*m*<sub>*f*</sub>‖ ≥ *t* for all *t*, which is a contradiction. Hence *f* ∈ *L*<sup>∞</sup>(0,1), in which case *m*<sub>*f*</sub> extends to all of *L*<sup>2</sup>(0,1) by continuity. This extension must equal *b*, so that *b* = *m*<sub>*f*</sub>, and hence *C*([0,1])′ ⊆ *L*<sup>∞</sup>(0,1). □

The following variation on this example turns out to be qualitatively different:

Proposition B.107. *Realizing* ℓ<sup>∞</sup>(ℕ) *as multiplication operators on* ℓ<sup>2</sup>(ℕ)*, one has*

$$
\ell^{\infty}(\mathbb{N})' = \ell^{\infty}(\mathbb{N}).\tag{B.344}
$$

*Proof.* For each *N* ∈ ℕ, we define a finite-dimensional subspace ℓ<sup>2</sup>(*N*) ⊂ ℓ<sup>2</sup>(ℕ) by

$$\ell^2(N) = \{ \psi \in \ell^2(\mathbb{N}) \mid \psi(x) = 0 \,\forall x > N \},$$

with ensuing projection 1<sub>*N*</sub> : ℓ<sup>2</sup>(ℕ) → ℓ<sup>2</sup>(*N*), i.e., 1<sub>*N*</sub>ψ(*x*) = ψ(*x*) for *x* ≤ *N* and 1<sub>*N*</sub>ψ(*x*) = 0 for *x* > *N*. If *b* ∈ ℓ<sup>∞</sup>(ℕ)′, we have *b* : ℓ<sup>2</sup>(*N*) → ℓ<sup>2</sup>(*N*), because 1<sub>*N*</sub> ∈ ℓ<sup>∞</sup>(ℕ) (and hence ψ ∈ ℓ<sup>2</sup>(*N*), i.e., 1<sub>*N*</sub>ψ = ψ, implies *b*ψ ∈ ℓ<sup>2</sup>(*N*), i.e., 1<sub>*N*</sub>*b*ψ = *b*ψ). With *f*<sub>*N*</sub> : ℕ → ℂ given by *f*<sub>*N*</sub> = *b*1<sub>*N*</sub>, define *f* : ℕ → ℂ by *f*(*x*) = *f*<sub>*N*</sub>(*x*) for any *N* > *x*; this is well defined, in that if *x* < *N* < *M*, then *f*<sub>*N*</sub>(*x*) = *f*<sub>*M*</sub>(*x*). For any *N* and ψ ∈ ℓ<sup>2</sup>(*N*), as in (B.343) we have *b*ψ = *m*<sub>*f*</sub>ψ, which therefore holds on a dense subspace ∪<sub>*N*</sub>ℓ<sup>2</sup>(*N*) of ℓ<sup>2</sup>(ℕ). Again as in the previous proof, this gives

$$\|f\|\_{\infty} = \|m\_f\| = \|b\| < \infty,\tag{B.345}$$

i.e., *f* ∈ ℓ<sup>∞</sup>(ℕ). Thus *b* = *m*<sub>*f*</sub> ≡ *f* ∈ ℓ<sup>∞</sup>(ℕ), whence ℓ<sup>∞</sup>(ℕ)′ ⊆ ℓ<sup>∞</sup>(ℕ). With the trivial opposite inclusion, this gives (B.344). □

Note that since a possible (discrete) position operator (B.332) would be unbounded on ℓ<sup>2</sup>(ℕ), a possible counterpart to (B.335), although it exists, would blast the framework of this section (cf. §B.21). See, however, the proof of Theorem B.118.
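The finite-dimensional shadow of (B.344) can be checked by brute force. The following Python sketch is only an illustration under stated assumptions: ℂ<sup>*n*</sup> replaces ℓ<sup>2</sup>(ℕ), diagonal matrices play the role of ℓ<sup>∞</sup>, entries are restricted to a few small integers, and all function names are hypothetical. It confirms that a matrix commuting with every coordinate projection (the analogue of the projections 1<sub>*N*</sub> used in the proof) must itself be diagonal.

```python
from itertools import product


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def coordinate_projection(n, i):
    """Diagonal projection onto the i-th coordinate axis of C^n."""
    return [[1 if (r == c == i) else 0 for c in range(n)] for r in range(n)]


def commutes_with_all_diagonals(b):
    """True iff b commutes with every diagonal matrix.

    The coordinate projections span the diagonal matrices,
    so it suffices to test commutation with each of them.
    """
    n = len(b)
    return all(
        matmul(b, coordinate_projection(n, i)) == matmul(coordinate_projection(n, i), b)
        for i in range(n)
    )


def is_diagonal(b):
    """True iff all off-diagonal entries of b vanish."""
    n = len(b)
    return all(b[i][j] == 0 for i in range(n) for j in range(n) if i != j)


def check_all_small_matrices(n=2, entries=(0, 1, 2)):
    """Exhaustively confirm: b commutes with D_n  <=>  b is diagonal."""
    for flat in product(entries, repeat=n * n):
        b = [list(flat[r * n:(r + 1) * n]) for r in range(n)]
        if commutes_with_all_diagonals(b) != is_diagonal(b):
            return False
    return True
```

Calling `check_all_small_matrices(3, (0, 1))` runs the same check over all 3×3 matrices with entries 0 or 1. This is a sanity check on the mechanism of the proof, not a replacement for the density argument needed in infinite dimensions.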

More generally, we have:

Proposition B.108. *Let* (*X*,Σ,μ) *be a* σ*-finite Borel space and realize L*<sup>∞</sup>(*X*,μ) *as multiplication operators on L*<sup>2</sup>(*X*,μ)*. Then*

$$L^{\infty}(X,\mu)' = L^{\infty}(X,\mu). \tag{B.346}$$

*Proof.* Writing *X* = ∪<sub>*N*∈ℕ</sub>*X*<sub>*N*</sub> with μ(*X*<sub>*N*</sub>) < ∞, which holds by virtue of σ-finiteness, the proof is practically the same as for *X* = ℕ (except for the fact that *L*<sup>2</sup>(*X*<sub>*N*</sub>) ⊂ *L*<sup>2</sup>(*X*) need not be finite-dimensional, but it is closed, which suffices). □

If *A* ⊂ *B*(*H*) is a commutative <sup>∗</sup>-algebra, we say that *A* is *maximal (abelian)* if *A* ⊆ *B* ⊂ *B*(*H*) for some commutative <sup>∗</sup>-algebra *B* implies *B* = *A*. Any <sup>∗</sup>-algebra *A* ⊂ *B*(*H*) is abelian iff *A* ⊆ *A*′ (this is trivial), and is maximally abelian iff *A*′ = *A*. To see the implication from *A*′ = *A* to maximality, note that for any subsets *C* ⊂ *B*(*H*) and *D* ⊂ *B*(*H*) the inclusion *C* ⊆ *D* implies *D*′ ⊆ *C*′ (as is immediate from the definition of the commutant), so *A* ⊆ *B* gives *B*′ ⊆ *A*′. Since *B* is commutative, we also know that *B* ⊆ *B*′, whence *B* ⊆ *A*′. If *A*′ = *A* this gives *B* ⊆ *A*, so *B* = *A*. The condition *A*′ = *A*, in turn, implies *A*′′ = *A*, i.e., any maximal abelian ∗-algebra *A* in *B*(*H*) is automatically a von Neumann algebra.

Corollary B.109. *In the setting of Proposition B.108, L*<sup>∞</sup>(*X*,μ) *is a maximal abelian* <sup>∗</sup>*-algebra in B*(*L*<sup>2</sup>(*X*,μ))*, and hence a von Neumann algebra. In particular:*


The above examples suggest a neat reformulation of the spectral theorem. This requires a few more concepts from the theory of operator algebras, cf. Appendix C.

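Definition B.110. *For any* <sup>∗</sup>*-algebra A* ⊂ *B*(*H*) *and* ψ ∈ *H, we write A*ψ<sup>−</sup> ⊆ *H for the closure of the linear subspace of all vectors a*ψ*, a* ∈ *A. We say that* ψ *(*≠ 0*) is:*

• cyclic *for A if A*ψ<sup>−</sup> = *H;*

• separating *for A if a*ψ = 0 *implies a* = 0 *for all a* ∈ *A.*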

*If a*<sup>∗</sup> = *a* ∈ *B*(*H*)*, we similarly say that* ψ *is cyclic (separating) for a if* ψ *is cyclic (separating) for A* = *C*∗(*a*)*, or, equivalently, for A* = *W*∗(*a*)*.*

The equivalence of the two ways of writing the last definition follows from the relation *W*∗(*a*)ψ<sup>−</sup> = *C*∗(*a*)ψ<sup>−</sup>, cf. Corollary B.101; more generally, ψ is cyclic (separating) for *A* iff it is cyclic (separating) for its strong closure *A*<sup>−</sup>.

For example, if *A* = *B*(*H*), any vector is cyclic for *A*, and none is separating. On the other hand, if *A* = ℂ · 1<sub>*H*</sub>, then no vector is cyclic for *A* and all vectors are separating. If *H* = *L*<sup>2</sup>(*X*,μ) on some finite measure space, then ψ = 1<sub>*X*</sub> is cyclic as well as separating for *A* = *L*<sup>∞</sup>(*X*,μ). Noting (B.346), as well as the property *B*(*H*)′ = ℂ · 1<sub>*H*</sub>, these examples illustrate a general phenomenon:

Lemma B.111. *If* 1<sub>*H*</sub> ∈ *A, a vector* ψ *is cyclic for A iff it is separating for A*′*, and* vice versa*. In particular, if A*′ = *A, then* ψ *is cyclic for A iff it is separating for A. If A is abelian, then every vector that is cyclic for A is also separating for A.*

*Proof.* If *A*ψ<sup>−</sup> = *H* and *b*ψ = 0 for *b* ∈ *A*′, then *ba*ψ = *ab*ψ = 0 for each *a* ∈ *A* and hence *b* vanishes on a dense subspace of *H*. Since *b* is bounded, *b* = 0. Conversely, let *e* be the projection onto *A*ψ<sup>−</sup>; then *e* ∈ *A*′ and hence 1<sub>*H*</sub> − *e* ∈ *A*′. Since 1<sub>*H*</sub> ∈ *A* we have ψ ∈ *A*ψ<sup>−</sup> and hence *e*ψ = ψ, whence (1<sub>*H*</sub> − *e*)ψ = 0. If ψ is separating for *A*′, this implies *e* = 1<sub>*H*</sub> and hence *A*ψ<sup>−</sup> = *H*. Finally, *A* is abelian iff *A* ⊆ *A*′. □

Theorem B.112. *Let a*<sup>∗</sup> = *a* ∈ *B*(*H*)*, and suppose some unit vector* ψ ∈ *H is cyclic for a. Then a is unitarily equivalent to the position operator* (B.332) *on L*<sup>2</sup>(σ(*a*),μ<sub>ψ</sub>)*, where the probability measure* μ<sub>ψ</sub> *on* σ(*a*) *is given by* (B.304)*. Furthermore, through the unitary operator u* : *H* → *L*<sup>2</sup>(σ(*a*),μ<sub>ψ</sub>) *in question we have*

$$
u f(a) u^{-1} = f;\tag{B.347}
$$

$$
u C^\*(a) u^{-1} = C(\sigma(a));\tag{B.348}
$$

$$
u W^\*(a) u^{-1} = L^\infty(\sigma(a), \mu\_\psi), \tag{B.349}
$$

*all of which being realized as multiplication operators on L*<sup>2</sup>(σ(*a*),μ<sub>ψ</sub>)*.*

*Moreover, L*<sup>∞</sup>(σ(*a*),μ<sub>ψ</sub>) *is maximally abelian, and hence satisfies*

$$L^{\infty}(\sigma(a), \mu\_{\psi}) = L^{\infty}(\sigma(a), \mu\_{\psi})'. \tag{B.350}$$

*Proof.* First, define *u* on a dense subspace of *H* by

$$
u : C^\*(a)\psi \to L^2(\sigma(a), \mu\_\psi); \tag{B.351}
$$

$$
u f(a)\psi = f,\ f \in C(\sigma(a)).\tag{B.352}
$$

It follows from (B.289) - (B.291) and (B.304) that ‖*f*(*a*)ψ‖<sub>*H*</sub> = ‖*f*‖<sub>2</sub>, which makes *u* well defined (since *f*(*a*)ψ = *g*(*a*)ψ implies *f* = *g* in *L*<sup>2</sup>(σ(*a*),μ<sub>ψ</sub>)), as well as isometric. In particular, *u* is bounded, and hence it can be extended from *C*∗(*a*)ψ to *H* by continuity. This extension is surjective, since *C*(σ(*a*)) is dense in *L*<sup>2</sup>(σ(*a*),μ<sub>ψ</sub>), and therefore *u* : *H* → *L*<sup>2</sup>(σ(*a*),μ<sub>ψ</sub>) is unitary. Then (B.347) - (B.348) hold by construction; the special case *f* = id yields (B.332). As in Proposition B.106, we obtain *C*(σ(*a*))′ = *L*<sup>∞</sup>(σ(*a*),μ<sub>ψ</sub>), which implies (B.349) - (B.350). □

Note that this theorem implies that *H* is separable. When does a self-adjoint (or normal) operator *a* have a cyclic vector? To practice, we first look at *H* = ℂ<sup>*n*</sup>.

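Proposition B.113. *Let H* = ℂ<sup>*n*</sup> *and let a* = diag(λ<sub>1</sub>,...,λ<sub>*n*</sub>) *be a diagonal matrix. Then the following properties are equivalent:*

1. *The operator a is non-degenerate (i.e., all* λ<sub>*i*</sub> *are distinct);*
2. *The operator a has a cyclic vector;*
3. *C*∗(*a*)′ = *C*∗(*a*)*;*
4. *C*∗(*a*) *is maximal abelian.*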

*Proof.* We first show that all λ<sub>*i*</sub> are distinct iff

$$C^\*(a) = D\_n(\mathbb{C}),\tag{B.353}$$

i.e., the set of all diagonal matrices. To see this, first note that for any *f* : σ(*a*) → ℂ (and any such function is continuous, since σ(*a*) is a finite subset of ℂ) we have

$$f(\text{diag}(\lambda\_1, \dots, \lambda\_n)) = \text{diag}(f(\lambda\_1), \dots, f(\lambda\_n));\tag{B.354}$$

this is true by computation for polynomials in *a*, and these exhaust all functions on σ(*a*). It follows that *C*∗(*a*) ⊆ *D*<sub>*n*</sub>(ℂ). We know from (A.49) that *C*∗(*a*) ≅ *C*(σ(*a*)) whether or not σ(*a*) is non-degenerate, and since dim(*C*(σ(*a*))) = |σ(*a*)| (i.e., the number of elements of σ(*a*)), we obtain

$$\dim(\mathcal{C}^\*(a)) = |\sigma(a)|. \tag{B.355}$$

So if *a* is non-degenerate, noting that dim(*Dn*(C)) = *n* we must have (B.353). If, on the other hand, *a* is degenerate, we have |σ(*a*)| = *m* < *n*, so that also dim(*C*(σ(*a*))) = *m* < *n* and *C*∗(*a*) ⊂ *Dn*(C) is a strict inclusion. Furthermore, by direct computation or as a special case of Proposition B.108, we have

$$D\_n(\mathbb{C})' = D\_n(\mathbb{C}).\tag{B.356}$$

To prove 1 → 2, take the cyclic vector to be

$$\psi = (1, \ldots, 1) / \sqrt{n};\tag{B.357}$$

indeed, any vector (*z*<sub>1</sub>,...,*z*<sub>*n*</sub>) is equal to √*n* · diag(*z*<sub>1</sub>,...,*z*<sub>*n*</sub>)ψ, and we have diag(*z*<sub>1</sub>,...,*z*<sub>*n*</sub>) ∈ *D*<sub>*n*</sub>(ℂ) = *C*∗(*a*) by (B.353). For 2 → 1, if *H* has a cyclic vector ψ for *a*, then by definition *C*∗(*a*)ψ = ℂ<sup>*n*</sup>, so that dim(*C*∗(*a*)ψ) = *n*. But also

$$\dim(C^\*(a)\psi) \le \dim(C^\*(a)),\tag{B.358}$$

whether or not ψ is cyclic for *a*. If ψ is cyclic this gives

$$n \le \dim(\mathcal{C}^\*(a)) \le n \tag{B.359}$$

by (B.355), so that dim(*C*∗(*a*)) = *n*, whence |σ(*a*)| = *n* by (B.355).

Given this, the implication 1 → 3 follows from (B.356), whilst 3 → 4 follows from Theorem A.21. Finally, we prove 4 → 1: we already know that *C*∗(*a*) ⊆ *D*<sub>*n*</sub>(ℂ), and by (B.356) and the above argument it follows that *D*<sub>*n*</sub>(ℂ) is maximal. So if *C*∗(*a*) is maximal, then *C*∗(*a*) = *D*<sub>*n*</sub>(ℂ), and we already know from the first stage of the proof that this is equivalent to *a* being non-degenerate. □
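The equivalence between non-degeneracy and the existence of a cyclic vector can be explored numerically. The sketch below is an illustration only, with hypothetical function names; it assumes a diagonal matrix with rational eigenvalues and uses the fact that, in finite dimension, *C*∗(*a*)ψ is spanned by ψ, *a*ψ, ..., *a*<sup>*n*−1</sup>ψ for self-adjoint *a*. It computes the dimension of that span exactly with rational arithmetic.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix with rational entries, by Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        # Find a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Eliminate column c from all rows below the pivot row.
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r


def cyclic_subspace_dim(eigenvalues, psi):
    """Dimension of span{psi, a psi, ..., a^(n-1) psi} for a = diag(eigenvalues)."""
    n = len(eigenvalues)
    # The k-th vector is a^k psi, whose i-th entry is lambda_i^k * psi_i.
    vectors = [[(lam ** k) * z for lam, z in zip(eigenvalues, psi)]
               for k in range(n)]
    return rank(vectors)
```

For instance, `cyclic_subspace_dim([1, 2, 3], [1, 1, 1])` gives the full dimension 3, in line with the choice (B.357) up to normalization, whereas a repeated eigenvalue as in `[1, 1, 2]` caps the dimension at 2 for every ψ. Note that even for non-degenerate *a* not every vector is cyclic: a vanishing component of ψ lowers the dimension.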

With slightly more effort, an analogous result holds for general Hilbert spaces.

Proposition B.114. *A self-adjoint operator a on a separable Hilbert space H has a cyclic vector iff W*∗(*a*) *is maximal abelian (i.e., W*∗(*a*)′ = *W*∗(*a*)*).*

In other words, *a* has a cyclic vector iff *C*∗(*a*)′ = *C*∗(*a*)′′, cf. (B.320). As we have just seen, if dim(*H*) < ∞, this is the case iff *a* is non-degenerate. Consistent with (B.349) (with *u* = 1) and (B.350), the position operator (B.332) acting on the Hilbert space *L*<sup>2</sup>(σ(*a*),μ<sub>ψ</sub>) is maximal in this sense, with ψ = 1<sub>σ(*a*)</sub> as a cyclic unit vector.

*Proof.* If ψ is cyclic for *a*, then (B.349) and (B.350) (along with the self-evident property *uA*′*u*<sup>−1</sup> = (*uAu*<sup>−1</sup>)′) yield *W*∗(*a*)′ = *W*∗(*a*). Conversely, for any ∗-algebra *A* ⊂ *B*(*H*), one can find unit vectors (ψ<sub>*i*</sub>) such that *H* = ⊕<sub>*i*</sub>*H*<sub>*i*</sub> with *H*<sub>*i*</sub> = *A*ψ<sub>*i*</sub><sup>−</sup>: start with any ψ<sub>1</sub>, then take any ψ<sub>2</sub> ∈ (*A*ψ<sub>1</sub><sup>−</sup>)<sup>⊥</sup> (in case this is nonzero, otherwise one was already done), etc. To show that this procedure terminates, Zorn's Lemma must be invoked (take the collection of all sets (*H*<sub>*i*</sub>) of mutually orthogonal *A*-stable subspaces *H*<sub>*i*</sub> ⊂ *H* that contain a cyclic vector for *A*). Then ψ = ∑<sub>*n*</sub> 2<sup>−*n*</sup>ψ<sub>*n*</sub> is clearly separating for *A*. If *A*′ = *A*, then ψ is also cyclic for *A*; cf. Lemma B.111. □

Thus we call a self-adjoint operator *a* ∈ *B*(*H*) *maximal* if it has a cyclic vector.

Corollary B.115. *A maximal self-adjoint operator a* ∈ *B*(*H*) *is unitarily equivalent to the position operator* (B.332) *on L*2(σ(*a*),μ)*, where* μ *is an appropriate probability measure on the spectrum* σ(*a*) ⊂ R*. Moreover, the map* B(σ(*a*)) →*W*∗(*a*) *in* (B.321) *induces an isomorphism* (B.331) *of von Neumann algebras.*

*Proof.* Take μ = μψ, cf. (B.304), where ψ is cyclic (or, equivalently, separating) for *a*. The map *f* → *f*(*a*) from B(σ(*a*)) to *W*∗(*a*) described in Theorem B.102 can be propelled further by conjugation with the unitary *u* of Theorem B.112, that is,

$$f \mapsto f(a) \mapsto u f(a) u^{-1} = m\_f;\tag{B.360}$$

$$\mathcal{B}(\sigma(a)) \to B(H) \to B(L^2(\sigma(a), \mu\_{\psi})),\tag{B.361}$$

where the final equality in (B.360) follows from the computation

$$
u f(a) u^{-1} g = u f(a) g(a) \psi = u (f\cdot g)(a) \psi = f g = m\_f g,\tag{B.362}
$$

where for simplicity *g* ∈ *C*(σ(*a*)) ⊂ *L*<sup>2</sup>(σ(*a*),μ<sub>ψ</sub>), the inclusion being dense. The claim then immediately follows from (B.349). □

If *a* is not maximal, we can still prove a weaker version of Theorem B.112, which is sometimes seen as the ultimate version of the spectral theorem. To justify this view, take *H* = ℂ<sup>*n*</sup> and let *a* ∈ *M*<sub>*n*</sub>(ℂ) be self-adjoint (or, more generally, normal). By Theorem A.10, *H* has a basis (υ<sub>*i*</sub>) of eigenvectors of *a*, with *a*υ<sub>*i*</sub> = λ<sub>*i*</sub>υ<sub>*i*</sub>. This yields a unitary map *u* : *H* → ℓ<sup>2</sup>(<u>*n*</u>), where <u>*n*</u> = {1,2,...,*n*}, defined by *u*υ<sub>*i*</sub> = δ<sub>*i*</sub> (where δ<sub>*i*</sub>(*j*) = δ<sub>*ij*</sub>, as usual). It is easy to check that *uau*<sup>−1</sup> = *m*<sub>λ</sub>, where λ : <u>*n*</u> → ℂ is defined by λ(*i*) = λ<sub>*i*</sub>, and *m*<sub>λ</sub>ψ = λψ, again as usual. In other words, *a* is unitarily equivalent to a multiplication operator (whose precise nature is left unspecified). Conversely, each multiplication operator *m*<sub>*f*</sub> on some *L*<sup>2</sup>(*X*,μ) is normal, and is self-adjoint if the function *f* ∈ *L*<sup>∞</sup>(*X*,μ) is real-valued (μ-almost everywhere).

Theorem B.116. *Any bounded self-adjoint (more generally, normal) operator on a separable Hilbert space is unitarily equivalent to a multiplication operator.*

*Proof.* As in the proof of Proposition B.114, decompose *H* = ⊕<sub>*i*∈*I*</sub>*H*<sub>*i*</sub>, where each *H*<sub>*i*</sub> contains some cyclic (and hence, by Lemma B.111, separating) vector ψ<sub>*i*</sub> for *a*. Applying the proof of Theorem B.112 to each *H*<sub>*i*</sub> then yields unitary isomorphisms *H*<sub>*i*</sub> ≅ *L*<sup>2</sup>(σ(*a*),μ<sub>*i*</sub>), with μ<sub>*i*</sub> ≡ μ<sub>ψ<sub>*i*</sub></sub>, from which, taking direct sums, we obtain a further unitary isomorphism

$$H \cong \bigoplus\_{i \in I} L^2(\sigma(a), \mu\_i). \tag{B.363}$$

Now take the disjoint union *X* ≡ ⊔<sub>*i*∈*I*</sub>σ(*a*), i.e., *X* = ∪<sub>*i*∈*I*</sub>*X*<sub>*i*</sub>, where *X*<sub>*i*</sub> = σ(*a*) × {*i*}, endowed with the σ-finite measure μ = ∑<sub>*i*</sub> μ<sub>*i*</sub> (so that if *A* ⊂ *X* is given by *A* = ∪<sub>*i*</sub>*A*<sub>*i*</sub> with *A*<sub>*i*</sub> ⊂ *X*<sub>*i*</sub>, we have μ(*A*) = ∑<sub>*i*</sub> μ<sub>*i*</sub>(*A*<sub>*i*</sub>)). This gives a second isomorphism

$$\bigoplus\_{i} L^{2}(\sigma(a), \mu\_{i}) \cong L^{2}(X, \mu), \tag{B.364}$$

defined by mapping ϕ<sub>*j*</sub> ∈ *L*<sup>2</sup>(σ(*a*),μ<sub>*j*</sub>) to the same function on *X*<sub>*j*</sub>, extended to *X* by putting it zero on all other *X*<sub>*i*</sub>, *i* ≠ *j*. This map is obviously unitary. By Theorem B.112, the isomorphism (B.363) maps the operator *a* to a direct sum ⊕<sub>*i*</sub>*m*<sub>id<sub>σ(*a*)</sub></sub> of multiplication operators, upon which the second isomorphism (B.364) maps this direct sum to a (single) multiplication operator *m*<sub>*q*</sub>, where the function *q* : *X* → ℂ is defined by *q*(*x*,*i*) = *x* (in which (*x*,*i*) ∈ *X*<sub>*i*</sub> ⊂ *X*, so that *x* ∈ σ(*a*) ⊂ ℂ). □

More generally, the operator *f*(*a*) on *H*, for some *f* ∈ B(σ(*a*)), is first mapped to ⊕<sub>*i*</sub>*m*<sub>*f*<sub>*i*</sub></sub>, where *f*<sub>*i*</sub> is the image of *f* in *L*<sup>∞</sup>(σ(*a*),μ<sub>*i*</sub>) in the obvious way, which in turn is mapped to a multiplication operator *m*<sub>*f̂*</sub>, where *f̂*(*x*,*i*) = *f*(*x*), analogously to the position operator *q* (the case *f* = id<sub>σ(*a*)</sub>) above. This leads to an isomorphism *W*∗(*a*) ≅ *L*<sup>∞</sup>(*X*,μ), which, by the same reasoning as in the proof of Corollary B.115, also induces an isomorphism (B.331) of von Neumann algebras. See also Theorem C.140.

Finally, Proposition B.114 may be generalized, to which end (and also as a result of independent interest) we extend Corollary A.20 to the infinite-dimensional case:

Theorem B.117. *Let H be separable and let A* ⊂ *B*(*H*) *be an abelian von Neumann algebra. Then A* = *W*∗(*a*) *for some self-adjoint a* ∈ *B*(*H*)*, i.e., A is singly generated.*

*Proof.* Let P(*A*) be the set of all projections in *A*, and let ψ ∈ *H* be separating for *A* and hence cyclic for *A*′ (cf. Lemma B.111 and the proof of Proposition B.114). The ensuing subset P(*A*)ψ = {*e*ψ | *e* ∈ P(*A*)} may be uncountable, but since any subspace of a separable metric space is separable, there is a countable subset P<sub>ℕ</sub>(*A*) = {*e*<sub>*n*</sub>, *n* ∈ ℕ} of P(*A*) such that P<sub>ℕ</sub>(*A*)ψ is dense in P(*A*)ψ, i.e., for any *e* ∈ P(*A*) there is a subsequence *e*<sub>*n*<sub>*k*</sub></sub> in P<sub>ℕ</sub>(*A*) such that lim<sub>*k*→∞</sub> *e*<sub>*n*<sub>*k*</sub></sub>ψ = *e*ψ. But since P(*A*) ⊂ *A* ⊆ *A*′′ and *A*′ψ<sup>−</sup> = *H*, this is true not only on ψ but on a dense set of vectors *a*ψ, *a* ∈ *A*′, so that *e*<sub>*n*<sub>*k*</sub></sub> → *e* in the strong operator topology. Thus P<sub>ℕ</sub>(*A*) is strongly dense in P(*A*), and by (B.329) and Theorem B.100 we have

$$
\mathcal{P}\_{\mathbb{N}}(A)'' = A.\tag{B.365}
$$

The self-adjoint operator that does the job is now given by von Neumann's formula


$$a = \sum\_{n} 3^{-n} (2e\_n - 1\_H). \tag{B.366}$$

To see this, let *C*∗(*e*<sub>*n*</sub>, *n* ∈ ℕ) ≡ *C*∗(*e*<sub>*n*</sub>)<sub>*n*</sub> be the C\*-algebra generated by the projections *e*<sub>*n*</sub>, so that by construction

$$\mathcal{P}\_{\mathbb{N}}(A)'' = C^\*(e\_n)\_n''. \tag{B.367}$$

We will show that

$$C^\*(a) = C^\*(e\_n)\_n,\tag{B.368}$$

which combined with (B.320), (B.365) and (B.367) yields the desired conclusion:

$$A = \mathcal{P}\_{\mathbb{N}}(A)'' = C^\*(e\_n)\_n'' = C^\*(a)'' = W^\*(a). \tag{B.369}$$

The simplest argument for (B.368) uses the Gelfand isomorphism

$$\mathcal{C}^\*(e\_n)\_n \cong \mathcal{C}(X) \tag{\mathbb{B}.370}$$

as commutative C\*-algebras, cf. Theorem C.8, where the set of characters

$$X = \{ x : C^\*(e\_n)\_n \to \mathbb{C} \mid x(bc) = x(b)x(c), x(1\_H) = 1 \} \tag{B.371}$$

of *C*∗(*e*<sub>*n*</sub>)<sub>*n*</sub> is equipped with the weakest topology that makes all maps

$$
\hat{b} \;:\; X \to \mathbb{C}; \tag{B.372}
$$

$$
\hat{b}(x) = x(b), \ b \in C^\*(e\_n)\_n,\tag{B.373}
$$

continuous. This makes *X* a compact Hausdorff space, and the isomorphism (B.370) is given by the Gelfand transform *b* ↦ *b̂*. Defining *s*<sub>*n*</sub> ≡ 2*e*<sub>*n*</sub> − 1<sub>*H*</sub>, we have ‖*s*<sub>*n*</sub>‖ = 1, since *s*<sub>*n*</sub>ψ = ψ if ψ ∈ *e*<sub>*n*</sub>*H* and *s*<sub>*n*</sub>ψ = −ψ if ψ ∈ (1<sub>*H*</sub> − *e*<sub>*n*</sub>)*H* = (*e*<sub>*n*</sub>*H*)<sup>⊥</sup>. The series (B.366) therefore converges absolutely in *B*(*H*), and hence converges, to some limit *a* ∈ *C*∗(*e*<sub>*n*</sub>)<sub>*n*</sub>. We claim that its Gelfand transform *â* ∈ *C*(*X*) separates points of *X*, so that by the Stone-Weierstrass Theorem B.51, the ∗-algebra it generates is dense in *C*(*X*) (in its canonical sup-norm). Thus *a* likewise generates *C*∗(*e*<sub>*n*</sub>)<sub>*n*</sub>, and the proof of Theorem B.117 is complete up to the proof of the above claim, which we now give.

First, note that by definition *C*∗(*e*<sub>*n*</sub>)<sub>*n*</sub> is generated by the projections *e*<sub>*n*</sub>, so that by (B.371) (and the automatic continuity this implies, i.e., *x* ∈ *C*∗(*e*<sub>*n*</sub>)<sub>*n*</sub><sup>∗</sup>), each *x* ∈ *X* is determined by its values on all *e*<sub>*n*</sub>. Therefore, for each pair *x*<sub>*i*</sub>, *x*<sub>*j*</sub> ∈ *X*, *i* ≠ *j*, there must be some *n* ∈ ℕ for which *x*<sub>*i*</sub>(*e*<sub>*n*</sub>) ≠ *x*<sub>*j*</sub>(*e*<sub>*n*</sub>). Consequently, for each *i* ≠ *j*, the set *N*<sub>*ij*</sub> = {*n* ∈ ℕ | *x*<sub>*i*</sub>(*e*<sub>*n*</sub>) ≠ *x*<sub>*j*</sub>(*e*<sub>*n*</sub>)} is not empty; let *n*<sub>*ij*</sub> = min *N*<sub>*ij*</sub>. Since for any projection *e* the corresponding function *ê* can only take the values 0 or 1, each *ŝ*<sub>*n*</sub> must take the values ±1, so that, with *â* = ∑<sub>*n*</sub> 3<sup>−*n*</sup>*ŝ*<sub>*n*</sub>, we have

$$\tfrac{1}{2}(\hat{a}(x\_{i}) - \hat{a}(x\_{j})) = \pm 3^{-n\_{ij}} + \sum\_{n \in N\_{ij}, n > n\_{ij}} \pm 3^{-n} \neq 0,\tag{B.374}$$

since whatever the signs, the sum is always smaller in absolute value than the first term (it is bounded by ∑<sub>*n*>*n*<sub>*ij*</sub></sub> 3<sup>−*n*</sup> = 3<sup>−*n*<sub>*ij*</sub></sup>/2). □
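The separation argument behind (B.374) can be tried out on finite truncations. The following Python sketch is illustrative only: it assumes that a character is represented by a finite tuple of signs *s*<sub>*n*</sub> = ±1 (the actual character space *X* may be infinite), and its function names are hypothetical. It checks with exact rational arithmetic that distinct sign tuples give distinct values of ∑<sub>*n*</sub> 3<sup>−*n*</sup>*s*<sub>*n*</sub>, and that the tail after position *m* stays below 3<sup>−*m*</sup>.

```python
from fractions import Fraction
from itertools import product


def a_hat(signs):
    """Value of sum_{n>=1} 3^(-n) * s_n for a finite tuple of signs s_n = +1 or -1."""
    return sum(Fraction(s, 3 ** n) for n, s in enumerate(signs, start=1))


def separates_all(length):
    """True iff distinct sign tuples of the given length yield distinct sums."""
    values = [a_hat(s) for s in product((1, -1), repeat=length)]
    return len(set(values)) == len(values)


def tail_bound(m, length):
    """Largest possible |sum_{m < n <= length} 3^(-n) * s_n|, attained when all signs agree."""
    return sum(Fraction(1, 3 ** n) for n in range(m + 1, length + 1))
```

The choice of base 3 matters for the full infinite series: the tail after the first differing index is bounded by half of the leading term, so the leading term can never be cancelled. With base 2 the tail bound would equal the leading term, and two different infinite sign sequences could give the same sum.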

#### B.17 Classification of maximal abelian ∗-algebras in *B*(*H*)

We now prove the following classification of maximal abelian ∗-algebras in *B*(*H*), which forms the basis of the Kadison–Singer Conjecture discussed in §2.6 and §4.3.

Theorem B.118. *If H is separable (and infinite-dimensional), and A* ⊂ *B*(*H*) *is a maximal abelian* ∗*-algebra, then A is unitarily equivalent to one of the following:*

1. *L*<sup>∞</sup>(0,1) ⊂ *B*(*L*<sup>2</sup>(0,1)) *(realized as multiplication operators);*
2. ℓ<sup>∞</sup>(ℕ) ⊂ *B*(ℓ<sup>2</sup>(ℕ)) *(*idem*);*
3. *L*<sup>∞</sup>(0,1) ⊕ ℓ<sup>∞</sup>(ℕ) ⊂ *B*(*L*<sup>2</sup>(0,1) ⊕ ℓ<sup>2</sup>(ℕ)) *(*idem*);*
4. *L*<sup>∞</sup>(0,1) ⊕ *D*<sub>*n*</sub>(ℂ) ⊂ *B*(*L*<sup>2</sup>(0,1) ⊕ ℂ<sup>*n*</sup>)*, for some n* ∈ ℕ *(*idem*),*

*and these possibilities are (mutually) unitarily inequivalent.*

The first claim means that there is a unitary operator *u* from *H* to, say, *L*<sup>2</sup>(0,1), such that the map *a* ↦ *uau*<sup>−1</sup> from *B*(*H*) to *B*(*L*<sup>2</sup>(0,1)) restricts to *uAu*<sup>−1</sup> = *L*<sup>∞</sup>(0,1), so that *A* ≅ *L*<sup>∞</sup>(0,1) as both C\*-algebras and von Neumann algebras (and likewise for the other possibilities). The last claim, then, means that there is *no* unitary map from, say, *L*<sup>2</sup>(0,1) to ℓ<sup>2</sup>(ℕ) that similarly induces an isomorphism *L*<sup>∞</sup>(0,1) ≅ ℓ<sup>∞</sup>(ℕ).

*Proof.* We begin with the easy part, which is the last clause. The key notion to proving the claimed inequivalence is that of an *atomic projection* in a von Neumann algebra *M* ⊂ *B*(*H*). If we partially order projections on *H* by (cf. Theorem 2.50 and §C.21)

$$e \le f \text{ iff } eH \subseteq fH,\tag{B.375}$$

we say that *f* is atomic if *f* ≠ 0, and 0 ≤ *e* ≤ *f* implies either *e* = 0 or *e* = *f*. This property is preserved under unitary equivalence: if *M* ⊂ *B*(*H*) and *N* ⊂ *B*(*H*′) and *N* = *uMu*<sup>−1</sup> for some unitary *u* : *H* → *H*′ (again in the sense that *a* ↦ *uau*<sup>−1</sup> is an isomorphism *M* → *N*), then *f* is atomic in *M* iff *ufu*<sup>−1</sup> is atomic in *N*. The reason is that *a* ↦ *uau*<sup>−1</sup> induces an isomorphism of the pertinent posets of projections in *M* and *N*, so that all order-theoretical notions are preserved under unitary equivalence.

In the case at hand, the projections are easy to classify:


Any unitary equivalence between two of the entries in the list would have to preserve this fine structure of projections, and hence cannot exist.

We now prove that the list in Theorem B.118 is exhaustive. According to Theorem B.117, we only need to look at abelian von Neumann algebras *A* = *W*∗(*a*), where *a* is maximal. According to Theorem B.112 and its Corollary B.115 (whilst noting that some unitary equivalence *a* ≅ *b* induces a unitary equivalence *W*∗(*a*) ≅ *W*∗(*b*)), we may further restrict our attention to the case where *a* is the position operator on *L*<sup>2</sup>(*K*,μ), where *K* = σ(*a*) ⊂ ℝ is compact and μ is a regular probability measure (here and in what follows, this is always meant with respect to the Borel structure inherited from ℝ ⊃ *K*), with support equal to *K*, and hence

$$W^\*(a) = L^\infty(K, \mu) \subset B(L^2(K, \mu)). \tag{B.376}$$

The final step is to further reduce the possibilities by exploiting equivalences.

Definition B.119. *Two measure spaces* (*X*,Σ,μ) *and* (*X*′,Σ′,μ′) *are:*

• equivalent *if there is a measurable bijection* ϕ : *X* → *X*′ *with measurable inverse, and the measures* ϕ<sub>∗</sub>μ *and* μ′ *on X*′ *are equivalent in the sense that* ϕ<sub>∗</sub>μ(*A*′) = 0 *iff* μ′(*A*′) = 0 *for each A*′ ∈ Σ′*. Here* ϕ<sub>∗</sub>μ *is the measure on* (*X*′,Σ′) *defined by*

$$
\varphi\_\* \mu(A') = \mu\left(\varphi^{-1}(A')\right) \ (A' \in \Sigma'). \tag{B.377}
$$

• isomorphic *if there is a measurable bijection* ϕ : *X* → *X*′ *with measurable inverse, and* ϕ<sub>∗</sub>μ(*A*′) = μ′(*A*′) *for each A*′ ∈ Σ′*.*

The ambiguity of the notation ϕ<sup>−1</sup> in (B.377) is innocent: for general measurable maps ϕ : *X* → *X*′ the set ϕ<sup>−1</sup>(*A*′) can only denote the pre-image {*x* ∈ *X* | ϕ(*x*) ∈ *A*′}, whereas for invertible maps one might construe ϕ<sup>−1</sup>(*A*′) as {ϕ<sup>−1</sup>(*x*′) | *x*′ ∈ *A*′}, where ϕ<sup>−1</sup> is the set-theoretic inverse of ϕ. Of course, these sets duly coincide.

Lemma B.120. *Let K and K*′ *be compact subsets of* ℝ*, with* Σ *and* Σ′ *the Borel structures inherited from* ℝ ⊃ *K and* ℝ ⊃ *K*′*, respectively (often omitted in what follows). Let* μ *and* μ′ *be probability measures on K and K*′*, respectively, and suppose that the associated measure spaces* (*K*,Σ,μ) *and* (*K*′,Σ′,μ′) *are isomorphic.*

*Then there exists a unitary operator*

$$
u : L^2(K, \mu) \to L^2(K', \mu')
$$

*such that*

$$
u L^{\infty}(K,\mu) u^{-1} = L^{\infty}(K',\mu'). \tag{B.378}
$$

Note that *u* does *not* intertwine the position operators (B.332) on *L*<sup>2</sup>(*K*,μ) and *L*<sup>2</sup>(*K*′,μ′). These operators have already done their job in reducing the situation to *L*<sup>2</sup>(*K*,μ), and from that point onwards (B.378) is exactly what we need.

*Proof.* All maps appearing below are assumed Borel. The change-of-variables formula for a general (i.e., not necessarily invertible) map ϕ : *K* → *K*′ reads


$$\int\_{K'} d(\varphi\_\*\mu) \, g = \int\_K d\mu \, g \circ \varphi, \tag{B.379}$$

where *g* : *K*′ → ℂ. Under the assumption that ϕ *is* invertible, this can be rewritten as

$$\int\_{K'} d(\varphi\_\*\mu) \, f \circ \varphi^{-1} = \int\_K d\mu \, f,\tag{B.380}$$

where *f* : *K* → ℂ. If ϕ is also an isomorphism of measure spaces, this becomes

$$\int\_{K'} d\mu' \, f \circ \varphi^{-1} = \int\_K d\mu \, f. \tag{B.381}$$

If ϕ<sub>∗</sub>μ and μ′ are equivalent and hence mutually absolutely continuous, the Radon–Nikodym derivative *d*(ϕ<sub>∗</sub>μ)/*d*μ′ exists (as does its counterpart *d*(ϕ<sup>−1</sup><sub>∗</sub>μ′)/*d*μ), and using (B.137) and (B.380), one easily verifies that the operator

$$u \,:\, L^2(K, \mu) \to L^2(K', \mu');\,\tag{B.382}$$

$$u\psi = \sqrt{\frac{d(\varphi\_\*\mu)}{d\mu'}}\, \psi \circ \varphi^{-1}, \tag{B.383}$$

is isometric. Moreover, *u* is unitary, because it has an inverse, given by

$$u^{-1}: L^2(K', \mu') \to L^2(K, \mu); \tag{B.384}$$

$$
u^{-1} \chi = \sqrt{\frac{d(\varphi\_\*^{-1} \mu')}{d\mu}}\, \chi \circ \varphi. \tag{B.385}
$$

We give these general expressions for later use; if ϕ<sub>∗</sub>μ = μ′, they simplify to

$$
u \psi = \psi \circ \varphi^{-1} ; \tag{B.386}
$$

$$
u^{-1}\chi = \chi \circ \varphi.\tag{B.387}
$$

For *f* ∈ *L*<sup>∞</sup>(*K*,μ) we then have (cf. Proposition B.73)

$$
u m\_f u^{-1} = m\_{f \circ \varphi^{-1}}.\tag{B.388}$$

We already know that the map *f* ↦ *m*<sub>*f*</sub> injects *L*<sup>∞</sup>(*K*,μ) isometrically into *B*(*L*<sup>2</sup>(*K*,μ)), and analogously for *L*<sup>∞</sup>(*K*′,μ′). Furthermore, the map *f* ↦ *f* ∘ ϕ<sup>−1</sup> gives an isomorphism *L*<sup>∞</sup>(*K*,μ) → *L*<sup>∞</sup>(*K*′,μ′): the property

$$||f \circ \varphi^{-1}||\_{\infty}^{\mathrm{ess}} = ||f||\_{\infty}^{\mathrm{ess}},\tag{B.389}$$

which yields injectivity, may be checked either from (B.240) or from the assumed isomorphism of measures (and hence equivalence of measures, which in fact suffices for this purpose), whereas invertibility of ϕ gives surjectivity (since *g* ∈ *L*<sup>∞</sup>(*K*′,μ′) is the image of *f* = *g* ∘ ϕ ∈ *L*<sup>∞</sup>(*K*,μ)). Eq. (B.378) follows. □

The final step of the proof appeals to a deep and fundamental classification theorem in measure theory, which goes back to Kuratowski in a form that applies to general Polish (i.e., complete separable metric) spaces. This theorem implies:

Lemma B.121. *Let* (*K*,Σ,μ) *be an infinite probability space (in that infinitely many different elements of* Σ *have positive measure), where K* ⊂ ℝ *is compact and* Σ *is the* σ*-algebra inherited from the Borel structure on* ℝ*. Then* (*K*,Σ,μ) *is isomorphic to exactly one of the following possibilities (called* standard measure spaces*):*


*where* μ<sub>*n*</sub> *is an arbitrary strictly nonzero probability measure on the n-point set*

$$\underline{n}' \equiv \{1/n, \dots, (n-1)/n, 1\}.\tag{B.390}$$

Here we have stated the result in terms of *probability* measures μ on *compact* spaces *K* ⊆ [0,1]; this is convenient in the context of our proof. To understand the last two cases, for general measure spaces (*X*,Σ,μ) we say that *A* ∈ Σ is an *atom* if for any *B* ⊂ *A* we have either μ(*B*) = 0 or μ(*A* \ *B*) = 0 (but not both; this implies μ(*A*) > 0, whence an equivalent definition of an atom as a set *A* ∈ Σ having positive measure as well as the property that if some measurable subset *B* ⊂ *A* has measure μ(*B*) < μ(*A*), then μ(*B*) = 0). In our case at hand (*K*,μ), each atom *A* contains a point *x* ∈ *K* such that μ(*A*) = μ({*x*}) and μ(*A* \ {*x*}) = 0, so that modulo null sets we may identify each atom *A* with the measure-carrying point *x* it contains. Moreover, *K* can contain at most a countable set 𝒜 = {*x<sub>n</sub>*}<sub>*n*</sub> of such points *x<sub>n</sub>*. The formulae

$$
\mu = \mu\_a + \mu\_c;\tag{\text{B.391}}
$$

$$
\mu_a(A) = \mu(A \cap \mathcal{A}); \tag{B.392}
$$

$$
\mu_c(A) = \mu(A \backslash (A \cap \mathcal{A})),
\tag{B.393}
$$

then give the canonical decomposition of μ into an *atomic* part μ<sub>*a*</sub> and a *continuous* part μ<sub>*c*</sub>. This, then, is the sense in which the last two cases of Lemma B.121 are meant. Note that characteristic functions 1<sub>*A*</sub> on atoms *A* ⊂ *K* yield atomic projections in *L*<sup>∞</sup>(*K*,μ), linking the two notions of atomicity that play a role in this proof.

The first entry of this lemma yields the first entry in the list in the theorem. To obtain the others, we need a few more unitary equivalences. For the second, define

$$
u \,:\, L^2(\underline{\mathbb{N}}', \mu') \to \ell^2(\mathbb{N}); \tag{B.394}
$$

$$
u \psi(n) = \sqrt{\mu'(n)}\, \psi(2^{-n}),
\tag{B.395}
$$

and *u*ψ(1) irrelevant. This operator is unitary and, just like in (B.378), it intertwines

$$
u L^{\infty}(\underline{\mathbb{N}}', \mu') u^{-1} = \ell^{\infty}(\mathbb{N}). \tag{B.396}
$$

Note that (B.394) is a special case of (B.383). The third and fourth cases require the following construction: if 𝒜 ⊂ *K* is the set of atoms in (*K*,Σ,μ), we decompose

$$K = (K \backslash \mathcal{A}) \bigsqcup \mathcal{A}, \tag{B.397}$$

as a disjoint union. For any measure μ this induces an orthogonal decomposition

$$L^2(K,\mu) = L^2(K \backslash \mathcal{A}, \mu) \oplus L^2(\mathcal{A}, \mu);\tag{B.398}$$

$$L^2(K \backslash \mathcal{A}, \mu) = eL^2(K, \mu);\tag{B.399}$$

$$L^2(\mathcal{A}, \mu) = (1\_{L^2(K, \mu)} - e)L^2(K, \mu), \tag{B.400}$$

where *e* = 1<sub>*K*\𝒜</sub> and 1<sub>*L*<sup>2</sup>(*K*,μ)</sub> − *e* = 1<sub>𝒜</sub> are projections. Using (B.391), this gives

$$L^2(K\backslash\mathcal{A},\mu) = L^2(K,\mu_c); \tag{B.401}$$

$$L^2(\mathcal{A}, \mu) = L^2(\mathcal{A}, \mu_a), \tag{B.402}$$

so that at the end of the day we obtain

$$L^2(K, \mu) = L^2(K, \mu_c) \oplus L^2(\mathcal{A}, \mu_a). \tag{B.403}$$

This in turn induces the decomposition

$$L^{\infty}(K,\mu) = L^{\infty}(K,\mu_c) \oplus L^{\infty}(\mathcal{A},\mu_a); \tag{B.404}$$

$$L^{\infty}(K,\mu_c) = eL^{\infty}(K,\mu) = eL^{\infty}(K,\mu)e; \tag{B.405}$$

$$\begin{split} L^{\infty}(\mathcal{A}, \mu_a) &= (1_{L^{2}(K,\mu)} - e)L^{\infty}(K,\mu) \\ &= (1_{L^{2}(K,\mu)} - e)L^{\infty}(K,\mu)(1_{L^{2}(K,\mu)} - e). \end{split} \tag{B.406}$$

Combined with (B.396), this shows that the third entry of the lemma yields the third entry of the theorem. To obtain the fourth and last, we need the unitary map

$$u \,:\, L^2(\underline{n}', \mu_n) \to \mathbb{C}^n; \tag{B.407}$$

$$
u \psi_m = \sqrt{\mu_n(m/n)}\, \psi(m/n) \ (m = 1, \dots, n), \tag{B.408}
$$

which delivers the unitary equivalence

$$
u L^{\infty}(\underline{n}', \mu_n) u^{-1} = D_n(\mathbb{C}). \tag{B.409}$$

Short of a proof of Lemma B.121, we have (at last!) proved Theorem B.118. □
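As a numerical sanity check of (B.407) to (B.409), the following Python sketch takes *n* = 3 with an arbitrarily chosen strictly positive probability measure and verifies that *u* preserves inner products and turns multiplication by *f* into the diagonal matrix diag(*f*). The specific weights, vectors, and helper names (`make_u`, `l2_inner`, `cn_inner`) are illustrative assumptions, and inner products are taken linear in the second entry.

```python
import math


def make_u(weights):
    """Return u and its inverse, where weights[m-1] = mu_n({m/n}) > 0."""
    roots = [math.sqrt(w) for w in weights]

    def u(psi):
        # psi[m-1] stands for psi(m/n); cf. (B.408)
        return [r * p for r, p in zip(roots, psi)]

    def u_inv(vec):
        return [v / r for r, v in zip(roots, vec)]

    return u, u_inv


def l2_inner(weights, psi, phi):
    """Inner product in L^2(n', mu_n), linear in the second entry."""
    return sum(w * p.conjugate() * q for w, p, q in zip(weights, psi, phi))


def cn_inner(x, y):
    """Standard inner product in C^n, linear in the second entry."""
    return sum(a.conjugate() * b for a, b in zip(x, y))


weights = [0.5, 0.25, 0.25]      # hypothetical measure on {1/3, 2/3, 1}
psi = [1 + 2j, -1j, 3 + 0j]
phi = [0.5 + 0j, 2 - 1j, 1j]
f = [2.0, -1.0, 0.5]             # a function in L^infty(n', mu_n)

u, u_inv = make_u(weights)

# Unitarity: both inner products should agree up to rounding.
lhs = l2_inner(weights, psi, phi)
rhs = cn_inner(u(psi), u(phi))

# Intertwining: u m_f u^{-1} should act on C^n as diag(f).
x = [1 + 1j, 2 + 0j, -3j]
conjugated = u([fm * v for fm, v in zip(f, u_inv(x))])
diagonal = [fm * v for fm, v in zip(f, x)]
```

The square roots of the weights are exactly what compensates for the measure in the *L*<sup>2</sup> inner product, which is why the measure is required to be strictly nonzero.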

Thus one of the remarkable novelties of infinite-dimensional Hilbert space is that even in the separable case, uniqueness of maximal abelian ∗-algebras is lost.

There is a different proof of Theorem B.118 that does not rely on Kuratowski's Lemma B.121, but instead is based on properties of the projection lattice P(*A*) in *A*. In the following outline of this proof, *A* is a maximal abelian ∗-subalgebra of *B*(*H*), where *H* is a separable Hilbert space. Hence *A* is a von Neumann algebra, which is generated by its projections. This leaves three mutually exclusive possibilities:


The following lemma, whose proof we merely sketch, replaces Lemma B.121.

Lemma B.122. *If H is separable and A* ⊂ *B*(*H*)*, then* P(*A*) *contains a maximal totally ordered set* T (*A*) *that generates A (as a von Neumann algebra).*

*Proof.* This is proved in two steps. First, P(*A*) contains a *countable* subset P<sub>*c*</sub>(*A*) that generates *A*. Indeed, according to Lemma B.111 and Proposition B.114 (and maximality of *A*), *H* contains a unit vector ψ that is both cyclic and separating for *A*. Since *H* is separable, P(*A*)ψ ⊂ *H* has a countable dense subset, which is of the form P<sub>*c*</sub>(*A*)ψ for some countable subset P<sub>*c*</sub>(*A*) ⊂ P(*A*).

The second step is trickier, namely to construct a maximal totally ordered set T(*A*) from P<sub>*c*</sub>(*A*). This is done inductively. We number P<sub>*c*</sub>(*A*) = {*e*<sub>1</sub>, *e*<sub>2</sub>,...}. Starting from P<sub>1</sub> = {0<sub>*H*</sub>, *e*<sub>1</sub>, 1<sub>*H*</sub>}, we now construct finite totally ordered sets P<sub>*n*</sub> of projections such that P<sub>*n*</sub> ⊂ P<sub>*n*+1</sub> and *e<sub>n</sub>* lies in the linear span of P<sub>*n*</sub>. Let

$$\mathcal{P}_n = \{e'_0 = 0_H, e'_1, \dots, e'_{r_n - 1}, e'_{r_n} = 1_H\}, \tag{B.410}$$

where *e*′<sub>1</sub> < ··· < *e*′<sub>*r<sub>n</sub>*</sub> (where *e* < *f* means *e* ≤ *f* and *e* ≠ *f*), and define

$$
\mathcal{P}_{n+1} = \mathcal{P}_n \cup \{e_i^\prime + (e_{i+1}^\prime - e_i^\prime)e_{n+1}, i = 0, \dots, r_n - 1\}. \tag{B.411}
$$

Given the total ordering in P<sub>*n*</sub>, it is easy to see that each *e*′<sub>*i*</sub> + (*e*′<sub>*i*+1</sub> − *e*′<sub>*i*</sub>)*e*<sub>*n*+1</sub> is indeed a projection, and, by the same token, that P<sub>*n*+1</sub> meets its specification. Let

$$
\mathcal{P}_{\infty} = \cup_n \mathcal{P}_n, \tag{B.412}
$$

which remains totally ordered but typically is infinite, and take the poset 𝔓 of all totally ordered subsets of P(*A*) that contain P<sub>∞</sub>, ordered by inclusion. Zorn's Lemma then yields a maximal element of 𝔓, and this is our T(*A*): this maximal element is itself totally ordered, and since its linear span contains each projection *e<sub>n</sub>* ∈ P<sub>*c*</sub>(*A*), the projections in T(*A*) generate *A* (since the *e<sub>n</sub>* already do so). □
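The inductive step (B.411) can be made concrete in the simplest commutative setting, namely the diagonal matrices on C<sup>*N*</sup>, whose projections correspond to subsets of {0,...,*N*−1}: the order relation becomes inclusion, the product becomes intersection, and for comparable projections the expression in (B.411) becomes a union of the lower set with part of the difference. The following Python sketch works only in this finite toy model, with hypothetical generators and helper names, and checks that the resulting chain is totally ordered and that each generator lies in its linear span.

```python
def refine(chain, e_new):
    """One step of (B.411) for subsets: chain is a strictly increasing list of
    frozensets from the empty set (0_H) to the full set (1_H)."""
    extra = [lo | ((hi - lo) & e_new) for lo, hi in zip(chain, chain[1:])]
    # Distinct comparable sets have distinct sizes, so sorting by size
    # recovers the order by inclusion.
    return sorted(set(chain) | set(extra), key=len)


def is_totally_ordered(chain):
    """Consecutive inclusions suffice, by transitivity."""
    return all(a <= b for a, b in zip(chain, chain[1:]))


def in_span(chain, e):
    """e lies in the linear span of the chain iff its indicator function is
    constant on each block hi - lo between consecutive chain elements."""
    blocks = [hi - lo for lo, hi in zip(chain, chain[1:])]
    return all(block <= e or not (block & e) for block in blocks)


full = frozenset(range(6))                       # plays the role of 1_H
generators = [frozenset({0, 1, 2}),              # hypothetical e_1, e_2, e_3
              frozenset({1, 3}),
              frozenset({2, 3, 4})]

chain = [frozenset(), generators[0], full]       # P_1 = {0_H, e_1, 1_H}
for e in generators[1:]:
    chain = refine(chain, e)                     # P_2, then P_3

for member in chain:
    print(sorted(member))
```

Each refinement only inserts new elements between neighbouring members of the old chain, which is why total ordering is preserved at every step.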

The above trichotomy then leaves the following possibilities:

1. Let ψ ∈ *H* be a unit vector that is cyclic and separating for *A*. Then

$$\alpha \colon \mathcal{P}(A) \to [0, 1]; \tag{B.413}$$

$$e \mapsto \langle \psi, e\psi \rangle, \tag{B.414}$$

is an isomorphism of posets. It is easy to show that the linear span of the set of all vectors α<sup>−1</sup>(*t*)ψ, *t* ∈ [0,1], is dense in *H*, and that the map

$$
u \alpha^{-1}(t)\psi = 1_{(0,t)} \tag{B.415}
$$

extends (by linearity and continuity) to a unitary isomorphism

$$
u: H \to L^2(0, 1), \tag{B.416}$$

which intertwines *A* with *L*∞(0,1) in the sense that

$$
uAu^{-1} = L^{\infty}(0, 1). \tag{B.417}$$

	- Each minimal projection *e<sub>i</sub>* in P(*A*) is one-dimensional.
	- Different minimal projections are orthogonal.
	- 1<sub>*H*</sub> = ∑<sub>*i*</sub> *e<sub>i</sub>* (strongly), where the sum is over all minimal projections in *A*.

Since *H* is separable, we may assume *i* ∈ N, so that we obtain a countable basis (υ<sub>*i*</sub>) of *H* in which *e<sub>i</sub>* = |υ<sub>*i*</sub>⟩⟨υ<sub>*i*</sub>|, and hence have a unitary isomorphism

$$
u: H \to \ell^2(\mathbb{N}); \tag{B.418}
$$

$$
\upsilon_i \mapsto \delta_i, \tag{B.419}
$$

i.e., *u* is defined by linear and continuous extension of (B.419). Clearly,

$$
uAu^{-1} = \ell^{\infty}(\mathbb{N}). \tag{B.420}$$

3. The first part of the analysis in the previous item still applies, but this time, the sum *e* = ∑<sub>*i*</sub> *e<sub>i</sub>* over all minimal projections in *A* is not equal to 1<sub>*H*</sub>. If there are *n* ∈ N such projections, we obtain

$$eH \cong \mathbb{C}^n, \tag{B.421}$$

and otherwise

$$
eH \cong \ell^2(\mathbb{N}). \tag{B.422}$$

We combine these in the notation

$$eH \cong \ell^2(\kappa), \tag{B.423}$$

where κ = *n*, in which case ℓ<sup>2</sup>(κ) = C<sup>*n*</sup> and ℓ<sup>∞</sup>(κ) = *D<sub>n</sub>*(C), or κ = N. Furthermore, we have

$$(1_H - e)H \cong L^2(0, 1), \tag{B.424}$$

as in the first item. By construction, the corresponding unitary

$$
u: H \to \ell^2(\kappa) \oplus L^2(0, 1) \tag{B.425}
$$

then satisfies

$$
uAu^{-1} = \ell^{\infty}(\kappa) \oplus L^{\infty}(0, 1). \tag{B.426}
$$

This finishes the alternative proof (sketch) of Theorem B.118.

#### B.18 Compact operators

The spectral theorem (in whatever version) on infinite-dimensional Hilbert spaces considerably simplifies for a class of well-behaved operators called *compact*.

Definition B.123. *A linear map a* : *V* → *W between Banach spaces V*,*W is called* compact *if for some (and hence all) d* > 0 *the image a*(*V*<sub>≤*d*</sub>) *of the closed d-ball*

$$V_{\leq d} = \{ v \in V : \|v\| \leq d \} \tag{B.427}$$

*is pre-compact in W (i.e., its closure a*(*V*<sub>≤*d*</sub>)<sup>−</sup> *is compact), or, equivalently, if the image* (*av<sub>n</sub>*) *of any bounded sequence* (*v<sub>n</sub>*) *in V has a convergent subsequence.*

Before turning to Hilbert spaces, we mention two facts of general interest.

Proposition B.124. *A compact operator is bounded.*

*Proof.* If not, then for any *n* ∈ N there is some *v<sub>n</sub>* ∈ *V*<sub>≤1</sub> for which ‖*av<sub>n</sub>*‖ ≥ *n*, so that (*av<sub>n</sub>*) cannot possibly have a convergent subsequence. □

Proposition B.125. *A compact operator a* : *V* → *W maps weakly convergent sequences in V to norm-convergent sequences in W.*

*Proof.* Let (*v<sub>n</sub>*) be a sequence in *V* that weakly converges to *v*. It is easy to show that if *a* : *V* → *W* is (norm) continuous, then it maps weakly convergent sequences in *V* to weakly convergent sequences in *W*. Therefore, the sequence (*av<sub>n</sub>*) weakly converges to *av*. If (*av<sub>n</sub>*) failed to converge to *av* in norm, then it would have a subsequence (*av<sub>n<sub>k</sub></sub>*) such that for some ε > 0 and all sufficiently large *k* one had

$$\|av_{n_k} - av\| \ge \varepsilon. \tag{B.428}$$

However, (*v<sub>n</sub>*), being weakly convergent, is bounded by Lemma B.126 below, and hence also its subsequence (*v<sub>n<sub>k</sub></sub>*) must be bounded. Since *a* is compact, (*av<sub>n<sub>k</sub></sub>*) has some norm-convergent subsequence, which necessarily converges to *av* (since we know this is the weak limit of the ambient sequence (*av<sub>n</sub>*) and hence also of any of its subsequences, and if a norm-limit exists, the corresponding weak limit must be the same). But for large enough *k* this convergence flatly contradicts (B.428). □
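Proposition B.125 can be illustrated numerically with the diagonal operator on ℓ<sup>2</sup>(N) whose eigenvalues are 1/*k*, which is compact. The standard basis vectors converge to zero weakly but not in norm, whereas their images converge to zero in norm. The Python sketch below is a finite truncation under stated assumptions: the dimension `DIM`, the single probe vector used to test weak convergence, and all function names are choices made for this example only.

```python
import math

DIM = 2000  # finite truncation of l^2(N), for illustration only


def basis_vector(n):
    """n-th standard basis vector (1-indexed), truncated to DIM coordinates."""
    return [1.0 if k == n else 0.0 for k in range(1, DIM + 1)]


def apply_a(psi):
    """Diagonal operator with eigenvalues 1/k."""
    return [x / k for k, x in enumerate(psi, start=1)]


def norm(psi):
    return math.sqrt(sum(x * x for x in psi))


def inner(psi, phi):
    """Real inner product, sufficient for the real vectors used here."""
    return sum(x * y for x, y in zip(psi, phi))


# A fixed square-summable vector against which the basis vectors are paired.
probe = [1.0 / k for k in range(1, DIM + 1)]

rows = []
for n in (1, 10, 100, 1000):
    v = basis_vector(n)
    # (n, pairing with probe, norm of v, norm of a v)
    rows.append((n, inner(probe, v), norm(v), norm(apply_a(v))))

for row in rows:
    print(row)
```

In each row the pairing and the norm of the image both equal 1/*n* up to rounding, while the norm of the basis vector itself stays equal to 1.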

Lemma B.126. *A weakly convergent sequence in a Banach space is bounded.*

*Proof.* Since *v<sub>n</sub>* → *v* weakly, the sequence (ϕ(*v<sub>n</sub>*)) in C converges to ϕ(*v*) for each ϕ ∈ *V*<sup>∗</sup>, so that sup<sub>*n*</sub>{|ϕ(*v<sub>n</sub>*)|} < ∞. Using the notation (B.129), this may be rewritten as sup<sub>*n*</sub>{|*v̂<sub>n</sub>*(ϕ)|} < ∞. Using Theorem B.78 (with *V* ⇝ *V*<sup>∗∗</sup>, *W* = C, and *X* = N), this implies sup<sub>*n*</sub>{‖*v̂<sub>n</sub>*‖} < ∞, and hence sup<sub>*n*</sub>{‖*v<sub>n</sub>*‖} < ∞ by Proposition B.44. □

Definition B.123 simplifies if *V* = *W* = *H* is a Hilbert space, since we have:

Proposition B.127. *If the image a*(*H*<sub>≤1</sub>) ⊂ *H of a linear map a* : *H* → *H is precompact, then this image is in fact compact (and hence a is compact).*

For the proof, call a Banach space *V reflexive* if *V*<sup>∗∗</sup> ≅ *V* (i.e. through the canonical injection *v* → *v̂*, cf. Proposition B.44). Hilbert spaces *H* are reflexive, since *H*<sup>∗</sup> ≅ *H* by Theorem B.66. Proposition B.127 then follows from yet another lemma:

Lemma B.128. *If V is a reflexive Banach space and a* : *V* → *W is compact, then a*(*V*<sub>≤1</sub>) *is compact.*

*Proof.* The proof relies on a corollary of the Banach–Alaoglu Theorem B.48, according to which *V*<sub>≤1</sub> is weakly compact if *V* is reflexive (indeed, by applying Banach–Alaoglu to *V*<sup>∗</sup> instead of *V*, it follows that the unit ball in *V*<sup>∗∗</sup> is compact in its weak<sup>∗</sup>-topology; if, in addition, *V* is reflexive, then the inverse of the canonical injection *V* → *V*<sup>∗∗</sup> maps the weak<sup>∗</sup>-topology on *V*<sup>∗∗</sup> to the weak topology on *V*).

So let *a* : *V* → *W* be compact, and let (*w<sub>n</sub>*) be a sequence in *a*(*V*<sub>≤1</sub>), say *w<sub>n</sub>* = *av<sub>n</sub>* for some sequence (*v<sub>n</sub>*) in *V*<sub>≤1</sub>. Then since *V*<sub>≤1</sub> is weakly compact, (*v<sub>n</sub>*) has a weakly convergent subsequence (*v<sub>n<sub>k</sub></sub>*) in *V*<sub>≤1</sub>, say lim<sub>*k*→∞</sub> *v<sub>n<sub>k</sub></sub>* = *v weakly*. By Proposition B.125, lim<sub>*k*→∞</sub> *av<sub>n<sub>k</sub></sub>* = *av in norm*. In other words, (*av<sub>n</sub>*) has a norm-convergent subsequence, namely (*av<sub>n<sub>k</sub></sub>*), with limit in *a*(*V*<sub>≤1</sub>). Hence *a*(*V*<sub>≤1</sub>) is compact. □

In view of Proposition B.127, we may as well take the following starting point:

Definition B.129. *If H is a Hilbert space, a linear map a* : *H* → *H is called* compact *when the image a*(*H*<sub>≤1</sub>) *of the closed unit ball in H is compact.*

We write *B*<sub>0</sub>(*H*) for the set of all compact operators on *H*.

Theorem B.130. *The compact operators B*<sub>0</sub>(*H*) *form a C\*-algebra in B*(*H*) *in the operations inherited from B*(*H*)*. Furthermore, B*<sub>0</sub>(*H*) *is a two-sided ideal in B*(*H*)*.*

Unfolding this theorem, the claim consists of the following parts:


*Proof.* The first clause is Proposition B.124, and the second and sixth (which implies the third) are almost trivial. For the fourth, we use the following criterion for pre-compactness (in a metric space): *K* ⊂ *H* is pre-compact iff for each ε > 0 it can be covered by a *finite* number of open ε-balls *B*<sub>ε</sub>(χ<sub>*i*</sub>) = {ψ ∈ *H* : ‖ψ − χ<sub>*i*</sub>‖ < ε}, where *i* = 1,...,*m* < ∞ (i.e., all balls have the same radius ε). Given that ‖*a<sub>n</sub>* − *a*‖ → 0, for each ε > 0 there is *n* such that ‖*a<sub>n</sub>* − *a*‖ < ε/2. Since *a<sub>n</sub>*(*H*<sub>≤1</sub>) is compact, it has a finite cover with ε/2-balls; in other words, for each ψ ∈ *H*<sub>≤1</sub> there is an *i* such that ‖*a<sub>n</sub>*ψ − χ<sub>*i*</sub>‖ < ε/2. Hence, as ‖ψ‖ ≤ 1, we may estimate

$$\|a\psi - \chi_i\| \le \|(a_n - a)\psi\| + \|a_n \psi - \chi_i\| \le \|a_n - a\| \|\psi\| + \frac{1}{2}\varepsilon < \frac{1}{2}\varepsilon + \frac{1}{2}\varepsilon = \varepsilon.$$

So *a*(*H*<sub>≤1</sub>) has a finite cover with ε-balls and hence is pre-compact. This finishes the proof from Definition B.123; from Definition B.129, invoke Proposition B.127.

To prove the fifth clause, we need a result of independent interest. We say that a linear map *a* : *H* → *H* is (or has) *finite rank* if its image is finite-dimensional.

Proposition B.131. *A bounded operator a* ∈ *B*(*H*) *is compact iff it is a norm-limit of finite-rank operators.*

*Proof.* Since it is easy to see that finite-rank operators are compact, the "⇐" direction follows from clause 4 of Theorem B.130. The difficult direction is the opposite one, which we prove by contradiction (as a technical note, our proof assumes that *H* is separable, but the claim also holds in the non-separable case, in which it can be shown that ran(*a*) is separable whenever *a* is compact).

Pick a basis (υ<sub>*i*</sub>) of *H* (or, in the non-separable case, of ran(*a*)), and define *e<sub>n</sub>* to be the projection onto the linear span of the first *n* basis vectors. Given some *a* ∈ *B*<sub>0</sub>(*H*), define *a<sub>n</sub>* = *e<sub>n</sub>a*. We show that ‖*a<sub>n</sub>* − *a*‖ → 0. If not, then

$$\exists \varepsilon > 0 \,\forall N \, \exists n > N : \|a_n - a\| \ge \varepsilon, \tag{B.429}$$

which in turn implies that for any δ > 0 there are unit vectors ψ<sub>*n*</sub> for which we have ‖(*a<sub>n</sub>* − *a*)ψ<sub>*n*</sub>‖ ≥ ε − δ. Take δ = ε/2, whence

$$\exists \varepsilon > 0 \,\forall N \,\exists n > N \,:\, \| (a_n - a) \psi_n \| \ge \varepsilon/2. \tag{B.430}$$

Now *a* is compact, so that, noting that ψ<sub>*n*</sub> ∈ *H*<sub>≤1</sub>, the sequence (*a*ψ<sub>*n*</sub>) has a convergent subsequence, say with limit ϕ. We may then write

$$(a_n - a)\psi_n = (e_n - 1_H)(a\psi_n - \varphi + \varphi), \tag{B.431}$$

so that, for each ψ<sub>*n*</sub>,

$$\| (a_n - a)\psi_n \| \le \| e_n - 1_H \| \, \| a\psi_n - \varphi \| + \| (e_n - 1_H) \varphi \|. \tag{B.432}$$

If we now restrict the ψ<sub>*n*</sub> so as to lie in the convergent subsequence in question, then the right-hand side vanishes as *n* → ∞: the first term because ‖*e<sub>n</sub>* − 1<sub>*H*</sub>‖ ≤ 1 and *a*ψ<sub>*n*</sub> → ϕ, and the second because *e<sub>n</sub>*ϕ → ϕ (as ϕ lies in the closure of ran(*a*)).

However, this contradicts (B.430). □

We use the notation of this proof to establish the fifth clause of Theorem B.130. By the sixth, the operator *a*<sup>∗</sup><sub>*n*</sub> = *a*<sup>∗</sup>*e<sub>n</sub>* is compact, since any finite-rank operator such as *e<sub>n</sub>* is compact and *a*<sup>∗</sup> is bounded. Therefore, ‖*a*<sup>∗</sup><sub>*n*</sub> − *a*<sup>∗</sup>‖ = ‖*a<sub>n</sub>* − *a*‖ → 0, so *a*<sup>∗</sup><sub>*n*</sub> → *a*<sup>∗</sup> and hence *a*<sup>∗</sup> ∈ *B*<sub>0</sub>(*H*) by clause 4. □

#### B.19 Spectral theory for self-adjoint compact operators

If only to establish our notation, let us begin by recalling Theorem A.10:

Theorem B.132. *Let* dim(*H*) < ∞ *and let a* : *H* → *H be a self-adjoint operator. Then the eigenvalues* λ *of a are real (collected in the point spectrum* σ<sub>*p*</sub>(*a*) ⊂ R*), the eigenspaces H*<sub>λ</sub> *corresponding to different eigenvalues* λ *are orthogonal, and we have the* spectral resolutions

$$a = \sum_{\lambda \in \sigma_p(a)} \lambda \cdot e_{\lambda}; \tag{B.433}$$

$$1_H = \sum_{\lambda \in \sigma_p(a)} e_\lambda, \tag{B.434}$$

*where e*<sub>λ</sub> *is the projection onto the eigenspace*

$$H_{\lambda} = \{ \psi \in H \mid a\psi = \lambda\psi \}. \tag{B.435}$$

This theorem is equivalent to the following alternative version:

Theorem B.133. *Let* dim(*H*) < ∞ *and let a* : *H* → *H be a self-adjoint operator (i.e., a*<sup>∗</sup> = *a). Then a is* diagonalizable*, in the sense that H has a basis* (υ<sub>*i*</sub>) *consisting of eigenvectors of a. Furthermore, the eigenvalues* λ<sub>*i*</sub> *of a are real.*

If *a* is diagonalizable, using the familiar notation *e*<sub>υ<sub>*i*</sub></sub> = |υ<sub>*i*</sub>⟩⟨υ<sub>*i*</sub>|, cf. (2.7), we write

$$a\upsilon_i = \lambda_i \upsilon_i; \tag{B.436}$$

$$a = \sum_{i \in I} \lambda_i e_{\upsilon_i}. \tag{B.437}$$

To move from Theorem B.132 to Theorem B.133, pick some basis (υ<sub>*k*</sub><sup>(λ)</sup>) of each eigenspace *H*<sub>λ</sub>. By Proposition A.8 we then have

$$e_{\lambda} = \sum_{k=1}^{\dim(H_{\lambda})} |\upsilon_{k}^{(\lambda)}\rangle\langle\upsilon_{k}^{(\lambda)}|. \tag{B.438}$$

The totality of all υ<sub>*k*</sub><sup>(λ)</sup>, where λ ∈ σ<sub>*p*</sub>(*a*) and *k* = 1,...,dim(*H*<sub>λ</sub>), is our basis: relabeling this set as (υ<sub>*i*</sub>), eq. (B.434) becomes 1<sub>*H*</sub> = ∑<sub>*i*</sub> |υ<sub>*i*</sub>⟩⟨υ<sub>*i*</sub>|, or ψ = ∑<sub>*i*</sub> *c<sub>i</sub>*υ<sub>*i*</sub> with *c<sub>i</sub>* = ⟨υ<sub>*i*</sub>,ψ⟩ for each ψ ∈ *H*, which according to Theorem B.61.1 shows that (υ<sub>*i*</sub>) is a basis of *H* (and hence *i* = 1,...,dim(*H*)). Furthermore, (B.433) yields *a*υ<sub>*k*</sub><sup>(λ)</sup> = λυ<sub>*k*</sub><sup>(λ)</sup>, or (B.436), so that each υ<sub>*i*</sub> is an eigenvector of *a*.

Conversely, for each λ ∈ σ<sub>*p*</sub>(*a*), assemble all eigenvectors υ<sub>*i*</sub> whose eigenvalues λ<sub>*i*</sub> are equal to λ and relabel those as υ<sub>*k*</sub><sup>(λ)</sup>. This yields *e*<sub>λ</sub> through (B.438), and the above argument may be rerun in the opposite direction: the basis property of the (υ<sub>*i*</sub>) implies (B.434), and the eigenvector property (B.436) yields (B.433) by verifying it on each basis vector υ<sub>*i*</sub> ≡ υ<sub>*k*</sub><sup>(λ)</sup>, recalling that by construction, λ<sub>*i*</sub> = λ.

We now adapt these results to infinite dimension. We still say that an operator *a* : *H* → *H* is *diagonalizable* if *H* has a basis (υ<sub>*i*</sub>) consisting of eigenvectors of *a*.

Proposition B.134. *Let H* ≅ ℓ<sup>2</sup>(*I*) *for some set I (i.e., H has a basis* (υ<sub>*i*</sub>)<sub>*i*∈*I*</sub>*). Then some collection* (λ<sub>*i*</sub>)<sub>*i*∈*I*</sub> *of complex numbers occurs as the set of eigenvalues of some bounded operator a* ∈ *B*(*H*) *iff* (λ<sub>*i*</sub>)<sub>*i*∈*I*</sub> *is bounded, i.e.,* sup{|λ<sub>*i*</sub>|, *i* ∈ *I*} < ∞*.*

Defining a function λ̃ : *I* → C by λ̃(*i*) = λ<sub>*i*</sub>, we may express this as λ̃ ∈ ℓ<sup>∞</sup>(*I*).

*Proof.* If *a* ∈ *B*(*H*) is diagonal in some basis (υ<sub>*i*</sub>), with eigenvalues (λ<sub>*i*</sub>), then

$$|\lambda_i| = \|\lambda_i \upsilon_i\| = \|a\upsilon_i\| \le \|a\| \|\upsilon_i\| = \|a\|, \tag{B.439}$$

for each *i* ∈ *I*, whence the eigenvalues are bounded. Conversely, if they are, so that ‖λ̃‖<sub>∞</sub> < ∞, take a basis (υ<sub>*i*</sub>)<sub>*i*∈*I*</sub> of *H*, write ψ = ∑<sub>*i*</sub> *c<sub>i</sub>*υ<sub>*i*</sub> with ∑<sub>*i*</sub> |*c<sub>i</sub>*|<sup>2</sup> < ∞, cf. Theorem B.61, and define *a*ψ = ∑<sub>*i*</sub> λ<sub>*i*</sub>*c<sub>i</sub>*υ<sub>*i*</sub>. Since

$$\sum_{i} |\lambda_{i} c_{i}|^{2} \le \|\tilde{\lambda}\|_{\infty}^{2} \sum_{i} |c_{i}|^{2} = \|\tilde{\lambda}\|_{\infty}^{2} \|\psi\|^{2} < \infty, \tag{B.440}$$

we have *a*ψ ∈ *H* by Lemma B.59. These estimates also prove that ‖*a*ψ‖ ≤ ‖λ̃‖<sub>∞</sub>‖ψ‖, so that *a* is bounded, with ‖*a*‖ ≤ ‖λ̃‖<sub>∞</sub> (in fact, equality holds here). □
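The role of boundedness in (B.440) can be seen by comparing partial sums for a bounded and an unbounded choice of eigenvalues, applied to the square-summable coefficients *c<sub>i</sub>* = 1/*i*. The Python sketch below uses two hypothetical eigenvalue sequences chosen for this example, λ<sub>*i*</sub> = (−1)<sup>*i*</sup> and λ<sub>*i*</sub> = *i*; all function names are illustrative.

```python
def partial_sums(eigenvalue, n_terms):
    """Return (sum |c_i|^2, sum |lambda_i c_i|^2) for c_i = 1/i, i = 1..n_terms."""
    norm_sq = 0.0
    image_sq = 0.0
    for i in range(1, n_terms + 1):
        c = 1.0 / i
        norm_sq += c * c
        image_sq += (eigenvalue(i) * c) ** 2
    return norm_sq, image_sq


def alternating(i):
    """Bounded eigenvalues: sup |lambda_i| = 1."""
    return (-1.0) ** i


def linear(i):
    """Unbounded eigenvalues: lambda_i = i."""
    return float(i)


lengths = (10, 100, 1000)
bounded_case = [partial_sums(alternating, n) for n in lengths]
unbounded_case = [partial_sums(linear, n) for n in lengths]

for n, b, u in zip(lengths, bounded_case, unbounded_case):
    print(n, b, u)
```

In the bounded case the second partial sum never exceeds the first, in line with (B.440); in the unbounded case it grows like the number of terms, so the formal expression for *a*ψ does not define a vector in *H*.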

This characterization of bounded diagonalizable operators by a property of their eigenvalues may be considerably sharpened for self-adjoint compact operators.

Theorem B.135. *Let* dim(*H*) = ∞*, and let a* ∈ *B*(*H*)<sub>sa</sub>*. Then a is compact iff it is diagonalizable with* λ̃ ∈ ℓ<sub>0</sub>(*I*)*, in which case the sum in* (B.437) *converges in norm.*

We recall that some function *f* : *I* → C is in ℓ<sub>0</sub>(*I*) if for each ε > 0 there is a *finite* subset *I*<sub>ε</sub> ⊂ *I* such that |*f*(*i*)| < ε for all *i* ∉ *I*<sub>ε</sub>. If *I* = N (and in fact the proof below will produce this labeling of the basis), then the condition λ̃ ∈ ℓ<sub>0</sub>(N) just means that

$$\lim\_{n \to \infty} \lambda\_n = 0.\tag{B.441}$$

Before proving this, we state the infinite-dimensional analogue of Theorem B.132:

Theorem B.136. *Let* dim(*H*) = ∞ *and let a be some bounded self-adjoint operator. Then a is compact iff it has the properties stated in Theorem B.132, amended by the following clarifications and addenda (cf. Definition B.6, where X* = σ<sub>*p*</sub>(*a*)*):*


$$\psi = \sum_{\lambda \in \sigma_p(a)} e_{\lambda} \,\psi; \tag{B.442}$$

*3. If* λ ∈ σ(*a*) *and* λ ≠ 0*, then* λ ∈ σ<sub>*p*</sub>(*a*) *and* dim(*H*<sub>λ</sub>) < ∞*;*

*4. Always* 0 ∈ σ(*a*)*, and* σ<sub>*p*</sub>(*a*) ⊂ R *has 0 as its only accumulation point.*

The equivalence between Theorems B.135 and B.136 is a bit more subtle than in finite dimension, but the key to the proof of both is the following lemma.

Lemma B.137. *A compact self-adjoint operator a has an eigenvalue* λ = ±‖*a*‖*.*

Note that by definition of the operator norm, one always has |λ| ≤ ‖*a*‖ for any eigenvalue λ, whether or not *a* is compact, but the point about compact self-adjoint operators is firstly that *they have an eigenvalue at all*, and secondly that the above inequality is saturated.

*Proof.* We use the fact that the norm ψ → ‖ψ‖ is continuous on *H*, see (B.5), so that it attains a maximum on the compact set *a*(*H*<sub>≤1</sub>). Assume that this maximum is attained at *a*ψ<sub>1</sub>, with ‖ψ<sub>1</sub>‖ = 1. By definition of the operator norm, this maximum must be ‖*a*‖, so that ‖*a*‖<sup>2</sup> = ‖*a*ψ<sub>1</sub>‖<sup>2</sup>. Cauchy–Schwarz and *a*<sup>∗</sup> = *a* then yield

$$\|a\|^2 = \langle a\psi_1, a\psi_1 \rangle = \langle \psi_1, a^2\psi_1 \rangle \le \|\psi_1\| \|a^2\psi_1\| \le \|a^2\| = \|a\|^2, \tag{B.443}$$

where we have used (C.2). In the Cauchy–Schwarz inequality (A.1) one has equality iff either *v* = 0 or *w* = *zv* for some *z* ∈ C, so that we must have *a*<sup>2</sup>ψ<sub>1</sub> = *z*ψ<sub>1</sub>, with |*z*| = ‖*a*‖<sup>2</sup>. Moreover, *z* ∈ R, as eigenvalues must be real (which trivially follows from *a*<sup>∗</sup> = *a*, one does not even need Theorem B.93 here), so *a*<sup>2</sup>ψ<sub>1</sub> = λ<sup>2</sup>ψ<sub>1</sub>, with either λ = ‖*a*‖ or λ = −‖*a*‖. If *a*ψ<sub>1</sub> = λψ<sub>1</sub>, we are ready. If not, then χ<sub>1</sub> = *a*ψ<sub>1</sub> − λψ<sub>1</sub> ≠ 0, in which case *a*χ<sub>1</sub> = *a*<sup>2</sup>ψ<sub>1</sub> − λ*a*ψ<sub>1</sub> = λ<sup>2</sup>ψ<sub>1</sub> − λ*a*ψ<sub>1</sub> = −λχ<sub>1</sub>. □
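The second branch of this proof, in which ψ<sub>1</sub> itself is not an eigenvector, already occurs for a 2×2 matrix. The Python sketch below, with the swap matrix as a hypothetical example, follows the steps of the proof: ψ<sub>1</sub> maximizes the norm of *a*ψ, it satisfies *a*<sup>2</sup>ψ<sub>1</sub> = λ<sup>2</sup>ψ<sub>1</sub> with λ = ‖*a*‖ = 1, and χ<sub>1</sub> = *a*ψ<sub>1</sub> − λψ<sub>1</sub> is an eigenvector with eigenvalue −λ.

```python
def mat_vec(a, v):
    """Apply a square matrix (list of rows) to a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in a]


a = [[0.0, 1.0],
     [1.0, 0.0]]            # self-adjoint swap matrix, operator norm 1
psi1 = [1.0, 0.0]           # unit vector at which the norm of a psi is maximal
lam = 1.0                   # lambda = +||a||

a_psi1 = mat_vec(a, psi1)   # differs from lam * psi1, so psi1 is no eigenvector
a2_psi1 = mat_vec(a, a_psi1)

# chi1 = a psi1 - lambda psi1 is nonzero in this example
chi1 = [x - lam * y for x, y in zip(a_psi1, psi1)]
a_chi1 = mat_vec(a, chi1)

print("a^2 psi1 =", a2_psi1)
print("chi1 =", chi1, " a chi1 =", a_chi1)
```

Choosing λ = −1 instead would make the roles symmetric: then χ<sub>1</sub> = *a*ψ<sub>1</sub> + ψ<sub>1</sub> is an eigenvector with eigenvalue +1.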

Corollary B.138. *A compact self-adjoint operator is diagonalizable.*

*Proof.* Using the notation of the above proof, we call the (normalized) eigenvector in question υ<sub>1</sub> (so either υ<sub>1</sub> = ψ<sub>1</sub> or υ<sub>1</sub> = χ<sub>1</sub>). Note that if ⟨ϕ,υ<sub>1</sub>⟩ = 0, then ⟨*a*ϕ,υ<sub>1</sub>⟩ = ⟨ϕ,*a*<sup>∗</sup>υ<sub>1</sub>⟩ = ⟨ϕ,*a*υ<sub>1</sub>⟩ = ±λ⟨ϕ,υ<sub>1</sub>⟩ = 0, so that *a* maps the orthogonal complement υ<sub>1</sub><sup>⊥</sup> = {ϕ ∈ *H* | ⟨υ<sub>1</sub>,ϕ⟩ = 0} of υ<sub>1</sub> into itself. This implies that *a* commutes with the projection *e*<sub>1</sub> onto υ<sub>1</sub><sup>⊥</sup>, i.e., *e*<sub>1</sub>*a* = *ae*<sub>1</sub> and hence also *e*<sub>1</sub>*a* = *e*<sub>1</sub>*ae*<sub>1</sub>, in which the right-hand side is essentially the restriction of *a* to υ<sub>1</sub><sup>⊥</sup> = *e*<sub>1</sub>*H*.

By Theorem B.130.6, the operator *e*<sub>1</sub>*a* is compact, like *a* itself, and it is also self-adjoint. If *e*<sub>1</sub>*a* = 0 we are ready, since υ<sub>1</sub> plus any basis of *e*<sub>1</sub>*H* is a basis of *H* that diagonalizes *a*. If not, we apply Lemma B.137 to the operator *e*<sub>1</sub>*a*, finding an eigenvector υ<sub>2</sub> with nonzero eigenvalue λ<sub>2</sub>. A simple computation shows that *e*<sub>1</sub>υ<sub>2</sub> = υ<sub>2</sub>, so that υ<sub>2</sub> ∈ *e*<sub>1</sub>*H*, from which we infer, in turn, that *a*υ<sub>2</sub> = λ<sub>2</sub>υ<sub>2</sub>.

So we have found two basis vectors (υ<sub>1</sub>,υ<sub>2</sub>) of *H* that are eigenvectors of *a*. The above procedure may then be iterated: we define *e*<sub>2</sub> as the projection onto the orthogonal complement of υ<sub>1</sub> and υ<sub>2</sub>, and consider *e*<sub>2</sub>*a*. If *e*<sub>2</sub>*a* = 0 we are ready; if not, we find a third eigenvector of *e*<sub>2</sub>*a* and hence of *a* in *e*<sub>2</sub>*H*, *et cetera*.


By Theorem B.61.5, the set *B* is a basis of *H* iff *B*<sup>⊥</sup> = {0}. To show that this is the case, suppose *B*<sup>⊥</sup> is a nonzero Hilbert space. Define *f* as the projection on *H* with image *B*<sup>⊥</sup> and consider the self-adjoint compact operator *f a*. If *f a* = 0, there is at least one eigenvector of *a* in *B*<sup>⊥</sup> = *f H* (namely, with eigenvalue zero), which is a contradiction. If *f a* ≠ 0, then *a* has an eigenvector in *B*<sup>⊥</sup> by Lemma B.137, and again a contradiction has been found: for in all three cases, by construction all eigenvectors were already contained in *B*<sup>⊥⊥</sup> = span(υ<sub>1</sub>,...)<sup>−</sup>. □

Even if *H* is non-separable, the image of a compact operator *a* must nonetheless be separable. Therefore, the non-zero eigenvalues of *a* form a countable set, and the eigenvalue zero (which, by the same token, must occur in the non-separable case) has some uncountable multiplicity (in sharp contrast to which, each nonzero eigenvalue has finite multiplicity). Also in the separable case, the only eigenvalue that may have infinite multiplicity is zero (though in the separable case it does not necessarily occur). Theorem B.135 is now a consequence of the following lemma:

Lemma B.139. *A diagonalizable operator a is compact iff* λ̃ ∈ ℓ<sub>0</sub>(*I*)*.*

*Proof.* In view of the proof and subsequent comment above, we may as well assume that *I* = N. For any ψ ∈ *H*, the sum in (B.214) converges, so we must have lim<sub>*n*</sub> ⟨υ<sub>*n*</sub>,ψ⟩ = 0, or, in other words, υ<sub>*n*</sub> → 0 weakly. If *a* ∈ *B*<sub>0</sub>(*H*), then *a*υ<sub>*n*</sub> → 0 in norm by Proposition B.125, and hence λ<sub>*n*</sub> → 0, i.e., λ̃ ∈ ℓ<sub>0</sub>(N). Conversely, if this holds, then for each ε > 0, the set *I*<sub>ε</sub> = {*n* ∈ N : |λ<sub>*n*</sub>| ≥ ε} is finite. This implies that the operator *a<sub>n</sub>* = ∑<sub>*m*∈*I*<sub>1/*n*</sub></sub> λ<sub>*m*</sub>*e*<sub>υ<sub>*m*</sub></sub> has finite rank. Since |λ<sub>*m*</sub>| < ε whenever *m* ∉ *I*<sub>1/*n*</sub> (taking ε = 1/*n*),

$$\|(a_n - a)\psi\|^2 = \Big\|\sum_{m \notin I_{1/n}} \lambda_m e_{\upsilon_m} \psi\Big\|^2 \le \sum_{m \notin I_{1/n}} |\lambda_m|^2 |\langle \upsilon_m, \psi \rangle|^2 \le \varepsilon^2 \|\psi\|^2, \tag{B.444}$$

where in the last step we also used (B.213). Hence *a<sub>n</sub>* → *a* in norm, so that *a* is compact by Proposition B.131. □

To finish the proof of Theorem B.135, we show that the sum in (B.437), which for general bounded diagonalizable operators converges strongly, in fact converges in norm. To put this in perspective, eq. (B.437) with *a* = 1<sub>*H*</sub> reads

$$1_H = \sum_{i \in I} e_{\upsilon_i}. \tag{B.445}$$

If *I* is infinite, this sum cannot converge uniformly: e.g., if we take *I* = N, then

$$\lim_{N \to \infty} \Big\| 1_H - \sum_{n=1}^N e_{\upsilon_n} \Big\| = \lim_{N \to \infty} \sup \Big\{ \Big\| \psi - \sum_{n=1}^N \langle \upsilon_n, \psi \rangle \upsilon_n \Big\| ,\ \psi \in H_{\le 1} \Big\} \tag{B.446}$$

cannot be zero, as shown by taking ψ orthogonal to all υ<sub>1</sub>,...,υ<sub>*N*</sub>. However, by Theorem B.61.1 the sum does converge strongly (i.e., applied to each fixed ψ). This seemingly special case even yields strong convergence of the sum in (B.437) for general diagonalizable bounded operators *a*, for by continuity of *a* we have:

$$a\psi = a\sum_{i\in I} \langle \upsilon_i, \psi \rangle \upsilon_i = \sum_{i\in I} \langle \upsilon_i, \psi \rangle a\upsilon_i = \sum_{i\in I} \lambda_i \langle \upsilon_i, \psi \rangle \upsilon_i = \sum_{i\in I} \lambda_i e_{\upsilon_i} \psi. \tag{B.447}$$

If *a* is compact, strong convergence of (B.437) may be strengthened to norm convergence. The argument is analogous to the proof of Lemma B.139, but for completeness and contrast we now present it for general *I*. Since λ̃ ∈ ℓ<sub>0</sub>(*I*), for given ε > 0 there is a finite set *I*<sub>ε</sub> ⊂ *I* for which |λ<sub>*i*</sub>| < ε for all *i* ∉ *I*<sub>ε</sub>. For fixed ψ ∈ *H*, we have

$$\Big\| \Big( a - \sum_{i \in I_{\varepsilon}} \lambda_i e_{\upsilon_i} \Big) \psi \Big\|^2 = \Big\| \sum_{i \notin I_{\varepsilon}} \lambda_i e_{\upsilon_i} \psi \Big\|^2 \le \varepsilon^2 \sum_{i \notin I_{\varepsilon}} |\langle \upsilon_i, \psi \rangle|^2 \le \varepsilon^2 \|\psi\|^2, \tag{B.448}$$

so that ‖*a* − ∑<sub>*i*∈*I*<sub>ε</sub></sub> λ<sub>*i*</sub>*e*<sub>υ<sub>*i*</sub></sub>‖ ≤ ε. By Definition B.6, eq. (B.437) holds in norm. □
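The contrast between (B.446) and (B.448) can be made quantitative for diagonal operators, using the fact noted after (B.440) that the norm of a diagonal operator equals the supremum of the absolute values of its eigenvalues. The Python sketch below compares the truncation error for the hypothetical null sequence λ<sub>*k*</sub> = 1/*k* with that for the identity; the truncation dimension `DIM`, the cutoffs, and the function name `tail_norm` are assumptions made for this illustration.

```python
def tail_norm(eigenvalues, cutoff):
    """Norm of a minus its first `cutoff` diagonal terms, for a diagonal
    operator a: the largest |lambda_i| with i > cutoff (0.0 if none remain)."""
    return max((abs(x) for x in eigenvalues[cutoff:]), default=0.0)


DIM = 1000                                             # illustration only
null_sequence = [1.0 / k for k in range(1, DIM + 1)]   # lambda_k = 1/k
constant_one = [1.0] * DIM                             # eigenvalues of 1_H

cutoffs = (1, 10, 100)
errors_compact = [tail_norm(null_sequence, n) for n in cutoffs]
errors_identity = [tail_norm(constant_one, n) for n in cutoffs]

print("compact: ", errors_compact)
print("identity:", errors_identity)
```

For the null sequence the error after *n* terms is 1/(*n*+1), which tends to zero, whereas for the identity it remains equal to 1 for every cutoff below the truncation dimension.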

This analysis by no means contradicts Corollary B.104, including (B.327): applied to compact operators, exactly one of the subsets *A*<sub>*i*<sub>0</sub></sub> ⊂ σ(*a*) contains σ(*a*) ∩ *U*<sub>0</sub>, where *U*<sub>0</sub> is some neighborhood of 0 ∈ σ(*a*), so that the corresponding projection *e*<sub>*A*<sub>*i*<sub>0</sub></sub></sub> is infinite-dimensional and all the other *e*<sub>*A<sub>i</sub>*</sub> are finite-dimensional. Thus the sum ∑<sub>*i*</sub> *e*<sub>*A<sub>i</sub>*</sub> in (B.327) takes a rather different form from either the sum ∑<sub>*i*</sub> *e*<sub>υ<sub>*i*</sub></sub> in (B.445) or the sum ∑<sub>λ</sub> *e*<sub>λ</sub> in (B.434); see also the end of this section.

We now prove Theorem B.136. First, as soon as dim(*H*<sub>λ</sub>) = ∞ for some λ ≠ 0, then λ̃ ∉ ℓ<sub>0</sub>(*I*). Therefore, dim(*H*<sub>λ</sub>) < ∞ by Theorem B.135. In fact, it is easy to show directly that dim(ker(*a* − λ)) < ∞ for any *a* ∈ *B*<sub>0</sub>(*H*) and λ ≠ 0: since *a* is bounded and hence ker(*a* − λ) is closed, the latter is a Hilbert space in its own right, so if it were infinite-dimensional, any basis (*u*<sub>*n*</sub>) of it would have the property that *u*<sub>*n*</sub> → 0 weakly and hence *au*<sub>*n*</sub> → 0 in norm (cf. the proof of the above lemma). But *au*<sub>*n*</sub> = λ*u*<sub>*n*</sub>, so that (*au*<sub>*n*</sub>) cannot converge in norm as soon as λ ≠ 0.

Second, take 0 ≠ λ ∈ σ(*a*). According to Theorem B.93, in order to prove that λ ∈ σ<sub>*p*</sub>(*a*), it suffices to show that ran(*a* − λ) is closed. We may assume that λ ≠ λ<sub>*i*</sub> for all *i* ∈ *I* (for otherwise, trivially λ ∈ σ<sub>*p*</sub>(*a*)), which implies ker(*a* − λ) = {0}.

Let ψ<sub>*n*</sub> = (*a* − λ)ϕ<sub>*n*</sub> ∈ ran(*a* − λ), with ϕ<sub>*n*</sub> ≠ 0 for all *n*, and suppose ψ<sub>*n*</sub> → ψ. We prove that (ϕ<sub>*n*</sub>) is bounded. If not, then (passing to a subsequence if necessary) ‖ϕ<sub>*n*</sub>‖ → ∞, but since (ϕ′<sub>*n*</sub>) is bounded, with ϕ′<sub>*n*</sub> = ϕ<sub>*n*</sub>/‖ϕ<sub>*n*</sub>‖, and (ψ<sub>*n*</sub>) converges, we have (*a* − λ)ϕ′<sub>*n*</sub> = ψ<sub>*n*</sub>/‖ϕ<sub>*n*</sub>‖ → 0. Now *a* is compact, so (*a*ϕ′<sub>*n*</sub>) has a convergent subsequence, which together with the previous result implies that (ϕ′<sub>*n*</sub>) itself must have a convergent subsequence (as λ ≠ 0), say to ϕ′. Continuity of *a* gives (*a* − λ)ϕ′ = 0, hence ϕ′ ∈ ker(*a* − λ) = {0}. But this is impossible, as ‖ϕ′<sub>*n*</sub>‖ = 1 for all *n* and hence ‖ϕ′‖ = 1. Thus knowing that (ϕ<sub>*n*</sub>) is bounded, once again using compactness of *a*, we infer that (*a*ϕ<sub>*n*</sub>) has a convergent subsequence. Now

$$
\varphi\_n = \lambda^{-1}(a\varphi\_n - (a - \lambda)\varphi\_n) = \lambda^{-1}(a\varphi\_n - \psi\_n),
\tag{B.449}
$$

and since (ψ*n*) converges by assumption, this implies that (ϕ*n*) has a convergent subsequence, say with limit ϕ. Continuity of *a* then implies that

$$
\psi = (a - \lambda)\varphi \in \text{ran}(a - \lambda), \tag{B.450}
$$

and hence ran(*a*−λ) is closed. Therefore, λ ∈ σ*p*(*a*).

To show that 0 ∈ σ(*a*), assume that *a* were invertible (which is to say that 0 ∈ ρ(*a*)). Then its inverse *a*<sup>−1</sup> would be bounded, so that *a*<sup>−1</sup>*a* = 1<sub>*H*</sub> ∈ *B*<sub>0</sub>(*H*) by Theorem B.130. But this is impossible in infinite dimension: a similar argument to the one below (B.445) shows that 1<sub>*H*</sub> cannot possibly be approximated by finite-rank operators. The last claim of Theorem B.136 is the same as λ̃ ∈ ℓ<sub>0</sub>(*I*). □

Here is a nice example of compact operators, also justifying the notation *B*0(*H*).

Corollary B.140. *Let H* = ℓ<sup>2</sup>(ℕ) *and for f* ∈ ℓ<sup>∞</sup>(ℕ)*, define the multiplication operator m<sub>f</sub> as usual, i.e., m<sub>f</sub>*ψ = *f*ψ*. Then m<sub>f</sub> is compact iff f* ∈ ℓ<sub>0</sub>(ℕ)*.*

*Proof.* This follows from Theorem B.135, where the label set is *I* = ℕ, the basis (υ<sub>*i*</sub>)<sub>*i*∈*I*</sub> is (δ<sub>*n*</sub>)<sub>*n*∈ℕ</sub>, where δ<sub>*n*</sub>(*m*) = δ<sub>*nm*</sub> as usual, *m* ∈ ℕ, and the eigenvalues are

$$
\lambda\_n = f(n), \tag{B.451}
$$

since obviously *m<sub>f</sub>*δ<sub>*n*</sub> = *f*δ<sub>*n*</sub> = *f*(*n*)δ<sub>*n*</sub>. We already know from (B.276) that σ(*m<sub>f</sub>*) = ran(*f*)<sup>−</sup>, which for *f* ∈ ℓ<sub>0</sub>(ℕ) equals ran(*f*) if 0 ∈ ran(*f*), and

$$
\text{ran}(f)^{-} = \text{ran}(f) \cup \{0\},
\tag{B.452}
$$

otherwise. In the first case, σ(*m<sub>f</sub>*) = σ<sub>*p*</sub>(*m<sub>f</sub>*) = ran(*f*), so σ<sub>*c*</sub>(*m<sub>f</sub>*) = ∅, whereas in the second case we have σ<sub>*p*</sub>(*m<sub>f</sub>*) = ran(*f*) and σ<sub>*c*</sub>(*m<sub>f</sub>*) = {0}. This also shows that in clause 4 of Theorem B.136, both possibilities 0 ∈ σ<sub>*p*</sub>(*a*) and 0 ∈ σ<sub>*c*</sub>(*a*) may occur, depending on *a*. Finally, the condition λ̃ ∈ ℓ<sub>0</sub>(*I*), which in the example *a* = *m<sub>f</sub>* reduces to (B.441), is just a restatement of the condition *f* ∈ ℓ<sub>0</sub>(ℕ). □

In the continuous case, for *H* = *L*<sup>2</sup>(*X*), say for some connected open set *X* ⊂ ℝ<sup>*n*</sup> with Lebesgue measure, the multiplication operator *m<sub>f</sub>* defined by a nonzero function *f* ∈ *C*<sub>0</sub>(*X*) is never compact, cf. (B.276); it is the very opposite of a compact operator!

To close, in our (traditional) proof of Theorem B.136 we did not use the powerful spectral Theorem B.94. If dim(*H*) < ∞, Theorem B.132 indeed follows from Theorem B.94: if, for λ ∈ ℝ, we define 1<sub>{λ}</sub> ≡ δ<sub>λ</sub> : ℝ → ℂ by δ<sub>λ</sub>(*x*) = δ<sub>λ*x*</sub>, then

$$\mathrm{id}\_{\sigma\_p(a)} = \sum\_{\lambda \in \sigma\_p(a)} \lambda \cdot \delta\_{\lambda};\tag{B.453}$$

$$1\_{\sigma\_p(a)} = \sum\_{\lambda \in \sigma\_p(a)} \delta\_\lambda. \tag{B.454}$$

Now define *e*<sub>λ</sub> = δ<sub>λ</sub>(*a*). Then (B.290) - (B.291) give *e*<sub>λ</sub><sup>2</sup> = *e*<sub>λ</sub><sup>∗</sup> = *e*<sub>λ</sub>, so that *e*<sub>λ</sub> is a projection. Furthermore, since id<sub>σ<sub>*p*</sub>(*a*)</sub> · δ<sub>λ</sub> = λ · δ<sub>λ</sub>, eq. (B.290) gives *ae*<sub>λ</sub> = λ*e*<sub>λ</sub>, so that *e*<sub>λ</sub>*H* ⊆ *H*<sub>λ</sub>. Applying the map *f* ↦ *f*(*a*) to (B.453) - (B.454) then yields (B.433) - (B.434), from which the equality *e*<sub>λ</sub>*H* = *H*<sub>λ</sub> follows *a fortiori*.

If dim(*H*) = ∞ and *a* ∈ *B*0(*H*)sa, this still works for each nonzero λ ∈ σ*p*(*a*), and since the sum (B.453) converges uniformly in *C*(σ(*a*)), we obtain (B.433) in the same way, including its norm-convergence. Unfortunately, even if we replace σ*p*(*a*) by σ(*a*), as we should, eq. (B.454) now fails, even pointwise, so that (B.434) still requires the kind of proof we gave (or a complicated argument based on (B.327)).

#### B.20 The trace

For finite-dimensional *H* the trace was defined by (A.77). There are (at least) two difficulties in generalizing this expression to the infinite-dimensional case in the naive way. First, not every operator has a finite trace; for example, take *a* = 1<sub>*H*</sub>, so that Tr(1<sub>*H*</sub>) = dim(*H*). Second, Lemma A.25 is no longer valid in general: it is easy to find an operator *a* ∈ *B*(*H*) and bases (υ<sub>*i*</sub>) and (υ′<sub>*i*</sub>) of *H* for which

$$
\sum\_{i} \langle \upsilon\_{i}, a\upsilon\_{i} \rangle \neq \sum\_{i} \langle \upsilon'\_{i}, a\upsilon'\_{i} \rangle,
$$

typically because one of these expressions converges, whereas the other diverges. For example, take *a* = ∑<sub>*i*</sub>(−1)<sup>*i*</sup> |υ<sub>*i*</sub>⟩⟨υ<sub>*i*</sub>| as a strong limit, i.e., *a*ψ = ∑<sub>*i*</sub>(−1)<sup>*i*</sup> ⟨υ<sub>*i*</sub>, ψ⟩υ<sub>*i*</sub>; this lies in *H* by Theorem B.61, from which (B.214) shows that ‖*a*ψ‖ = ‖ψ‖. Take υ′<sub>1</sub> = (υ<sub>1</sub> + υ<sub>2</sub>)/√2, υ′<sub>2</sub> = (υ<sub>1</sub> − υ<sub>2</sub>)/√2, υ′<sub>3</sub> = (υ<sub>3</sub> + υ<sub>4</sub>)/√2, υ′<sub>4</sub> = (υ<sub>3</sub> − υ<sub>4</sub>)/√2, etc. Then ∑<sub>*i*</sub>⟨υ<sub>*i*</sub>, *a*υ<sub>*i*</sub>⟩ = ∑<sub>*i*</sub>(−1)<sup>*i*</sup> diverges, whereas ∑<sub>*i*</sub>⟨υ′<sub>*i*</sub>, *a*υ′<sub>*i*</sub>⟩ = ∑<sub>*i*</sub> 0 = 0.
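The basis dependence in this example can be checked numerically. The following minimal Python sketch is only an illustration: the function names and the cutoff at eight terms are our own choices, not part of the text. It lists the partial sums of ∑<sub>*i*</sub>(−1)<sup>*i*</sup> in the original basis and the diagonal matrix elements of *a* in the rotated basis.

```python
import math

def partial_sums_original(n_terms):
    """Partial sums of sum_i <v_i, a v_i> = sum_i (-1)^i for i = 1..n_terms."""
    sums, total = [], 0
    for i in range(1, n_terms + 1):
        total += (-1) ** i
        sums.append(total)
    return sums

def diagonal_entry_rotated(i):
    """<v'_i, a v'_i> for v'_{2k-1} = (v_{2k-1} + v_{2k})/sqrt(2) and
    v'_{2k} = (v_{2k-1} - v_{2k})/sqrt(2), where a v_j = (-1)^j v_j."""
    k = (i + 1) // 2                  # index of the 2x2 block containing v'_i
    odd, even = 2 * k - 1, 2 * k      # labels of the two original basis vectors
    sign = 1.0 if i % 2 == 1 else -1.0
    coeffs = {odd: 1 / math.sqrt(2), even: sign / math.sqrt(2)}
    # a is diagonal in (v_j), so <v', a v'> = sum_j |c_j|^2 (-1)^j
    return sum(c * c * (-1) ** j for j, c in coeffs.items())

original = partial_sums_original(8)
rotated = [diagonal_entry_rotated(i) for i in range(1, 9)]
print(original)  # oscillates between -1 and 0, so the series has no limit
print(rotated)   # every diagonal matrix element vanishes
```

The partial sums in the original basis keep oscillating, whereas each term in the rotated basis is zero, in line with the discussion above.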

However, if *a* ∈ *B*(*H*) is *positive*, i.e., *a* ≥ 0 in the usual sense that ⟨ψ, *a*ψ⟩ ≥ 0 for each ψ ∈ *H*, then we will show that for any two bases (υ<sub>*i*</sub>) and (υ′<sub>*i*</sub>) of *H*,

$$
\sum\_{i} \langle \upsilon\_{i}, a\upsilon\_{i} \rangle = \sum\_{i} \langle \upsilon\_{i}^{\prime}, a\upsilon\_{i}^{\prime} \rangle \tag{B.455}
$$

*where both sides may be infinite*. Equivalently, (A.79) is valid, since any unitary operator defines and is defined by a basis transformation. To prove (B.455), we need a very useful construction of independent interest, cf. (A.73).

Lemma B.141. *Any positive operator a* ∈ *B*(*H*) *has a (unique)* square root*, i.e., a positive operator* √*a* ∈ *C*<sup>∗</sup>(*a*) *that satisfies* (√*a*)<sup>2</sup> = *a.*

*Proof.* This follows from Theorem B.94, since if *a* ≥ 0, then σ(*a*) ⊂ ℝ<sup>+</sup>, and hence √· is defined on σ(*a*). Alternatively, one may use the following construction due to the Dutch mathematician C. Visser (which is a special case of the approach just mentioned). If necessary, first rescale *a* so that ‖*a*‖ ≤ 1, take the power series for

$$\sqrt{1-x} = \sum\_{k\geq 0} t\_k x^k,\tag{B.456}$$

(in which *t*<sub>0</sub> = 1), which converges absolutely for |*x*| ≤ 1, and put

$$\sqrt{a} = \sum\_{k\geq 0} t\_k (1\_H - a)^k. \tag{B.457}$$

As in the numerical case, squaring the series and rearranging terms yields (√*a*)<sup>2</sup> = *a*. Since uniqueness will not be needed, we omit the proof. □
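The series (B.457) can be tried out in finite dimension. The sketch below is an illustration under our own assumptions: a hypothetical 2×2 positive matrix with ‖*a*‖ ≤ 1, a cutoff of 200 terms, and helper names that do not occur in the text. The coefficients *t<sub>k</sub>* are generated from the binomial series of (1 − *x*)<sup>1/2</sup>.

```python
def sqrt_series_coefficients(n_terms):
    """Coefficients t_k of sqrt(1 - x) = sum_k t_k x^k, with t_0 = 1."""
    coeffs = [1.0]
    for k in range(1, n_terms):
        # ratio of successive binomial coefficients of (1 - x)^(1/2)
        coeffs.append(coeffs[-1] * (2 * k - 3) / (2 * k))
    return coeffs

def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sqrt_positive(a, n_terms=200):
    """Approximate sqrt(a) = sum_k t_k (1 - a)^k for a 2x2 matrix, 0 <= a <= 1."""
    identity = [[1.0, 0.0], [0.0, 1.0]]
    one_minus_a = [[identity[i][j] - a[i][j] for j in range(2)] for i in range(2)]
    power = identity                       # (1 - a)^0
    result = [[0.0, 0.0], [0.0, 0.0]]
    for t_k in sqrt_series_coefficients(n_terms):
        result = [[result[i][j] + t_k * power[i][j] for j in range(2)]
                  for i in range(2)]
        power = matmul(power, one_minus_a)
    return result

a = [[0.5, 0.2], [0.2, 0.5]]               # positive, eigenvalues 0.3 and 0.7
root = sqrt_positive(a)
root_squared = matmul(root, root)          # should reproduce a
```

Convergence is geometric here because the eigenvalues of 1<sub>*H*</sub> − *a* lie strictly inside the unit interval; it slows down as an eigenvalue of *a* approaches 0.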

For *a* ≥ 0, we now use (B.215) to compute

$$\begin{split} \sum\_{i} \langle \upsilon\_{i}, a\upsilon\_{i} \rangle &= \sum\_{i} \langle \sqrt{a}\upsilon\_{i}, \sqrt{a}\upsilon\_{i} \rangle = \sum\_{i,j} \langle \sqrt{a}\upsilon\_{i}, \upsilon\_{j}' \rangle \langle \upsilon\_{j}', \sqrt{a}\upsilon\_{i} \rangle \\ &= \sum\_{i,j} \langle \sqrt{a}\upsilon\_{j}', \upsilon\_{i} \rangle \langle \upsilon\_{i}, \sqrt{a}\upsilon\_{j}' \rangle = \sum\_{j} \langle \upsilon\_{j}', a\upsilon\_{j}' \rangle, \end{split} \tag{B.458}$$

where each term in every sum is positive, so that rearrangements are valid. Let

$$B(H)\_{+} = \{ a \in B(H) \mid a \ge 0 \};\tag{B.459}$$

In view of (B.458), we have a well-defined map

$$\text{Tr}: B(H)\_{+} \to [0, \infty]; \tag{B.460}$$

$$\operatorname{Tr}\left(a\right) = \sum\_{i} \langle \upsilon\_{i}, a\upsilon\_{i} \rangle,\tag{B.461}$$

where (υ*i*) is an arbitrary basis of *H*, of which the result is independent by (B.455).

To drop the restriction *a* ≥ 0 in the argument of the trace, for *any a* ∈ *B*(*H*) we note that *a*∗*a* ≥ 0, so that we may define the *absolute value* |*a*| of *a* by

$$|a| = \sqrt{a^\*a}.\tag{B.462}$$

Then |*a*| ≥ 0 for all *a* by construction, and if *a* ≥ 0, then |*a*| = *a*. Finally, we define the set of *trace-class operators* in *B*(*H*), later seen to be a Banach space, as

$$B\_1(H) = \{ a \in B(H) \mid \text{Tr}(|a|) < \infty \}. \tag{B.463}$$

The *trace-norm* of *a* ∈ *B*1(*H*), which for now is just a formula, is given by

$$\|a\|\_{1} = \text{Tr}\left(|a|\right).\tag{B.464}$$

Lemma B.142. *1. For any a* ∈ *B*1(*H*) *we have*

$$\|a\| \le \|a\|\_1. \tag{B.465}$$


Part 4 will shortly be improved to *B*1(*H*) actually being a Banach space.

Let us note that Lemma A.28 and Proposition A.29 on the polar decomposition remain valid for infinite-dimensional Hilbert space, with essentially the same proof.

*Proof.* 1. By definition of the operator norm (B.227), for any *b* ∈ *B*(*H*) and every ε > 0 there is a unit vector ψ ∈ *H* such that ‖*b*‖<sup>2</sup> ≤ ‖*b*ψ‖<sup>2</sup> + ε (proof by contradiction). Put *b* = (*a*<sup>∗</sup>*a*)<sup>1/4</sup>, and note that ‖(*a*<sup>∗</sup>*a*)<sup>1/4</sup>‖<sup>2</sup> = ‖|*a*|‖ = ‖*a*‖ by (C.2) and (A.93). Completing ψ to a basis (υ<sub>*i*</sub>), and noting that

$$\sum\_{i} \|(a^\*a)^{1/4}\upsilon\_i\|^2 = \sum\_{i} \langle (a^\*a)^{1/4}\upsilon\_i, (a^\*a)^{1/4}\upsilon\_i \rangle = \sum\_{i} \langle \upsilon\_i, |a|\upsilon\_i \rangle = \|a\|\_1,\tag{B.466}$$

we obtain

$$\|a\| = \|(a^\*a)^{1/4}\|^2 \le \|(a^\*a)^{1/4}\psi\|^2 + \varepsilon \le \sum\_{i} \|(a^\*a)^{1/4}\upsilon\_i\|^2 + \varepsilon = \|a\|\_1 + \varepsilon.$$

Since this holds for all ε > 0, one has (B.465).

2. Let *a* ∈ *B*<sub>1</sub>(*H*). Since ∑<sub>*i*</sub>⟨υ<sub>*i*</sub>, |*a*|υ<sub>*i*</sub>⟩ < ∞, for each ε > 0 we can find *n* such that ∑<sub>*i*>*n*</sub>⟨υ<sub>*i*</sub>, |*a*|υ<sub>*i*</sub>⟩ < ε. Let *e<sub>n</sub>* be the projection onto the linear span of {υ<sub>*i*</sub>}<sub>*i*=1,...,*n*</sub>. Using (C.2) in the form ‖*a*‖<sup>2</sup> = ‖*aa*<sup>∗</sup>‖ (which is valid by (A.22)) and (B.465),

$$\|e\_n^\perp|a|^{1/2}\|^2 = \|e\_n^\perp|a|e\_n^\perp\| \le \|e\_n^\perp|a|e\_n^\perp\|\_1 = \sum\_i \langle \upsilon\_i, e\_n^\perp|a|e\_n^\perp\upsilon\_i\rangle = \sum\_{i>n} \langle \upsilon\_i, |a|\upsilon\_i\rangle < \varepsilon,$$

for |*e*<sub>*n*</sub><sup>⊥</sup>|*a*|*e*<sub>*n*</sub><sup>⊥</sup>| = *e*<sub>*n*</sub><sup>⊥</sup>|*a*|*e*<sub>*n*</sub><sup>⊥</sup>, because if *c* ≥ 0 then *b*<sup>∗</sup>*cb* ≥ 0 for any *b*, *c* ∈ *B*(*H*). Since *e*<sub>*n*</sub><sup>⊥</sup> = 1 − *e<sub>n</sub>*, it follows that *e<sub>n</sub>*|*a*|<sup>1/2</sup> → |*a*|<sup>1/2</sup> in the norm topology. Since each operator *e<sub>n</sub>*|*a*|<sup>1/2</sup> obviously has finite rank, |*a*|<sup>1/2</sup> and hence |*a*| is compact. Finally, *a* has polar decomposition *a* = *u*|*a*| and *B*<sub>0</sub>(*H*) is a two-sided ideal in *B*(*H*), so that *a* is compact.

3. We just showed that *a* is compact. By Theorem B.130, also *a*<sup>∗</sup>*a* is compact, and since it is self-adjoint, Theorem B.136 applies. This gives an expansion (A.101); although the sum may be infinite, this is no problem, as it is norm-convergent. Thus the computation will be analogous to the finite-dimensional case, cf. Proposition A.30, except that we cannot use (A.78), which is valid but has not been proved yet. Fortunately, this problem may be obviated using (A.94). It follows from Lemma A.28 and Proposition A.29 that (υ′<sub>*i*</sub> = *u*υ<sub>*i*</sub>) also forms an orthonormal set, like the υ<sub>*i*</sub> themselves, since the closed linear space spanned by the unit vectors υ<sub>*i*</sub> is just (ran |*a*|)<sup>−</sup> and *u* is unitary from this space onto its image (ran *a*)<sup>−</sup>. Taking the trace over any basis that contains the vectors υ′<sub>*i*</sub>, we compute

$$\begin{aligned} |\operatorname{Tr}(ab)| &= |\operatorname{Tr}(u|a|u^\*ub)| = |\sum\_{i} p\_i \langle \upsilon\_i^{\prime}, ub\upsilon\_i^{\prime} \rangle| \\ &\leq \sum\_{i} p\_i |\langle \upsilon\_i^{\prime}, ub\upsilon\_i^{\prime} \rangle| \leq \sum\_{i} p\_i \|b\| \|u\| \|\upsilon\_i\| = \|a\|\_1 \|b\|, \end{aligned}\tag{B.467}$$

where we used ‖*a*‖<sub>1</sub> = ∑<sub>*i*</sub> *p<sub>i</sub>*, which follows from (A.101) applied to |*a*|.

4. Let *a*, *b* ∈ *B*<sub>1</sub>(*H*), and let *a* + *b* = *u*|*a* + *b*| be the polar decomposition. Then

$$\|a+b\|\_1 = \text{Tr}\left(u^\*(a+b)\right) = \text{Tr}\left(u^\*a\right) + \text{Tr}\left(u^\*b\right).$$

Applying (A.100) with ‖*u*<sup>∗</sup>‖ ≤ 1, one has ‖*a* + *b*‖<sub>1</sub> ≤ ‖*a*‖<sub>1</sub> + ‖*b*‖<sub>1</sub>. Hence *B*<sub>1</sub>(*H*) is a vector space and ‖·‖<sub>1</sub> satisfies the triangle inequality. The other axioms for a norm are obviously satisfied. □

Proposition B.143. *Let H* = ℓ<sup>2</sup>(ℕ) *(or even* ℓ<sup>2</sup>(*X*)*, for any* countable *set X), and for f* ∈ ℓ<sup>∞</sup>(ℕ)*, define the corresponding multiplication operator m<sub>f</sub> by m<sub>f</sub>*ψ = *f*ψ*, cf. Proposition B.73. We have seen that m<sub>f</sub> is bounded, with norm* (B.239)*. Then:*

$$m\_f \in B\_0(H) \text{ iff } f \in \ell\_0(\mathbb{N});\tag{B.468}$$

$$m\_f \in B\_1(H) \text{ iff } f \in \ell^1(\mathbb{N});\tag{B.469}$$

$$\|m\_f\|\_1 = \|f\|\_1. \tag{B.470}$$

*Here* ℓ<sub>0</sub>(ℕ) *consists of all f* : ℕ → ℂ *for which* lim<sub>*x*→∞</sub> *f*(*x*) = 0*. In particular, if* dim(*H*) = ∞ *we have* proper *inclusions*

$$B\_1(H) \subset B\_0(H) \subset B(H). \tag{B.471}$$

*Proof.* 1. For any *a* ∈ *B*(*H*) we have *a* ∈ *B*<sub>0</sub>(*H*) iff |*a*| ∈ *B*<sub>0</sub>(*H*) by the polar decomposition (since *a* = *u*|*a*| and |*a*| = *u*<sup>∗</sup>*a* and *B*<sub>0</sub>(*H*) is a two-sided ideal in *B*(*H*)). In the present case, we have |*m<sub>f</sub>*| = √(*m<sub>f</sub>*<sup>∗</sup>*m<sub>f</sub>*) = √(*m*<sub>|*f*|<sup>2</sup></sub>) = *m*<sub>|*f*|</sub>, whence *m<sub>f</sub>* ∈ *B*<sub>0</sub>(*H*) iff *m*<sub>|*f*|</sub> ∈ *B*<sub>0</sub>(*H*). Since σ<sub>*p*</sub>(*m*<sub>|*f*|</sub>) = {|*f*(*x*)|, *x* ∈ ℕ}, part 6 of Theorem B.136 applied to *a* = *m*<sub>|*f*|</sub> states that *f* ∈ ℓ<sub>0</sub>(ℕ).

2. This rapidly follows by computing Tr(|*m<sub>f</sub>*|) = Tr(*m*<sub>|*f*|</sub>) in the basis υ<sub>*x*</sub> = δ<sub>*x*</sub>, *x* ∈ ℕ, where δ<sub>*x*</sub>(*y*) = δ<sub>*xy*</sub>, as usual. □
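To see that the inclusions in (B.471) are proper, one may compare two concrete multiplication operators. The Python sketch below is an illustration only: the sample functions *f*(*n*) = 1/*n* and *f*(*n*) = 1/*n*<sup>2</sup>, the finite cutoffs, and all function names are our own assumptions. It uses the fact that for a diagonal operator the operator norm of a tail is its supremum and the trace norm is the sum of the absolute values.

```python
import math

def harmonic(n):
    """f(n) = 1/n: tends to zero but is not summable."""
    return 1.0 / n

def inverse_square(n):
    """f(n) = 1/n^2: summable, with sum pi^2/6."""
    return 1.0 / n ** 2

def sup_tail(f, start, stop):
    """sup |f(n)| over start <= n < stop: a finite-window proxy for the
    operator-norm distance between m_f and its truncation to n < start."""
    return max(abs(f(n)) for n in range(start, stop))

def trace_norm_partial(f, stop):
    """Partial sum of Tr(|m_f|) = sum_n |f(n)| over 1 <= n < stop."""
    return sum(abs(f(n)) for n in range(1, stop))

tail_harmonic = sup_tail(harmonic, 1000, 2000)          # small: m_f is a norm limit of finite-rank operators
partial_harmonic = trace_norm_partial(harmonic, 10**4)  # keeps growing like log(n)
partial_square = trace_norm_partial(inverse_square, 10**5)
print(tail_harmonic, partial_harmonic, partial_square, math.pi ** 2 / 6)
```

Under these assumptions, *m<sub>f</sub>* with *f*(*n*) = 1/*n* is compact but not trace-class, while *f*(*n*) = 1/*n*<sup>2</sup> gives a trace-class operator whose trace norm approaches π<sup>2</sup>/6.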

Proposition B.144. *The map*

$$\text{Tr} : B\_1(H) \to \mathbb{C}; \tag{B.472}$$

$$a \mapsto \sum\_{i} \langle \upsilon\_{i}, a\upsilon\_{i} \rangle,\tag{B.473}$$

*where* (υ*i*) *is some basis of H, is well defined, (obviously) linear, and independent of the choice of basis. Furthermore,* (A.78)*, i.e.,* Tr(*ab*) = Tr(*ba*)*, holds.*

*Proof.* Taking *b* = 1<sub>*H*</sub> in (A.100), we have |Tr(*a*)| ≤ ‖*a*‖<sub>1</sub> < ∞ for *a* ∈ *B*<sub>1</sub>(*H*). Independence of the choice of basis follows by first decomposing *a* = *a*′ + *ia*″, with *a*′ = ½(*a* + *a*<sup>∗</sup>) and *a*″ = −½*i*(*a* − *a*<sup>∗</sup>) self-adjoint, as usual, and subsequently using Theorem B.132 to write *a*′ = *a*′<sub>+</sub> − *a*′<sub>−</sub>, with

$$a'\_{\pm} = \pm \sum\_{\lambda \in \sigma\_p(a') \cap \mathbb{R}\_{\pm}} \lambda \cdot e\_{\lambda}, \tag{B.474}$$

and likewise for *a*″. This makes *a* a linear combination of four positive operators, whence the claim follows from (B.458) and the obvious linearity of (B.473).

To establish (A.78), we first note that Tr(*au*) = Tr(*ua*) for any unitary *u*; this is the same as (A.79), which has just been proved. The claim then follows from the following (generally useful) lemma. □

Lemma B.145. *Any a* ∈ *B*(*H*) *is a linear combination of at most four unitaries.*

*Proof.* By the previous argument, we may assume that *a*<sup>∗</sup> = *a*, and for convenience we also assume that ‖*a*‖ ≤ 1. In that case, ‖*a*ψ‖ ≤ ‖ψ‖ and hence 1 − *a*<sup>2</sup> ≥ 0, so that √(1 − *a*<sup>2</sup>) is defined, cf. Lemma B.141. Defining the two operators

$$
u\_{\pm} = a \pm i\sqrt{1 - a^2},\tag{B.475}
$$

we find *u*<sub>±</sub><sup>∗</sup>*u*<sub>±</sub> = *u*<sub>±</sub>*u*<sub>±</sub><sup>∗</sup> = 1<sub>*H*</sub>, making each *u*<sub>±</sub> unitary, and *a* = ½(*u*<sub>+</sub> + *u*<sub>−</sub>). If *a* ≠ *a*<sup>∗</sup>, the number of terms at most doubles. □
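The construction (B.475) is easy to verify for a small matrix. The following Python sketch is an illustration under our own assumptions: a hypothetical real symmetric 2×2 matrix with norm below 1, helper names of our choosing, and the closed-form 2×2 square root √*m* = (*m* + *s*·1)/*t* with *s* = √det *m* and *t* = √(tr *m* + 2*s*), which replaces the general functional calculus of Lemma B.141.

```python
import math

def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(p):
    """Conjugate transpose of a 2x2 matrix."""
    return [[p[j][i].conjugate() for j in range(2)] for i in range(2)]

def sqrt_2x2_positive(m):
    """Square root of a positive definite real symmetric 2x2 matrix via
    sqrt(m) = (m + s*1) / t, with s = sqrt(det m), t = sqrt(tr m + 2 s)."""
    s = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    t = math.sqrt(m[0][0] + m[1][1] + 2 * s)
    return [[(m[i][j] + (s if i == j else 0.0)) / t for j in range(2)]
            for i in range(2)]

a = [[0.3, 0.4], [0.4, -0.2]]            # self-adjoint, norm below 1
a_squared = matmul(a, a)
one_minus_a2 = [[(1.0 if i == j else 0.0) - a_squared[i][j] for j in range(2)]
                for i in range(2)]
root = sqrt_2x2_positive(one_minus_a2)   # sqrt(1 - a^2)

u_plus = [[a[i][j] + 1j * root[i][j] for j in range(2)] for i in range(2)]
u_minus = [[a[i][j] - 1j * root[i][j] for j in range(2)] for i in range(2)]
gram = matmul(adjoint(u_plus), u_plus)   # should be the identity matrix
average = [[(u_plus[i][j] + u_minus[i][j]) / 2 for j in range(2)]
           for i in range(2)]            # should reproduce a
```

The check works because √(1 − *a*<sup>2</sup>) is a function of *a* and hence commutes with it, so the cross terms in *u*<sub>±</sub><sup>∗</sup>*u*<sub>±</sub> cancel.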

The deeper significance of the trace-class operators now emerges.

Theorem B.146. *For any Hilbert space H, we have dualities and double dualities*

$$B\_0(H)^\* \cong B\_1(H);\tag{B.476}$$

$$B\_1(H)^\* \cong B(H);\tag{B.477}$$

$$B\_0(H)^{\*\*} \cong B(H);\tag{B.478}$$

$$B\_1(H)^{\*\*} \cong B(H)^\*,\tag{B.479}$$

*where the symbol* ≅ *stands for* isometric isomorphism*. Explicitly:*

• *Any norm-continuous linear map* ω : *B*<sub>0</sub>(*H*) → ℂ *takes the form*

$$\omega(b) = \text{Tr}(ab),\tag{B.480}$$

*for some a* ∈ *B*<sub>1</sub>(*H*) *uniquely determined by* ω*, and* vice versa*, giving a bijective correspondence between* ω ∈ *B*<sub>0</sub>(*H*)<sup>∗</sup> *and a* ∈ *B*<sub>1</sub>(*H*) *satisfying*

$$\|\omega\| = \|a\|\_1. \tag{B.481}$$

*This equality remains valid if* ω *is regarded as an element of B*(*H*)∗ *via* (B.479) *and the isometric embedding B*1(*H*) → *B*1(*H*)∗∗ *(cf. Proposition B.44).*

• *Any norm-continuous linear map* χ : *B*1(*H*) → C *takes the form*

$$\chi(a) = \text{Tr}\,(ab),\tag{B.482}$$

*for some b* ∈ *B*(*H*) *uniquely determined by* χ*, and* vice versa*, giving a bijective correspondence between* χ ∈ *B*<sub>1</sub>(*H*)<sup>∗</sup> *and b* ∈ *B*(*H*) *satisfying*

$$\|\chi\| = \|b\|. \tag{B.483}$$

*Proof.* It is clear from (A.100) that *B*<sub>1</sub>(*H*) ⊆ *B*<sub>0</sub>(*H*)<sup>∗</sup>, with ‖ω‖ ≤ ‖*a*‖<sub>1</sub>. For the opposite direction, we return to the projections *e<sub>n</sub>* in the proof of part 2 of Lemma B.142. Taking the trace over the basis (υ<sub>*i*</sub>), we have

$$\begin{aligned} \|a\|\_1 &= \operatorname{Tr}(|a|) = \lim\_{n} \operatorname{Tr}(e\_n |a| e\_n) = \lim\_{n} \operatorname{Tr}(e\_n |a|) = \lim\_{n} \operatorname{Tr}(e\_n u^\* a) \\ &= \lim\_{n} \omega(e\_n u^\*); \end{aligned} \tag{B.484}$$

since ω(*e<sub>n</sub>u*<sup>∗</sup>) ≥ 0, we have ω(*e<sub>n</sub>u*<sup>∗</sup>) ≤ ‖ω‖‖*e<sub>n</sub>u*<sup>∗</sup>‖ ≤ ‖ω‖, whence ‖*a*‖<sub>1</sub> ≤ ‖ω‖ (note that the limiting procedure is necessary here, since ω(*u*<sup>∗</sup>) would not be defined because typically *u*<sup>∗</sup> is not compact). This proves (B.481).

To prove (B.476), it remains to be shown that every ω ∈ *B*<sub>0</sub>(*H*)<sup>∗</sup> can be represented as (B.480). Noting that *B*<sub>0</sub>(*H*) is the norm-closure of the linear span of all operators of the sort *b* = |ψ⟩⟨ϕ|, where ψ, ϕ ∈ *H* are unit vectors, the functional ω is determined by its values on those operators. Given ω, we define *a* by its matrix elements ⟨ϕ, *a*ψ⟩ = ω(|ψ⟩⟨ϕ|). Evaluating the trace on a basis containing ϕ yields Tr(*a*|ψ⟩⟨ϕ|) = ⟨ϕ, *a*ψ⟩ and hence gives (B.480) on operators *b* of the said form, upon which the general case follows by continuity.

We now prove (B.477). As in the previous case, the inclusion *B*(*H*) ⊂ *B*<sub>1</sub>(*H*)<sup>∗</sup> is clear from (A.100), as is the inequality ‖χ‖ ≤ ‖*b*‖. This time, the proof of the opposite inequality uses *a* = |ψ⟩⟨ϕ|, in which case one easily obtains

$$\||\psi\rangle\langle\varphi|\|\_{1} = \|\psi\| \|\varphi\|,\tag{B.485}$$

which in the case of unit vectors equals unity. Assuming (B.482), this gives

$$|\chi(a)| = |\chi(|\psi\rangle\langle\varphi|)| = |\text{Tr}(|\psi\rangle\langle\varphi|b)| = |\langle\varphi, b\psi\rangle| \le \|\chi\| \||\psi\rangle\langle\varphi|\|\_1 = \|\chi\|.\tag{B.486}$$

Combined with (B.228), this gives ‖*b*‖ ≤ ‖χ‖, and hence (B.483).

Finally, as in the previous case, given χ, we find *b* through its matrix elements ⟨ϕ, *b*ψ⟩ = χ(|ψ⟩⟨ϕ|), which gives (B.482) on the special trace-class operators defined by *a* = |ψ⟩⟨ϕ|. Noting that the linear span of such operators is dense (in the trace-norm) in *B*<sub>1</sub>(*H*), once again this gives the general case by continuity. □

Corollary B.147. *1. The vector space B*<sub>1</sub>(*H*) *is complete in the norm* (B.464)*. 2. B*<sub>1</sub>(*H*) *is a two-sided ideal in B*(*H*) *(a* ∈ *B*(*H*), *b* ∈ *B*<sub>1</sub>(*H*) ⇒ *ab* ∈ *B*<sub>1</sub>(*H*) ∋ *ba).*

*Proof.* The first claim follows from (B.476) and the completeness of *B*<sub>0</sub>(*H*)<sup>∗</sup> (cf. Theorem B.33 and §B.9). The second follows from (A.100) and (A.78). □

This actually reveals a subtlety in (B.471): as a normed space, *B*0(*H*) simply inherits the norm of *B*(*H*), in which it is complete. Clearly, *B*1(*H*) also inherits the norm of *B*(*H*), but that is the wrong one: firstly, *B*1(*H*) is not complete in the operator norm (indeed, its completion is *B*0(*H*)), and secondly, the operator norm is the wrong one for the fundamental dualities stated in Theorem B.146.

The following trace-class operators occupy the center stage in quantum theory.

Definition B.148. *A* density operator *is a positive operator* ρ ∈ *B*1(*H*) *such that*

$$\text{Tr}(\rho) = 1.\tag{B.487}$$

Equivalently, ρ is a density operator iff it has a norm-convergent expansion

$$\rho = \sum\_{\lambda \in \sigma\_{p}(\rho)} \lambda \cdot e\_{\lambda},\tag{B.488}$$

where σ<sub>*p*</sub>(ρ) is some countable subset of ℝ<sup>+</sup> with 0 as its only possible accumulation point, the multiplicity *m*<sub>λ</sub> = dim(*H*<sub>λ</sub>) of each eigenvalue λ > 0 is finite, and

$$\sum\_{\lambda \in \sigma\_p(\rho)} \lambda \cdot m\_{\lambda} = 1. \tag{B.489}$$

Similarly, (2.6) holds just as in finite dimension, i.e., (B.488) is equivalent to

$$\rho = \sum\_{i} p\_{i} |\upsilon\_{i}\rangle\langle\upsilon\_{i}|,\tag{B.490}$$

where (υ*i*) is a basis of *H*, and the coefficients (*pi*) satisfy *pi* > 0 and ∑*<sup>i</sup> pi* = 1. Furthermore, the *pi* have 0 as their only possible accumulation point and are such that each *t* > 0 occurs in the set {*pi*} at most finitely many times. Like (B.488), also the equivalent expansion (B.490) is norm-convergent by Theorem B.136.
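Norm convergence of (B.490) can be made concrete for a diagonal density operator. The Python sketch below is illustrative only: the weights *p<sub>i</sub>* = 2<sup>−*i*</sup>, the finite cutoff at 50 terms, and the function names are our own assumptions. For a positive diagonal remainder, the operator norm is its largest eigenvalue and the trace norm is the sum of its eigenvalues.

```python
def geometric_weights(n_terms):
    """Eigenvalues p_i = 2^(-i), i = 1..n_terms, of a hypothetical density operator."""
    return [2.0 ** (-i) for i in range(1, n_terms + 1)]

def truncation_errors(weights, n_kept):
    """Errors made by truncating rho = sum_i p_i |v_i><v_i| after n_kept terms,
    returned as (operator norm of remainder, trace norm of remainder)."""
    tail = weights[n_kept:]
    return max(tail), sum(tail)

weights = geometric_weights(50)
total = sum(weights)                     # close to Tr(rho) = 1
operator_error, trace_error = truncation_errors(weights, 10)
print(total, operator_error, trace_error)
```

In this example both errors decay geometrically with the number of retained terms, the trace-norm error being twice the operator-norm error.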

Definition B.149. *Let H be a separable Hilbert space. An operator a* ∈ *B*(*H*) *is called a* Hilbert–Schmidt operator *if for some (and hence any) basis* (υ*i*) *of H,*

$$\sum\_{i} \left\| a \upsilon\_{i} \right\|^{2} < \infty.\tag{B.491}$$

*We write B*<sub>2</sub>(*H*) *for the set of all Hilbert–Schmidt operators on H.*

The argument that the sum in (B.491) is independent of the basis is based on (B.215) and is analogous to the computation (B.458), this time even without the complication of the square root, for we simply have ∑<sub>*i*</sub> ‖*a*υ<sub>*i*</sub>‖<sup>2</sup> = ∑<sub>*i*</sub>⟨*a*υ<sub>*i*</sub>, *a*υ<sub>*i*</sub>⟩, etc. For *a* ∈ *B*<sub>2</sub>(*H*), with foresight we define the expression (where (υ<sub>*i*</sub>) is any basis of *H*):

$$\|a\|\_2 = \sqrt{\operatorname{Tr}\left(a^\*a\right)} = \left(\sum\_i \|a\upsilon\_i\|^2\right)^{1/2}.\tag{B.492}$$

Theorem B.150. *Let H be a separable Hilbert space.*

*1. For any a* ∈ *B*(*H*) *we have*

$$\|a\| \le \|a\|\_{2} \le \|a\|\_{1}. \tag{B.493}$$

*2. Every Hilbert–Schmidt operator is compact, and refining* (B.471) *one has*

$$B\_1(H) \subset B\_2(H) \subset B\_0(H). \tag{B.494}$$

*3. The Hilbert–Schmidt operators B*2(*H*) *form a Hilbert space with inner product*

$$\langle a, b \rangle\_2 = \text{Tr} \left( a^\* b \right), \tag{B.495}$$

*and a Banach space in the ensuing norm* (A.2)*, which equals (B.492). Clearly,*

$$B\_2(H)^\* \cong B\_2(H). \tag{B.496}$$


$$\|a\|\_{2}^{2} = \sum\_{i} \mu\_{i} \le \left(\sum\_{i} \sqrt{\mu\_{i}}\right)^{2} = \|a\|\_{1}^{2},\tag{B.497}$$

where the μ<sub>*i*</sub> ≥ 0 are the eigenvalues of the positive compact operator *a*<sup>∗</sup>*a*; the eigenvalues of the compact operator |*a*| = √(*a*<sup>∗</sup>*a*) are (√μ<sub>*i*</sub>).

3. We first show that *B*2(*H*) is a vector space. For any *a*,*b* ∈ *B*(*H*) we have

$$2(a^\*a + b^\*b) = (a+b)^\*(a+b) + (a-b)^\*(a-b),\tag{B.498}$$

so that (*a* + *b*)<sup>∗</sup>(*a* + *b*) ≤ 2(*a*<sup>∗</sup>*a* + *b*<sup>∗</sup>*b*) and hence ‖*a* + *b*‖<sub>2</sub><sup>2</sup> ≤ 2(‖*a*‖<sub>2</sub><sup>2</sup> + ‖*b*‖<sub>2</sub><sup>2</sup>). Therefore, if *a*, *b* ∈ *B*<sub>2</sub>(*H*), then *a* + *b* ∈ *B*<sub>2</sub>(*H*). Since ‖λ*a*‖<sub>2</sub> = |λ|‖*a*‖<sub>2</sub>, it is clear that if *a* ∈ *B*<sub>2</sub>(*H*), then λ*a* ∈ *B*<sub>2</sub>(*H*). Hence *B*<sub>2</sub>(*H*) is a vector space. Furthermore, because of the identity

$$a^\*b = \frac{1}{4} \sum\_{k=0}^3 i^k (b + i^k a)^\* (b + i^k a),\tag{B.499}$$

the inner product (B.495) may be rewritten as

$$\langle a, b \rangle\_2 = \sum\_{i} \langle e\_i, a^\* b e\_i \rangle = \frac{1}{4} \sum\_{k=0}^{3} i^k ||(b + i^k a)||\_2^2,\tag{B.500}$$

which shows that if *a*, *b* ∈ *B*<sub>2</sub>(*H*), then |⟨*a*, *b*⟩<sub>2</sub>| < ∞. This reconfirms the fact that the trace in (B.495) may be computed in any basis, since this is true for each term on the right-hand side of (B.500). Sesquilinearity of (B.495) is straightforward.

To prove positive definiteness, we use part 1: if ‖*a*‖<sub>2</sub> = 0, then ‖*a*‖ = 0 and hence *a* = 0, since we already know that ‖·‖ is a norm.

Knowing that (B.495) is an inner product on *B*<sub>2</sub>(*H*), it immediately follows that ‖·‖<sub>2</sub> is a norm on *B*<sub>2</sub>(*H*), since, as already noted, ‖*a*‖<sub>2</sub><sup>2</sup> = ⟨*a*, *a*⟩<sub>2</sub>.

Finally, to prove completeness, we pick a basis (υ<sub>*i*</sub>) in *H* and note that *B*<sub>2</sub>(*H*) is the closure of the linear span of all operators of the form *a* = ∑<sub>*i*,*j*</sub> *c<sub>ij</sub>*|υ<sub>*i*</sub>⟩⟨υ<sub>*j*</sub>|. This is because of the continuity of the inclusion *B*<sub>2</sub>(*H*) ⊂ *B*<sub>0</sub>(*H*) (which is true because of part 1 and the fact that *B*<sub>0</sub>(*H*) is itself the closure of this linear span). An easy calculation then gives

$$\|a\|\_{2}^{2} = \|\sum\_{i,j} c\_{ij} |\upsilon\_{i}\rangle\langle\upsilon\_{j}|\|\_{2}^{2} = \sum\_{i,j} |c\_{ij}|^{2}.\tag{B.501}$$

Hence *B*<sub>2</sub>(*H*) is isometrically isomorphic to the space of square-summable sequences (*c<sub>ij</sub>*) indexed by ℕ × ℕ, which by Theorem B.9 is complete in the ℓ<sup>2</sup> norm ‖*c*‖<sub>2</sub><sup>2</sup> = ∑<sub>*i*,*j*</sub> |*c<sub>ij</sub>*|<sup>2</sup>. Hence *B*<sub>2</sub>(*H*) is complete, too.

4. From (A.78) (proved in Proposition B.144) we have Tr(*a*<sup>∗</sup>*a*) = Tr(*aa*<sup>∗</sup>), so that *a* ∈ *B*<sub>2</sub>(*H*) iff *a*<sup>∗</sup> ∈ *B*<sub>2</sub>(*H*). If *b* ∈ *B*(*H*) and *a* ∈ *B*<sub>2</sub>(*H*), then ‖*ba*υ<sub>*i*</sub>‖ ≤ ‖*b*‖‖*a*υ<sub>*i*</sub>‖ and hence ‖*ba*‖<sub>2</sub> ≤ ‖*b*‖‖*a*‖<sub>2</sub>, so *ba* ∈ *B*<sub>2</sub>(*H*), and hence also *a*<sup>∗</sup> ∈ *B*<sub>2</sub>(*H*) and *a*<sup>∗</sup>*b*<sup>∗</sup> ∈ *B*<sub>2</sub>(*H*). Similarly, *ab* ∈ *B*<sub>2</sub>(*H*), with ‖*ab*‖<sub>2</sub> ≤ ‖*b*‖‖*a*‖<sub>2</sub>. □
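The chain of inequalities (B.493), in the form used in (B.497), can be checked on a small example. The Python sketch below is illustrative: the restriction to real 2×2 matrices, the sample matrix, and the function name are our own assumptions. It computes the eigenvalues μ<sub>1</sub> ≥ μ<sub>2</sub> ≥ 0 of *a*<sup>∗</sup>*a* from its trace and determinant and forms the three norms from them.

```python
import math

def norms_2x2(a):
    """Operator, Hilbert-Schmidt and trace norm of a real 2x2 matrix, computed
    from the eigenvalues mu_1 >= mu_2 >= 0 of a^T a."""
    # entries of the positive matrix g = a^T a
    g00 = a[0][0] ** 2 + a[1][0] ** 2
    g11 = a[0][1] ** 2 + a[1][1] ** 2
    g01 = a[0][0] * a[0][1] + a[1][0] * a[1][1]
    trace, det = g00 + g11, g00 * g11 - g01 ** 2
    disc = math.sqrt(max(trace ** 2 - 4 * det, 0.0))
    mu1, mu2 = (trace + disc) / 2, max((trace - disc) / 2, 0.0)
    operator_norm = math.sqrt(mu1)                 # largest eigenvalue of |a|
    hs_norm = math.sqrt(mu1 + mu2)                 # square root of sum of mu_i
    trace_norm = math.sqrt(mu1) + math.sqrt(mu2)   # sum of sqrt(mu_i)
    return operator_norm, hs_norm, trace_norm

op, hs, tr = norms_2x2([[1.0, 2.0], [0.0, 1.0]])
print(op, hs, tr)  # increasing, as in (B.493)
```

For this sample matrix *a*<sup>∗</sup>*a* has trace 6 and determinant 1, so that ‖*a*‖<sub>2</sub> = √6 and ‖*a*‖<sub>1</sub> = 2√2.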

#### B.21 Spectral theory for unbounded self-adjoint operators

Although there is hardly any distinction between bounded and unbounded self-adjoint operators in so far as the definition and elementary properties of the spectrum are concerned (cf. Definitions B.80 and B.85, Theorem B.91, and Theorem B.93), extending the various versions of the spectral theorem to the unbounded case is a highly nontrivial matter. There are many ways of accomplishing this, among which our presentation has the virtues that firstly (in contrast to von Neumann's original approach based on the Cayley transform) we stay within the realm of self-adjointness, and secondly we preserve the C\*-algebraic spirit of Theorem B.94. Thirdly, our treatment is sufficiently general to cover the two main applications in quantum mechanics (viz. the Born rule and Stone's Theorem). For those applications, setting up a functional calculus for bounded Borel functions suffices, but in order to state even the defining property id<sub>σ(*a*)</sub> ↦ *a* of the functional calculus also for unbounded *a* (cf. Theorem B.94), unbounded *continuous* functions will also have to be incorporated (but we refrain from a further generalization to unbounded *Borel* functions).

Our approach starts from the observation that (with slight abuse of notation)

$$y : \mathbb{R} \to (-1, 1); \tag{B.502}$$

$$y(x) = x(1+x^2)^{-1/2};\tag{B.503}$$

$$y^{-1}(x) = x(1-x^2)^{-1/2},\tag{B.504}$$

provides a homeomorphism ℝ ≅ (−1, 1). This has an operatorial counterpart

$$a \mapsto a(1\_H + a^2)^{-1/2} \equiv b;\tag{\text{B.505}}$$

$$b \mapsto b(1\_H - b^2)^{-1/2} \equiv a,\tag{B.506}$$

where the notation for the square roots should be carefully disambiguated as

$$(1\_H + a^2)^{-1/2} \equiv ((1\_H + a^2)^{-1})^{1/2};\tag{B.507}$$

$$(1\_H - b^2)^{-1/2} \equiv ((1\_H - b^2)^{1/2})^{-1}.\tag{B.508}$$

As we shall see, the operator (1<sub>*H*</sub> + *a*<sup>2</sup>)<sup>−1</sup> is bounded (and so is 1<sub>*H*</sub> − *b*<sup>∗</sup>*b*, of course), so that square roots are only taken of *bounded* operators, in which case they are defined by Lemma B.141. As in the numerical case (B.503), the correspondence *a* ↔ *b* in (B.505) - (B.506) will turn out to be bijective, mapping the class of (possibly unbounded) self-adjoint operators into the class of self-adjoint pure contractions:

Definition B.151. *A* pure contraction *is a bounded operator b* : *H* → *H for which*

$$\|b\psi\| < \|\psi\| \quad (\psi \in H\backslash\{0\}).\tag{B.509}$$

If *b* is in addition self-adjoint, this is equivalent to ‖*b*‖ ≤ 1 and ker(*b* ± 1<sub>*H*</sub>) = {0}, i.e., ±1 ∉ σ<sub>*p*</sub>(*b*); the argument is similar to the proof of Lemma B.137.
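The numerical maps (B.503) and (B.504) behind the bounded transform can be explored directly. The following Python sketch is a minimal illustration; the function names and sample points are our own choices. Applied to the eigenvalues of a diagonal self-adjoint operator *a*, the first map produces the eigenvalues of the corresponding *b*, all of modulus strictly below 1.

```python
import math

def bounded_transform(x):
    """y(x) = x (1 + x^2)^(-1/2), mapping the real line onto (-1, 1)."""
    return x / math.sqrt(1.0 + x * x)

def inverse_bounded_transform(y):
    """y^{-1}(x) = x (1 - x^2)^(-1/2), defined only on the open interval (-1, 1)."""
    if not -1.0 < y < 1.0:
        raise ValueError("argument must lie in the open interval (-1, 1)")
    return y / math.sqrt(1.0 - y * y)

samples = [-1000.0, -2.5, 0.0, 0.1, 3.0, 1000.0]   # e.g. eigenvalues of a diagonal a
images = [bounded_transform(x) for x in samples]    # eigenvalues of b, all in (-1, 1)
round_trip = [inverse_bounded_transform(y) for y in images]
print(images)
print(round_trip)
```

Large arguments are sent close to ±1 without ever reaching them, which is the scalar shadow of the condition ±1 ∉ σ<sub>*p*</sub>(*b*). In floating-point arithmetic the inverse loses accuracy near ±1, so the round trip is only approximate for large |*x*|.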

Eqs. (B.505) - (B.506) form a special case of a more general correspondence.

Theorem B.152. *The formal expressions*

$$b = a(1\_H + a^\*a)^{-1/2} \equiv a((1\_H + a^\*a)^{-1})^{1/2};\tag{B.510}$$

$$a = b(1\_H - b^\*b)^{-1/2} \equiv b((1\_H - b^\*b)^{1/2})^{-1},\tag{B.511}$$

*make rigorous sense and define a bijective correspondence between the class of* closed operators *a (with dense domain) and the class of (necessarily bounded)* pure contractions *b. This correspondence preserves the adjoint, in that*

$$b^\* = a^\*(1\_H + aa^\*)^{-1/2};\tag{B.512}$$

$$a^\* = b^\*(1\_H - bb^\*)^{-1/2},\tag{B.513}$$

*and hence specializes to a bijective correspondence* (B.505) *-* (B.506) *between* self-adjoint operators *a and* self-adjoint pure contractions *b.*

*The (bounded) operator b is called the* bounded transform *of a.*

*Proof.* 1. *From b to a*. If *b* is a pure contraction, then 1*<sup>H</sup>* −*b*∗*b* ≥ 0, since this means

$$
\langle \Psi, b^\* b \Psi \rangle \le \langle \Psi, \Psi \rangle,\tag{B.514}
$$

or *b*ψ<sup>2</sup> ≤ ψ2. Furthermore, 1*<sup>H</sup>* <sup>−</sup> *<sup>b</sup>*∗*<sup>b</sup>* is injective, since (1*<sup>H</sup>* <sup>−</sup> *<sup>b</sup>*∗*b*)<sup>ψ</sup> <sup>=</sup> <sup>0</sup> implies ψ<sup>2</sup> <sup>=</sup> *b*ψ2, contradicting (B.509). This implies that (1*<sup>H</sup>* <sup>−</sup> *<sup>b</sup>*∗*b*)1/<sup>2</sup> is injective, as (1*<sup>H</sup>* <sup>−</sup>*b*∗*b*)1/2<sup>ψ</sup> <sup>=</sup> 0 implies (1*<sup>H</sup>* <sup>−</sup>*b*∗*b*)<sup>ψ</sup> <sup>=</sup> 0 and hence <sup>ψ</sup> <sup>=</sup> 0. Thus the inverse (B.508) exists, with domain

$$D(\left(1\_H - b^\*b\right)^{-1/2}) = \text{ran}(\left(1\_H - b^\*b\right)^{1/2}).\tag{B.515}$$

This domain is dense in *H*, since for any $c \in B(H)$ (which in our case is $c = (1\_H - b^\*b)^{1/2}$) we have $H = \ker(c) \oplus \ker(c)^\perp$; for $c^\* = c$ we have $\ker(c) = \mathrm{ran}(c)^\perp$ and hence $\ker(c)^\perp = \mathrm{ran}(c)^-$, so that injectivity of *c* yields $H = \mathrm{ran}(c)^-$. Hence (B.511) is well defined on

$$D(a) = \text{ran}((1\_H - b^\*b)^{1/2}).\tag{B.516}$$

To prove that *a* is closed, we write $a = bc^{-1}$, with *c* as above, and note that, since $a(c\psi) = b\psi$,

$$G(a) = \{ (c\Psi, b\Psi), \Psi \in H \} = \text{ran}(v), \tag{B.517}$$

where $v : H \to H \oplus H$ is defined by $v\psi = (c\psi, b\psi)$. Hence

$$\|v\Psi\|^2 = \|c\Psi\|^2 + \|b\Psi\|^2 = \|(1\_H - b^\*b)^{1/2}\Psi\|^2 + \|b\Psi\|^2 = \|\Psi\|^2,\tag{B.518}$$

so that *v* is an isometry. As such, ran(*v*) = *G*(*a*) is closed.

2. *From a to b*. By definition, $D(1\_H + a^\*a) = D(a^\*a)$, with

$$D(a^\*a) = \{ \Psi \in D(a) \mid a\Psi \in D(a^\*) \}. \tag{B.519}$$

We show that $1\_H + a^\*a : D(a^\*a) \to H$ is bijective. First, (B.237) implies

$$H \oplus H = G(a) \oplus G(a)^\perp = G(a) \oplus \iota G(a^\*),\tag{B.520}$$

so for any (ψ1,ψ2) ∈ *H* ⊕*H* there are *unique* ϕ ∈ *D*(*a*) and χ ∈ *D*(*a*∗) such that

$$
\Psi\_1 = \varphi - a^\*\chi; \tag{B.521}
$$

$$
\Psi\_2 = a\varphi + \chi. \tag{B.522}
$$

In particular, for $(\psi\_1, \psi\_2) = (\psi, 0)$, eq. (B.522) gives $\chi = -a\varphi$ (so that $\varphi \in D(a^\*a)$), and (B.521) becomes

$$
\Psi = (1\_H + a^\*a)\varphi.\tag{B.523}
$$

This shows both surjectivity and injectivity, since ϕ is uniquely determined by ψ. Consequently, the inverse

$$(1\_H + a^\*a)^{-1} : H \to D(a^\*a) \tag{B.524}$$

exists as a linear map, and since

$$\|(1\_H + a^\*a)^{-1}\Psi\| = \|\varphi\| \le \|(1\_H + a^\*a)\varphi\| = \|\Psi\|,\tag{B.525}$$

we see that $(1\_H + a^\*a)^{-1}$ is bounded, with $\|(1\_H + a^\*a)^{-1}\| \le 1$. A similar argument shows that $(1\_H + a^\*a)^{-1}$ is positive:

$$\langle \Psi, (1\_H + a^\*a)^{-1} \Psi \rangle = \langle (1\_H + a^\*a)\varphi, \varphi \rangle = \|\varphi\|^2 + \|a\varphi\|^2 \ge 0,\tag{B.526}$$

so that the square root (B.507) exists. As before, injectivity of $(1\_H + a^\*a)^{-1}$ implies injectivity of its square root, whence $\mathrm{ran}((1\_H + a^\*a)^{-1/2})$ is dense in *H*. Clearly, $(1\_H + a^\*a)^{-1/2}$ maps $\mathrm{ran}((1\_H + a^\*a)^{-1/2})$ to

$$\text{ran}((1\_H + a^\*a)^{-1}) = D(a^\*a) \subseteq D(a),\tag{B.527}$$

so that the operator *b* in (B.510) is defined on $\mathrm{ran}((1\_H + a^\*a)^{-1/2})$. We now show that *b* is bounded on the latter: for any $\psi \in H$ we have

$$\begin{aligned} \|b(1\_H + a^\*a)^{-1/2}\Psi\|^2 &= \|a(1\_H + a^\*a)^{-1}\Psi\|^2 \\ &= \langle (1\_H + a^\*a)^{-1}\Psi, a^\*a(1\_H + a^\*a)^{-1}\Psi \rangle \\ &\le \langle (1\_H + a^\*a)^{-1}\Psi, (1\_H + a^\*a)(1\_H + a^\*a)^{-1}\Psi \rangle \\ &= \langle (1\_H + a^\*a)^{-1}\Psi, \Psi \rangle = \|(1\_H + a^\*a)^{-1/2}\Psi\|^2, \end{aligned}\tag{B.528}$$

so that *b* may be extended to all of *H* by continuity, with $\|b\| \le 1$. Still denoting this extension by *b*, we have

$$b^\*b = (1\_H + a^\*a)^{-1/2}a^\*a(1\_H + a^\*a)^{-1/2} = 1\_H - (1\_H + a^\*a)^{-1},\qquad(\text{B.529})$$

from which it easily follows that *b* is a pure contraction: for any $\psi \neq 0$, we have

$$\|b\Psi\|^2 = \langle \Psi, b^\*b\Psi \rangle = \|\Psi\|^2 - \|(1\_H + a^\*a)^{-1/2}\Psi\|^2 < \|\Psi\|^2,\tag{B.530}$$

since $\|(1\_H + a^\*a)^{-1/2}\psi\|^2 > 0$ by injectivity of $(1\_H + a^\*a)^{-1/2}$.

3. *Bijectivity of the correspondence a* ↔ *b.* If *a* is determined by *b* according to (B.511), then

$$1\_H + a^\*a = (1\_H - b^\*b)^{-1},\tag{B.531}$$

so that

$$(1\_H - b^\*b)^{1/2} = (1\_H + a^\*a)^{-1/2},\tag{B.532}$$

whence

$$b = b(1\_H - b^\*b)^{-1/2}(1\_H - b^\*b)^{1/2} = a(1\_H - b^\*b)^{1/2} = a(1\_H + a^\*a)^{-1/2}.\tag{B.533}$$

Similarly, if *b* is defined by *a* according to (B.510), then (B.529), rewritten as

$$1\_H - b^\*b = (1\_H + a^\*a)^{-1},\tag{B.534}$$

reproduces (B.511). To see that the domains match, in view of (B.516) we need

$$D(a) = \text{ran}((1\_H + a^\*a)^{-1/2}).\tag{B.535}$$

The inclusion $D(a) \supseteq \mathrm{ran}((1\_H + a^\*a)^{-1/2})$ already having been established in step 2 above, we prove the opposite inclusion ⊆. Indeed, for any $\psi \in D(a)$ we have

$$\Psi = (1\_H + a^\*a)^{-1/2} (b^\*a + (1\_H + a^\*a)^{-1/2}) \Psi,\tag{B.536}$$

where *b* is given by (B.510). This follows by taking inner products with ϕ ∈ *H*:

$$
\begin{split}
\langle \varphi, (1\_H + a^\*a)^{-1/2} b^\* a \Psi \rangle &+ \langle \varphi, (1\_H + a^\*a)^{-1} \Psi \rangle \\
&= \langle a^\*a (1\_H + a^\*a)^{-1} \varphi, \Psi \rangle + \langle \varphi, (1\_H + a^\*a)^{-1} \Psi \rangle = \langle \varphi, \Psi \rangle.
\end{split} \tag{B.537}
$$

4. *Self-adjointness.* Since *a* is closed we have *a*∗∗ = *a* (cf. Lemma B.74), so using *a*∗ instead of *a* in part 2 above, we have

$$(1\_H + aa^\*)^{-1} : H \to D(aa^\*) \subset D(a^\*), \tag{B.538}$$

bijectively. If, in addition, ψ ∈ *D*(*a*∗), we may compute

$$a^\*\Psi = a^\*(1\_H + aa^\*)(1\_H + aa^\*)^{-1}\Psi = (1\_H + a^\*a)a^\*(1\_H + aa^\*)^{-1}\Psi,\quad(\text{B.539})$$

from which it follows that

$$a^\*(1\_H + aa^\*)^{-1} \Psi = (1\_H + a^\*a)^{-1} a^\* \Psi. \tag{B.540}$$

Similarly, for any polynomial *p* in one real variable we have

$$a^\* p((1\_H + aa^\*)^{-1})\Psi = p((1\_H + a^\*a)^{-1})a^\*\Psi. \tag{\mathbf{B.541}}$$

By Weierstrass, we can find polynomials *pn* such that

$$\lim\_{n \to \infty} p\_n((1+x)^{-1}) = (1+x)^{-1/2},\tag{B.542}$$

for any *x* ≥ 0, also cf. the proof of Lemma B.141. Hence by Theorem B.94 and closedness of *a*∗ we obtain

$$\begin{split} a^\*(1\_H + aa^\*)^{-1/2}\Psi &= (1\_H + a^\*a)^{-1/2}a^\*\Psi = (a(1\_H + a^\*a)^{-1/2})^\*\Psi\\ &= b^\*\Psi, \end{split} \tag{\text{B.543}}$$

for ψ ∈ *D*(*a*∗). Since the latter is dense, we have (B.512). Bijectivity of the correspondence *a* ↔ *b* then also implies (B.513). In particular, *a*∗ = *a* iff *b*∗ = *b*, which implies the last claim of the theorem. □
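As a sanity check on Theorem B.152, the following minimal Python sketch treats the simplest self-adjoint case. It assumes, for illustration only, that *H* is finite-dimensional and that *a* is diagonal, so that (B.510) - (B.511) reduce to the scalar maps λ ↦ λ(1+λ²)<sup>−1/2</sup> and μ ↦ μ(1−μ²)<sup>−1/2</sup> applied to each eigenvalue. The names `bounded_transform` and `inverse_transform` are hypothetical and not taken from any library.

```python
import math


def bounded_transform(eigenvalues):
    """Entrywise lambda -> lambda / sqrt(1 + lambda^2): (B.510) for a diagonal self-adjoint a."""
    return [lam / math.sqrt(1.0 + lam * lam) for lam in eigenvalues]


def inverse_transform(eigenvalues):
    """Entrywise mu -> mu / sqrt(1 - mu^2): (B.511) for a diagonal pure contraction b."""
    return [mu / math.sqrt(1.0 - mu * mu) for mu in eigenvalues]


# Eigenvalues of an (assumed) diagonal self-adjoint operator a.
a = [-1000.0, -2.0, 0.0, 0.5, 3.0]

# Its bounded transform b: every eigenvalue lies strictly inside (-1, 1).
b = bounded_transform(a)

# Applying the inverse correspondence recovers a up to rounding.
recovered = inverse_transform(b)
```

In floating-point arithmetic the round trip loses accuracy as |λ| grows, because μ approaches ±1 and the difference 1 − μ² suffers cancellation.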

Though not needed in what follows, it would be a pity not to state:

Corollary B.153. *If a* : *D*(*a*) → *H is closed (with D*(*a*)<sup>−</sup> = *H), then:*

	- $\iota\_1 : H \to H \oplus H$ *is defined by* $\iota\_1\psi = (\psi, 0)$*;*
	- $e\_{G(a)} : H \oplus H \to H \oplus H$ *is the projection onto the graph G*(*a*)*;*
	- $\pi\_1 : H \oplus H \to H$ *is the projection* $\pi\_1(\psi\_1, \psi\_2) = \psi\_1$ *onto the first coordinate,*

*so that in total we duly have* $\pi\_1 \circ e\_{G(a)} \circ \iota\_1 : H \to H$*.*

*3. The closure of* $a\_{\vert D(a^\*a)}$ *is a (in other words, D*(*a*∗*a*) *is a core for a).*

$$(1\_H + a^\*a)\pi\_1 e\_{G(a)}\iota\_1 = 1\_H;\tag{B.544}$$

$$
\pi\_1 e\_{G(a)} \iota\_1 (1\_H + a^\* a) = 1\_H. \tag{B.545}
$$

3. This is a consequence of the fact that $\mathrm{ran}(1\_H + a^\*a) = H$, cf. part 2 of the above proof. Indeed, we need to show that the graph of the restriction

$$G(a\_{\vert D(a^\*a)}) = \{ (\varphi, a\varphi), \varphi \in D(a^\*a) \}\tag{B.546}$$

is dense in the graph $G(a) = \{(\psi, a\psi), \psi \in D(a)\}$ within $H \oplus H$. In other words, if $\Psi \in G(a)$ satisfies $\langle \Phi, \Psi \rangle\_{H \oplus H} = 0$ for each $\Phi \in G(a\_{\vert D(a^\*a)})$, then $\Psi = 0$. With $\Psi = (\psi, a\psi)$ and $\Phi = (\varphi, a\varphi)$, where $\psi \in D(a)$ and $\varphi \in D(a^\*a)$, we obtain $\langle \Phi, \Psi \rangle\_{H \oplus H} = \langle (1\_H + a^\*a)\varphi, \psi \rangle\_H$, which indeed vanishes for each $\varphi \in D(a^\*a)$ iff $\psi = 0$. □

To get a feeling for the constructions to follow, we first look at the bounded case.

Proposition B.154. *If a* = *a*∗ *is bounded and b is given by* (B.505)*, then*

$$\mathbf{C}^\*(a) = \mathbf{C}^\*(b). \tag{\text{B.547}}$$

*Furthermore,* σ(*a*) ⊂ R *and* σ(*b*) ⊂ (−1,1) *(both included as compact subsets) are homeomorphic via the maps* (B.503) *-* (B.504)*, preserving eigenvalues, that is,*

$$\sigma(a) = \{ \mu (1 - \mu^2)^{-1/2} \mid \mu \in \sigma(b) \};\tag{B.548}$$

$$\sigma(b) = \{\lambda(1+\lambda^2)^{-1/2} \mid \lambda \in \sigma(a)\};\tag{B.549}$$

$$\sigma\_p(a) = \{ \mu (1 - \mu^2)^{-1/2} \mid \mu \in \sigma\_p(b) \};\tag{B.550}$$

$$\sigma\_p(b) = \{\lambda(1+\lambda^2)^{-1/2} \mid \lambda \in \sigma\_p(a)\}.\tag{B.551}$$

*Proof.* By Theorem B.84 and Theorem B.93, σ(*a*) ⊂ R and σ(*b*) ⊆ [−1,1] are compact. We now show that in fact σ(*b*) ⊂ (−1,1); in particular, $\pm 1 \notin \sigma(b)$. For if $\pm 1 \in \sigma(b)$, then $b \mp 1\_H$ is not invertible, so that, given that $\sqrt{1\_H + a^2}$ is invertible, by (B.505) the operator $\sqrt{1\_H + a^2} \mp a$ is not invertible. But since the function

$$f\_{\pm}(\mathbf{x}) = \sqrt{1 + \mathbf{x}^2} \pm \mathbf{x} \tag{\text{B.552}}$$

is strictly positive on any compact subset of R, and

$$
\sqrt{1\_H + a^2} \pm a = f\_{\pm}(a),
\tag{B.553}
$$

the operator in question *is* invertible, with inverse $f\_{\pm}(a)^{-1} = (1/f\_{\pm})(a)$. Contradiction. Having thus localized σ(*b*), it follows that $y^{-1}$ in (B.504) is continuous on σ(*b*), so that, with $a = y^{-1}(b)$, we have $a \in C^\*(b)$ and hence $C^\*(a) \subseteq C^\*(b)$. Similarly, *b* = *y*(*a*) and hence *C*∗(*b*) ⊆ *C*∗(*a*), whence (B.547).

Eqs. (B.550) - (B.551) follow from the explicit construction of the square root in the proof of Lemma B.141: if $c\psi = \lambda\psi$, then $\sqrt{c}\,\psi = \sqrt{\lambda}\,\psi$. Likewise (more trivially), if *c* is invertible (whence $\lambda \neq 0$), then $c^{-1}\psi = \lambda^{-1}\psi$. The same result for the full spectra follows either from the spectral mapping property (C.53), or from the following direct argument. Given (B.547), Theorem B.94 yields an isomorphism $C(\sigma(a)) \cong C(\sigma(b))$ of commutative C\*-algebras, since we have

$$C(\sigma(a)) \xrightarrow[\cong]{f \mapsto f(a)} C^\*(a) = C^\*(b) \xleftarrow[\cong]{g \mapsto g(b)} C(\sigma(b)). \tag{B.554}$$

Eqs. (B.548) - (B.549) then follow from the identities

$$f(a) = (f \circ y^{-1})(b), \; f \in C(\sigma(a));\tag{B.555}$$

$$g(b) = (g \circ y)(a), \; g \in C(\sigma(b)), \tag{B.556}$$

which in turn follow from Theorem B.94. □

Now suppose *a* is unbounded. In that case, its bounded transform *b* remains bounded, but its spectrum contains at least one of the points ±1. We abbreviate

$$
\tilde{\sigma}(b) = \sigma(b) \cap (-1, 1). \tag{B.557}
$$

Proposition B.155. *If a and b are as in Theorem B.152, their spectra are related by*

$$\sigma(a) = \{ \mu (1 - \mu^2)^{-1/2} \mid \mu \in \tilde{\sigma}(b) \};\tag{B.558}$$

$$\sigma(b) = \{\lambda(1+\lambda^2)^{-1/2} \mid \lambda \in \sigma(a)\}^{-};\tag{B.559}$$

$$\sigma\_p(a) = \{ \mu (1 - \mu^2)^{-1/2} \mid \mu \in \sigma\_p(b) \};\tag{B.560}$$

$$\sigma\_p(b) = \{ \lambda (1 + \lambda^2)^{-1/2} \mid \lambda \in \sigma\_p(a) \}. \tag{B.561}$$

If *a* is bounded this duly reduces to (and reproves) eqs. (B.548) - (B.551), since σ(*b*)∩(−1,1) = σ(*b*), and the right-hand side of (B.559) is already closed in R.

Lemma B.156. *Let a* = *a*<sup>∗</sup> ∈ *B*(*H*)*. Then the spectrum* σ(*a*) *according to Definition B.80 coincides with the set* σ(*a*) *in Definition B.81, where A* = *C*∗(*a*)*.*

*Proof.* We must show that if $(a - \lambda)^{-1}$ exists in *B*(*H*), then it exists in $C^\*(a)$ (in the double sense that $(a - \lambda)^{-1}$ lies in $C^\*(a)$ and is the inverse of $a - \lambda$ in $C^\*(a)$); the converse is trivial. Using Theorem B.94 as well as the obvious invariance of the spectrum (as in Definition B.81) under isomorphism, we might as well show that if $(a - \lambda)^{-1}$ exists in *B*(*H*), then the function $(\mathrm{id}\_{\sigma(a)} - \lambda)^{-1}$ exists in $C(\sigma(a))$. This is the case, since, by definition of σ(*a*), the antecedent holds iff $\lambda \notin \sigma(a)$. □

We apply this lemma, with *a* replaced by *b*, in order to prove Proposition B.155.

*Proof.* We know from (B.516) that $\sqrt{1\_H - b^2} : H \to D(a)$ is a bijection. If $\lambda \in \rho(a)$, then both maps in the following diagram are bijections:

$$H \stackrel{\sqrt{1\_H - b^2}}{\longrightarrow} D(a) \stackrel{a - \lambda}{\longrightarrow} H,\tag{B.562}$$

and this is the case iff $(a - \lambda) \circ \sqrt{1\_H - b^2}$ is invertible, which, using (B.505), is true iff $b - \lambda\sqrt{1\_H - b^2}$ is invertible. Hence $\lambda \in \sigma(a)$ iff $b - \lambda\sqrt{1\_H - b^2} \in C^\*(b)$ is not invertible in *B*(*H*), or, equivalently, in $C^\*(b)$. Define $g\_\lambda(y) = y - \lambda\sqrt{1 - y^2}$ in $C(\sigma(b))$, so that $g\_\lambda(b) = b - \lambda\sqrt{1\_H - b^2}$. Theorem B.94 (again with *a* replaced by *b*) then implies that $\lambda \in \sigma(a)$ iff $g\_\lambda$ is not invertible in $C(\sigma(b))$, which according to (B.253) (with λ = 0) is true iff $0 \in \mathrm{ran}(g\_\lambda)$. Since $g\_\lambda(\pm 1) = \pm 1 \neq 0$, even if $\pm 1 \in \sigma(b)$, these values play no role, so that $0 \in \mathrm{ran}(g\_\lambda)$ iff $\lambda = \mu(1 - \mu^2)^{-1/2}$ for some $\mu \in \sigma(b) \cap (-1, 1)$. This yields (B.558) for σ(*a*) and σ(*b*).

The claimed refinement to the point spectrum follows as in the proof of Proposition B.154. The same argument shows that any $\mu \in \sigma(b) \cap (-1, 1)$ must come from $\lambda \in \sigma(a)$, and since σ(*b*) must be a closed subset of [−1,1], this gives (B.559). □

As an illustration, take *a* to be the position operator on $H = L^2(\mathbb{R})$, so that $b = m\_f$ with $f(x) = x/\sqrt{1 + x^2}$. Eq. (B.276) then gives σ(*a*) = R and σ(*b*) = [−1,1].
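The position-operator illustration can be probed numerically. The sketch below assumes nothing beyond the formula $f(x) = x/\sqrt{1 + x^2}$; it samples *f* and records how closely its values approach the endpoint 1 without reaching it, in line with σ(*b*) = [−1,1] being the closure of the range (−1,1) of *f*. The variable names are illustrative only.

```python
import math


def f(x):
    """Symbol of the bounded transform of the position operator: x / sqrt(1 + x^2)."""
    return x / math.sqrt(1.0 + x * x)


# Values of f at a few sample points; all of them lie strictly inside (-1, 1).
samples = [f(x) for x in (-1e6, -10.0, -1.0, 0.0, 1.0, 10.0, 1e6)]

# Distance from f(10^k) to the endpoint +1 for k = 0, ..., 6.
# The gaps shrink towards 0 but stay positive for these moderate arguments
# (for very large x, floating-point rounding would eventually give f(x) == 1.0).
gaps = [1.0 - f(10.0 ** k) for k in range(0, 7)]
```

The endpoints ±1 thus belong to the spectrum of *b* only as limit points of the range of *f*, which is why they are removed in the definition (B.557) before transporting the spectrum back to *a*.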

If *a* is bounded, there are only two (commutative) C\*-algebras to be concerned with in a spectral theorem à la Theorem B.94, viz. *C*(σ(*a*)) and *C*∗(*a*). In the unbounded case, where σ(*a*) ⊆ R is no longer compact, already no fewer than four algebras of continuous functions are associated with the spectrum, namely (cf. §B.3):

1. $C\_c(\sigma(a))$, the continuous functions on σ(*a*) with compact support;
2. $C\_0(\sigma(a))$, the continuous functions on σ(*a*) that vanish at infinity;
3. $C\_b(\sigma(a))$, the bounded continuous functions on σ(*a*);
4. $C(\sigma(a))$, all continuous functions on σ(*a*).

Of these, the second and the third are commutative C\*-algebras in the supremum-norm; the first fails to be closed in this norm, whereas the last does not carry it (as it would be infinite on any unbounded function). We have the obvious inclusions

$$\mathcal{C}\_c(\sigma(a)) \subset \mathcal{C}\_0(\sigma(a)) \subset \mathcal{C}\_b(\sigma(a)) \subset \mathcal{C}(\sigma(a)).\tag{B.563}$$

Each of these plays a role in spectral theory (as do measurable versions of them). On the side of the bounded operator *b*, on top of *C*(σ(*b*)), we have analogous function algebras, this time with inclusions

$$C\_c(\tilde{\sigma}(b)) \subset C\_0(\tilde{\sigma}(b)) \subset C(\sigma(b)) \subset C\_b(\tilde{\sigma}(b)) \subset C(\tilde{\sigma}(b)),\tag{B.564}$$

since $C(\sigma(b))$ consists of all functions *g* in $C\_b(\tilde{\sigma}(b))$ for which $\lim\_{y \to \pm 1} g(y)$ exists, which limit is equal to zero iff $g \in C\_0(\tilde{\sigma}(b))$. Since $y^{-1} : (-1, 1) \to \mathbb{R}$ in (B.504) restricts to a homeomorphism $\tilde{\sigma}(b) \to \sigma(a)$ because of (B.558), the map

$$C\_{\bullet}(\sigma(a)) \stackrel{\cong}{\to} C\_{\bullet}(\tilde{\sigma}(b)), \; f \mapsto f \circ y^{-1}, \tag{B.565}$$

is an isomorphism for • = *c*,0,*b*, or blank (which is isometric for 0 and *b*). If *f* ∈ *C*0(σ(*a*)), as in (B.555) (but no longer assuming *a* to be bounded), we may define

$$f(a) = (f \circ \mathbf{y}^{-1})(b),\tag{\text{B.566}}$$

since $f \circ y^{-1} \in C\_0(\tilde{\sigma}(b))$, and in view of (B.564), the right-hand side is defined by the continuous functional calculus for *b*, i.e., $g \mapsto g(b)$, where $g \in C(\sigma(b))$; the same is then true for $f \in C\_c(\sigma(a))$. Let the (typically non-unital) ∗-algebras

$$C\_c^\*(b) = \{ g(b) \mid g \in C\_c(\tilde{\sigma}(b)) \};\tag{B.567}$$

$$C\_0^\*(b) = \{ g(b) \mid g \in C\_0(\tilde{\sigma}(b)) \},\tag{B.568}$$

be the pertinent images under this calculus. In view of (B.568), we then have

$$C\_c^\*(b) \subset C\_0^\*(b) \subset C^\*(b) \subset M(C\_0^\*(b)) \subset M(C\_c^\*(b)),\tag{B.569}$$

where $M(C\_0^\*(b))$ and $M(C\_c^\*(b))$ are the multiplier algebras of $C\_0^\*(b)$ and $C\_c^\*(b)$, respectively, cf. §C.10. Note that $M(C\_0^\*(b))$ is a C\*-algebra contained in *B*(*H*), whereas $M(C\_c^\*(b))$ consists (partly) of unbounded operators (see below).

Lemma B.157. *The (finite) linear span* $C\_c^\*(b)H$ *of all vectors of the form g*(*b*)ψ*, where* $g \in C\_c(\tilde{\sigma}(b))$ *and* ψ ∈ *H, is dense in H, i.e.,* $(C\_c^\*(b)H)^- = H$*.*

This would be trivial for $C^\*(b)H$, since, unlike $C\_c^\*(b)$, the algebra $C^\*(b)$ contains the unit $1\_H$.

*Proof.* Approximate $1\_{\tilde{\sigma}(b)}$ pointwise by some monotone increasing bounded sequence $(f\_n)$ with compact support, cf. Lemma B.97; for example, define

$$f\_n \colon (-1, 1) \to \mathbb{R};\tag{B.570}$$

$$f\_n(x) = 0 \ (x \in (-1, -1 + 1/n] \text{ or } x \in [1 - 1/n, 1));\tag{B.571}$$

$$f\_n(x) = 1 \ (x \in [-1 + 2/n, 1 - 2/n]),\tag{B.572}$$

and linear interpolation elsewhere. As in (B.317), we then have $f\_n(b) \to 1\_H$ strongly. By definition of $C\_c^\*(b)$, this yields the claim. □
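The cut-off functions (B.570) - (B.572) are easy to write down explicitly. The following minimal sketch uses the closed form $f\_n(x) = \min(1, \max(0, n(1 - |x|) - 1))$, which is our own rewriting of the piecewise definition (it is 0 for $|x| \ge 1 - 1/n$, 1 for $|x| \le 1 - 2/n$, and linear in between); the function name `f_n` and the sample points are illustrative assumptions.

```python
def f_n(n, x):
    """Trapezoidal cut-off of (B.570)-(B.572) on the open interval (-1, 1).

    Equals 0 for |x| >= 1 - 1/n, equals 1 for |x| <= 1 - 2/n,
    and interpolates linearly in between.
    """
    if not -1.0 < x < 1.0:
        raise ValueError("x must lie in the open interval (-1, 1)")
    return min(1.0, max(0.0, n * (1.0 - abs(x)) - 1.0))


# At a fixed point x the values increase with n and eventually reach 1,
# illustrating monotone pointwise convergence f_n -> 1 on (-1, 1).
values = [f_n(n, 0.9) for n in (5, 10, 20, 40)]
```

Each `f_n` vanishes near ±1 and hence has compact support in (−1,1), which is what places $f\_n(b)$ in $C\_c^\*(b)$.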

#### Theorem B.158. *Let a be a (possibly unbounded) self-adjoint operator on H.*

*1. For any* $f \in C\_b(\sigma(a))$*, the operator* $f(a)\_0$*, initially defined by linear extension of*

$$f(a)\_0 h(a) \Psi = (fh)(a) \Psi = ((fh) \circ y^{-1})(b) \Psi,\tag{B.573}$$

*where* $h \in C\_0(\sigma(a))$ *and* ψ ∈ *H, i.e., defined on the domain* $C\_0^\*(b)H$ *(cf.* (B.565) *with* • = 0*), is bounded, with*

$$\|f(a)\| \le \|f\|\_{\infty},\tag{B.574}$$

*and hence extends from* $C\_0^\*(b)H$ *to all of H by continuity; we write*

$$f(a) = f(a)\_0^{-}. \tag{B.575}$$

*2. The functional calculus f* → *f*(*a*) *from Cb*(σ(*a*)) *to B*(*H*) *thus established satisfies the algebraic rules* (B.289) *-* (B.291)*, and one has the reassuring cases*

$$1\_{\sigma(a)}(a) = 1\_H;\tag{B.576}$$

$$\frac{1}{\operatorname{id}\_{\sigma(a)} - z}(a) = (a - z)^{-1} \ (z \in \mathfrak{\rho}(a)). \tag{B.577}$$

Conceptually, what is going on here is that the homomorphism

$$C\_0(\sigma(a)) \to B(H);\tag{B.578}$$

$$f \mapsto f(a),\tag{B.579}$$

as defined in (B.566), is extended to the multiplier algebra

$$M(\mathcal{C}\_0(\sigma(a))) = \mathcal{C}\_b(\sigma(a)). \tag{B.580}$$

Theorem C.77 then applies, since by Lemma B.157 the initial homomorphism is nondegenerate, immediately yielding boundedness of *f*(*a*). Below we will also give an independent proof of (B.574).

*Proof.* The operator $f(a)\_0$ is densely defined by Lemma B.157 (which *a fortiori* implies that $C\_0^\*(b)H$ is dense in *H*). To prove that $f(a)\_0$ is bounded, take ε > 0 and hence find a compact subset *K* ⊂ R such that $|f(x)h(x)| < \varepsilon$ whenever $x \notin K$. Writing $\tilde{f} = f \circ y^{-1}$ etc., using (B.322) with *f* replaced by $1\_{K^c} f h$ we obtain

$$\|\widetilde{(1\_{K^c}fh)}(b)\Psi\| \le \|\widetilde{(1\_{K^c}fh)}(b)\| \|\Psi\| \le \|\widetilde{1\_{K^c}fh}\|\_{\infty} \|\Psi\| < \varepsilon \|\Psi\|. \tag{B.581}$$

From this, using also the homomorphism property in Theorem B.102, we then find

$$\begin{split} \|(fh)(a)\Psi\| &= \|\widetilde{(fh)}(b)\Psi\| \\ &= \|\widetilde{(1\_K f h)}(b)\Psi + \widetilde{(1\_{K^c} f h)}(b)\Psi\| \\ &\le \|\widetilde{(1\_K f h)}(b)\Psi\| + \|\widetilde{(1\_{K^c} f h)}(b)\Psi\| \\ &= \|\widetilde{(1\_K f)}(b)\tilde{h}(b)\Psi\| + \|\widetilde{(1\_{K^c} f h)}(b)\Psi\| \\ &< \|\widetilde{(1\_K f)}\|\_{\infty} \|h(a)\Psi\| + \varepsilon \|\Psi\| \\ &\le \|f\|\_{\infty} \|h(a)\Psi\| + \varepsilon \|\Psi\|, \end{split} \tag{B.582}$$

since

$$\|\widetilde{(1\_K f)}\|\_{\infty} \le \|\tilde{f}\|\_{\infty} = \|f\|\_{\infty}.\tag{B.583}$$

Since the last expression in (B.582) is independent of *K*, we may let ε → 0, obtaining boundedness of *f*(*a*) as well as (B.574).

The second claim should be obvious from (B.566) and Theorem B.94.

Eq. (B.576) is trivial. To prove (B.577), write $f(x) = (x - z)^{-1}$, where $z \in \rho(a)$ is fixed and $x \in \sigma(a)$. We have

$$f(a)\_0 h(a) \Psi = (fh)(a) \Psi = (a - z)^{-1} h(a) \Psi,\tag{B.584}$$

and hence

$$f(a)\_0 \varphi = (a - z)^{-1} \varphi,\tag{B.585}$$

for any $\varphi \in D(f(a)\_0) = C\_0^\*(b)H$. So if $\varphi\_n \to \varphi$ for $\varphi \in H$ and $\varphi\_n \in D(f(a)\_0)$, boundedness and hence continuity of the operator $(a - z)^{-1}$ implies

$$f(a)\varphi = \lim\_{n \to \infty} f(a)\_0 \varphi\_n = \lim\_{n \to \infty} (a - z)^{-1} \varphi\_n = (a - z)^{-1} \varphi.$$
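To see the definition (B.566) and the rule (B.577) at work, here is a minimal Python sketch under the simplifying assumption that *a* is a diagonal self-adjoint operator on a finite-dimensional space, so that every operator involved is determined by its eigenvalues. The helper names `y`, `y_inv` and `apply_via_b`, the sample spectrum and the choice *z* = *i* are all hypothetical choices made for this illustration.

```python
import math


def y(lam):
    """Scalar bounded transform, cf. (B.503): maps R into (-1, 1)."""
    return lam / math.sqrt(1.0 + lam * lam)


def y_inv(mu):
    """Inverse map, cf. (B.504): maps (-1, 1) back to R."""
    return mu / math.sqrt(1.0 - mu * mu)


def apply_via_b(f, spectrum_of_b):
    """Eigenvalues of f(a) := (f o y^{-1})(b), cf. (B.566), for diagonal b."""
    return [f(y_inv(mu)) for mu in spectrum_of_b]


z = 1j  # non-real, hence in the resolvent set of any self-adjoint a


def resolvent_symbol(x):
    """The bounded continuous function x -> 1 / (x - z) on the real line."""
    return 1.0 / (x - z)


spectrum_of_a = [-3.0, -0.5, 0.0, 2.0, 7.0]
spectrum_of_b = [y(lam) for lam in spectrum_of_a]

# Functional calculus routed through the bounded transform b ...
via_calculus = apply_via_b(resolvent_symbol, spectrum_of_b)
# ... compared with the eigenvalues of (a - z)^{-1} computed directly.
direct = [1.0 / (lam - z) for lam in spectrum_of_a]
```

Up to rounding, the two lists agree, and no entry exceeds 1 in modulus, which is the supremum of $|x - i|^{-1}$ over the real line and thus the bound predicted by (B.574).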

To construct a (typically unbounded) operator *f*(*a*) for *f* ∈ *C*(σ(*a*)) in this fashion (think of *a* itself, corresponding to *f* = idσ(*a*)), we first define

$$D(f(a)\_0) = \mathcal{C}\_c^\*(b)H = \text{span}\{h(a)\Psi \mid h \in \mathcal{C}\_c(\sigma(a)), \Psi \in H\},\tag{B.586}$$

and an operator $f(a)\_0 : D(f(a)\_0) \to H$ may once again be defined by (B.573); once again, the whole point is that although *f* may well be unbounded, *h* and hence *fh* lie in $C\_c(\sigma(a))$, so that (*fh*)(*a*) is defined by (B.566), and hence eventually by the continuous functional calculus for the *bounded* self-adjoint operator *b*.

As in the remark following Theorem B.158, from the point of view of multiplier algebras, eq. (B.573) extends the (nondegenerate) homomorphism $C\_c(\sigma(a)) \to B(H)$ to the algebra $C(\sigma(a))$ of unbounded multipliers on $C\_c(\sigma(a))$.

This is not the end of the construction, since $f(a)\_0$ is typically not closed on the domain (B.586). However, it is a very near miss, since $f(a)\_0$ is *closable*, cf. §B.13. To prove that the operator $f(a)\_0$ in (B.573) is closable, we use the second criterion in Lemma B.74. For $g, h \in C\_c(\sigma(a))$ and $\psi, \varphi \in H$ we may compute:

$$\langle g(a)\varphi, f(a)\_0 h(a)\Psi\rangle = \langle \varphi, g(a)^\* f(a)\_0 h(a) \Psi\rangle = \langle \varphi, (g^\* f h)(a) \Psi\rangle;\tag{B.587}$$

$$\langle (gf^\*)(a)\varphi, h(a)\Psi\rangle = \langle \varphi, (gf^\*)(a)^\* h(a)\Psi\rangle = \langle \varphi, (g^\*fh)(a)\Psi\rangle.\tag{B.588}$$

Hence $D(f(a)\_0^\*)$ must contain $D(f(a)\_0)$, and on the latter we may put

$$f(a)\_0^\*\, g(a)\varphi = (gf^\*)(a)\varphi,\tag{B.589}$$

as in (B.573). In particular, $D(f(a)\_0^\*)$ is dense in *H*, so that $f(a)\_0$ is closable. Furthermore, if $f^\* = f$, then $f(a)\_0$ is symmetric, i.e., $f(a)\_0 \subset f(a)\_0^\*$. Hence the closure

$$f(a) = f(a)\_0^- : D(f(a)) \to H,\tag{B.590}$$

is the operator we are looking for, where *D*(*f*(*a*)) consists of all ψ ∈ *H* for which there exists a sequence $(\psi\_n)$ in $D(f(a)\_0)$ such that $\psi\_n \to \psi$ *and* $f(a)\_0\psi\_n$ converges, upon which Lemma B.74 gives

$$f(a)\,\Psi = \lim\_{n} f(a)\_0 \,\Psi\_n. \tag{B.591}$$

What's more, if $f^\* = f$, then $f(a)\_0$ is *essentially self-adjoint*, i.e.,

$$f(a)\_0^- = f(a)\_0^\*,\tag{B.592}$$

which (by taking the adjoint) is equivalent to the property we will actually prove:

$$f(a)^\* = f(a). \tag{B.593}$$

Theorem B.159. *For real-valued f* ∈ *C*(σ(*a*))*, the operator f*(*a*) *is self-adjoint.*

The proof of self-adjointness relies on *Nelson's Lemma*:

Lemma B.160. *Let c* ⊂ *c*∗ *be densely defined and symmetric. Then c is essentially self-adjoint if there exists a continuous unitary representation* $t \mapsto u\_t$ *of* R *on H such that* $u\_t : D(c) \to D(c)$ *for each t* ∈ R*, and*

$$\frac{du\_t}{dt}\Psi \equiv \lim\_{s \to 0} \frac{u\_{t+s}\Psi - u\_t\Psi}{s} = i c u\_t\Psi,\ \Psi \in D(c). \tag{B.594}$$

This lemma is closely related to Stone's Theorem; see Theorem 5.73 in §5.12.

*Proof.* The proof of Nelson's lemma relies on the following variation of Lemma 5.74 in §5.12, proved by applying the latter (or rather its proof) to the closure of *a*:

Lemma B.161. *Let a be symmetric. Then a is essentially self-adjoint (a*∗∗ = *a*∗*) iff*

$$\text{ran}(a+i)^{-} = \text{ran}(a-i)^{-} = H.\tag{B.595}$$

Applying Lemma B.161 in the same way as Lemma 5.74 is used in the proof of self-adjointness of the generator *a* in Theorem 5.73, yields Lemma B.160.

For Theorem B.159, with $c = f(a)\_0$ for some $f \in C(\sigma(a), \mathbb{R})$, informally define

$$u\_t = \exp(i t f(a)),\tag{B.596}$$

and formally define $u\_t$ as the closure of the bounded operator

$$(u\_t)\_0 = e\_t(a)\_0 \tag{B.597}$$

defined by the bounded function $e\_t(x) = \exp(i t f(x))$ on σ(*a*), cf. (B.573). The verification that $t \mapsto u\_t$ defines a continuous one-parameter group of unitary operators on *H* is practically the same as in our proof of part 1 of Stone's Theorem, and the proof of (B.594) is almost the same as a similar step in the proof of part 3 of that theorem, so we will not repeat these here. Therefore, Lemma B.160 applies, showing that $f(a)\_0$ is essentially self-adjoint. □
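At the level of functions, the two properties of $u\_t$ used above are elementary: $e\_t$ has modulus 1 (which gives unitarity) and $e\_{s+t} = e\_s e\_t$ (which gives the group law). The following minimal sketch checks both numerically for one real-valued, unbounded *f*; the particular choice $f(x) = x^3 - x$, the sample points and the helper name `e_t` are assumptions made purely for illustration.

```python
import cmath


def e_t(t, f, x):
    """The bounded function e_t(x) = exp(i t f(x)) attached to a real-valued f."""
    return cmath.exp(1j * t * f(x))


def f(x):
    """A real-valued, unbounded continuous function (illustrative choice)."""
    return x ** 3 - x


points = [-2.0, -0.3, 0.0, 1.5, 4.0]
s, t = 0.7, -1.9

# |e_t(x)| = 1 at every point, the function-level counterpart of unitarity.
moduli = [abs(e_t(t, f, x)) for x in points]

# e_{s+t} = e_s * e_t up to rounding, the counterpart of the group law.
group_defect = max(
    abs(e_t(s + t, f, x) - e_t(s, f, x) * e_t(t, f, x)) for x in points
)
```

Although *f* itself is unbounded, every $e\_t$ is bounded, which is why $u\_t$ can be obtained from the bounded functional calculus of Theorem B.158.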

As an important special case of our continuous functional calculus, we have

$$\operatorname{id}\_{\sigma(a)}(a) = a,\tag{B.598}$$

just as in the bounded case. Writing $a\_0$ for the operator $(\mathrm{id}\_{\sigma(a)})(a)\_0$, eq. (B.573) gives $a\_0\varphi = a\varphi$ for $\varphi \in D(a\_0)$, cf. (B.586). Let $\psi \in D(a\_0^-)$, so that there is a sequence $(\psi\_n)$ in $D(a\_0)$ such that $\psi\_n \to \psi$ and $(a\_0\psi\_n)$ converges. Since *a* is closed, it follows that $a\_0\psi\_n = a\psi\_n \to a\psi$, so that $\psi \in D(a)$. Hence $a\_0^- \subseteq a$. Since both operators are self-adjoint, this implies $a\_0^- = a$, which proves (B.598). The proof of (B.577) is similar but easier, since $(a - z)^{-1}$ is bounded.

In similar vein, we may set up a functional calculus for bounded Borel functions of *a*. If *f* ∈ B(σ(*a*)), then $f \circ y^{-1}$ ∈ B(σ(*b*)), so that $(f \circ y^{-1})(b)$ is defined, cf. Theorem B.102, and we may define *f*(*a*) by (B.566). As in the continuous case, this map $f \mapsto f(a)$ yields a homomorphism B(σ(*a*)) → *B*(*H*), satisfying (B.322).

What is still missing, however, is the von Neumann algebra *W*∗(*a*) in which this homomorphism takes values. To close this section, we solve this issue.

If *c* ∈ *B*(*H*) and *a* is possibly unbounded, we say that (by convention):

$$[a,c] = 0 \text{ iff } ca \subseteq ac,\tag{B.599}$$

that is, if *c* · *D*(*a*) ⊆ *D*(*a*) and *ca*ψ = *ac*ψ for each ψ ∈ *D*(*a*). We write {*a*}′ for the set of all *c* ∈ *B*(*H*) that commute with *a*. If *a*∗ = *a*, looking at the graph of *a* (and using the fact that *a* is closed), it is easy to see that {*a*}′ is a strongly closed unital ∗-subalgebra in *B*(*H*). Therefore, by the bicommutant theorem, {*a*}′ is a von Neumann algebra. Its commutant, defined in the usual sense (B.318), is denoted by *W*∗(*a*), i.e.,

$$W^\*(a) = \{a\}^{\prime\prime}.\tag{\text{B.600}}$$

Theorem B.162. *Let a be a (possibly unbounded) self-adjoint operator on H. Then*

$$W^\*(a) = W^\*(b),\tag{B.601}$$

*where b is the bounded transform* (B.510) *of a. Consequently, if f* ∈ B(σ(*a*)) *and the operator f*(*a*) *is defined by* (B.566) *and Theorem B.102, then f*(*a*) ∈ *W*∗(*b*)*.*

*Proof.* We will prove a more general result of independent interest.

Definition B.163. *A closed unbounded operator a* : *D*(*a*) → *H is* affiliated *to a von Neumann algebra A* ⊂ *B*(*H*)*, written a*η*A, iff* [*a*, *c*] = 0 *for each c* ∈ *A*′*.*

For example, if *a*∗ = *a*, then *a*η*W*∗(*a*), and if *a*η*B* for some *B* = *B*′′, then *W*∗(*a*) ⊆ *B*.

Proposition B.164. *Let A* ⊂ *B*(*H*) *be a von Neumann algebra and assume a is a self-adjoint operator on H with bounded transform b. Then a*η*A iff b* ∈ *A.*

*Proof.* The first step consists in the observation that *a*η*A* iff [*a*,*u*] = 0 (or, equivalently, *uau*∗ = *a*) merely for each unitary *u* ∈ *A*′. To see this, we strengthen Lemma B.145 (in which we replace *a* by *c*): if *c* ∈ *A*′, then *c* is a linear combination of at most four unitaries *in A*′. Indeed, the unitaries *u*<sub>±</sub> in the proof are constructed via the continuous functional calculus of Theorem B.94, and hence they lie in *C*∗(*c*) ⊂ *A*′.

The second step is to show that [*a*,*u*] = 0 iff [*b*,*u*] = 0 for any unitary *u*. This is a simple computation: if *uau*∗ = *a*, then, looking at the domains in question,

$$u(1\_H + a^2)^{-1}u^\* = (1\_H + a^2)^{-1};\tag{B.602}$$

$$u((1\_H + a^2)^{-1})^{1/2} u^\* = ((1\_H + a^2)^{-1})^{1/2},\tag{B.603}$$

from which *ubu*∗ = *b* follows, with *b* defined by (B.510). Similarly, if *bu* = *ub*, then *uau*∗ = *a*, where *a* is defined by (B.511). Theorem B.152 therefore yields the claim. □

Theorem B.162 now follows: taking *A* = *W*∗(*a*), so that *a*η*A*, yields *b* ∈ *W*∗(*a*), and hence *W*∗(*b*) ⊆ *W*∗(*a*). On the other hand, taking *A* = *W*∗(*b*), in which case *b* ∈ *A*, gives *a*η*W*∗(*b*), and hence *W*∗(*a*) ⊆ *W*∗(*b*). This yields (B.601), from which the final claim follows by our definition (B.566) and Theorem B.102. □

Using this language, it can be shown that for possibly unbounded Borel functions *f* on σ(*a*), the possibly unbounded operator *f*(*a*) is affiliated to *W*∗(*a*). Furthermore, there exists a Borel measure μ on σ(*a*) such that the map $f \mapsto f(a)$ may also be seen as a so-called *essential* homomorphism from B(σ(*a*))/N(σ(*a*)) into the ∗-algebra of normal operators affiliated with *W*∗(*a*), where N(σ(*a*)) is the set of μ-null functions on σ(*a*); this means that the algebraic properties hold after closure.

#### Notes

The history of functional analysis is described from various points of view by Bernkopf (1966, 1967), Birkhoff & Kreyszig (1984), Brezis & Browder (1998), Dieudonné (1981), Monna (1973), Pier (2001), Pietsch (2007), Siegmund-Schultze (2003), and Steen (1973). Apart from von Neumann (1932), the other founding books of functional analysis—coincidentally from the same year, which closed the foundational era that began around 1900—were Banach (1932) and Stone (1932).

The concept of a Hilbert space eventually emerged from Hilbert's work on quadratic forms in infinitely many variables (see especially his fourth paper on the subject, Hilbert, 1906), which in turn was inspired by his analysis of integral equations (Hilbert, 1912). From a modern point of view, Hilbert's space was the unit ball in ℓ<sup>2</sup>(N); he did not adopt the perspective of linear spaces and operators.

An important step towards this perspective was what is now called the Riesz–Fischer Theorem from 1907; Riesz (1907a) proved the isomorphism

$$L^2([a,b]) \cong \ell^2(\mathbb{N}),\tag{B.604}$$

whereas Fischer (1907) proved the completeness of *L*<sup>2</sup>([*a*,*b*]) and obtained Riesz's isomorphism as a corollary. Riesz (1907b) also obtained the Riesz–Fréchet Theorem for the special case *L*<sup>2</sup>([*a*,*b*]), independently found also by Fréchet (1907). In fact, Hilbert (1906) had already shown this (*mutatis mutandis*) for what we now call ℓ<sup>2</sup>(N); the general case had to wait for Riesz (1934) and Löwig (1934). The latter was the first to study non-separable Hilbert spaces, including Corollary B.64. Both Riesz and Fréchet in addition played major roles in establishing another famous duality theorem, namely the one on the representation of linear functionals on continuous functions by measures (cf. Theorem B.15 etc.); see Gray (1984).

Subsequently, Schmidt (1908) developed the linear and geometric structure of ℓ<sup>2</sup>(N), arguably the first Hilbert space studied as such, and Riesz (1913) explicitly studied linear operators on this space. Finally, it was von Neumann (1927ab, 1932) who first introduced Hilbert space and operator theory from an abstract point of view, i.e., axiomatically. For a historical analysis of this step, which was triggered by the attempts of von Neumann (originally jointly with Hilbert and Nordheim) to provide a mathematical foundation for quantum mechanics, see Rédei (2005) and Duncan & Janssen (2013); also cf. Corry (2004) on the role of Hilbert himself.

Functional analysis textbooks perused by the author include Conway (2007), Dudley (1989), Kadison & Ringrose (1983), Maurin (1972), Reed & Simon (1972), Rudin (1973), Schmüdgen (2012), and Weidmann (2000). A good place to start for contemporary beginners is Rynne & Youngson (2008), followed by the more advanced text by MacCluer (2009), which also introduces C\*-algebras. A natural next step would then be Pedersen (1989), and on to operator algebras!

Since most of the material in this appendix is standard except for the last three sections, it seems pointless to give detailed notes and attributions (so that several sections even lack notes), except for a few comments on unusual cases, and some supplementary material which would have distracted too much from the main text.


#### §B.2. ℓ<sup>*p*</sup> spaces

Hölder's Inequality (which incorporates the claim *fg* ∈ ℓ<sup>1</sup>) should be clear for *p* = 1 or *p* = ∞. For 1 < *p* < ∞, we use the fact that for any *s*,*t* ∈ [0,∞), one has

$$s^{1/p}t^{1/q} \le \frac{s}{p} + \frac{t}{q}.\tag{B.605}$$

Using (B.605) with *s* = (|*f*(*x*)|/‖*f*‖<sub>*p*</sub>)<sup>*p*</sup> and *t* = (|*g*(*x*)|/‖*g*‖<sub>*q*</sub>)<sup>*q*</sup> and summing over *x* gives (B.15). To derive Minkowski's Inequality for 1 < *p* < ∞ (the cases *p* = 1 and *p* = ∞ are obvious), define

$$h(x) = |f(x) + g(x)|^{p-1}. \tag{B.606}$$

Arguing as in part 1 above, if *f* ∈ ℓ<sup>*p*</sup> and *g* ∈ ℓ<sup>*p*</sup>, then *f* + *g* ∈ ℓ<sup>*p*</sup> and hence *h* ∈ ℓ<sup>*q*</sup>, since *h*(*x*)<sup>*q*</sup> = |*h*(*x*)|<sup>*q*</sup> = |*f*(*x*) + *g*(*x*)|<sup>*p*</sup>. Now compute

$$\begin{split} \|f+g\|\_{p}^{p} &= \sum\_{x} |f(x) + g(x)|^{p} = \sum\_{x} h(x) |f(x) + g(x)| \\ &\leq \sum\_{x} |h(x)f(x)| + \sum\_{x} |h(x)g(x)| = \|fh\|\_{1} + \|gh\|\_{1} \\ &\leq \|h\|\_{q} (\|f\|\_{p} + \|g\|\_{p}) = \|f+g\|\_{p}^{p-1} (\|f\|\_{p} + \|g\|\_{p}), \end{split} \tag{B.607}$$

where in the last inequality we have used (B.15). This immediately gives (B.14).
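As a quick numerical sanity check of the two inequalities just derived, the following sketch evaluates both sides on finite sequences. It is an illustration only: the helper names (`p_norm`, `holder_gap`, `minkowski_gap`, `h_norm_identity_gap`), the random test data, and the tolerance are assumptions made here and do not come from the text, and a floating-point check is of course no substitute for the proof.

```python
import random

def p_norm(f, p):
    """l^p norm of a finite sequence f, for 1 <= p < infinity."""
    return sum(abs(x) ** p for x in f) ** (1.0 / p)

def holder_gap(f, g, p):
    """||f||_p ||g||_q - ||fg||_1 with 1/p + 1/q = 1; Hoelder says this is >= 0."""
    q = p / (p - 1.0)  # conjugate exponent
    fg = [x * y for x, y in zip(f, g)]
    return p_norm(f, p) * p_norm(g, q) - p_norm(fg, 1.0)

def minkowski_gap(f, g, p):
    """||f||_p + ||g||_p - ||f+g||_p; Minkowski says this is >= 0."""
    s = [x + y for x, y in zip(f, g)]
    return p_norm(f, p) + p_norm(g, p) - p_norm(s, p)

def h_norm_identity_gap(f, g, p):
    """| ||h||_q - ||f+g||_p^(p-1) | for h = |f+g|^(p-1), the last step of (B.607)."""
    q = p / (p - 1.0)
    s = [x + y for x, y in zip(f, g)]
    h = [abs(v) ** (p - 1.0) for v in s]
    return abs(p_norm(h, q) - p_norm(s, p) ** (p - 1.0))

random.seed(0)  # fixed seed so the run is reproducible
f = [random.uniform(-1.0, 1.0) for _ in range(50)]
g = [random.uniform(-1.0, 1.0) for _ in range(50)]
tol = 1e-9  # allowance for floating-point rounding

results = [
    (holder_gap(f, g, p) >= -tol,
     minkowski_gap(f, g, p) >= -tol,
     h_norm_identity_gap(f, g, p) <= tol)
    for p in (1.5, 2.0, 3.0, 7.0)
]
print(results)
```

The third check isolates the identity ‖*h*‖<sub>*q*</sub> = ‖*f* + *g*‖<sub>*p*</sub><sup>*p*−1</sup>, which is the step in (B.607) that relies on (*p* − 1)*q* = *p*.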

#### §B.4. Basic measure theory

Standard textbooks on measure theory include Bogachev (2006), Dudley (1989), Malliavin (1995), Rudin (1986), etc.

#### §B.5. Measure theory on locally compact Hausdorff spaces

*Urysohn's Lemma* states that if *X* is a locally compact Hausdorff space and *K* ⊂ *U* ⊂ *X* with *K* compact and *U* open, then there is a function *g* ∈ *C<sub>c</sub>*(*U*) such that 0 ≤ *g*(*x*) ≤ 1 for each *x* ∈ *X* and *g*(*x*) = 1 for *x* ∈ *K*. Similarly, since a locally compact Hausdorff space is completely regular, for each closed set *F* ⊂ *X* and point *x* ∉ *F* there is a continuous function *f* such that *f*(*x*) ≠ 0 and *f*<sub>|*F*</sub> = 0.

An example of a space that is locally compact Hausdorff but not σ-compact, given by Rudin (1986), is *X* = R<sup>2</sup> with topology given by the strange metric *d*((*x*, *y*),(*x*′, *y*′)) = 1+|*y*−*y*′| if *x* ≠ *x*′ and *d*((*x*, *y*),(*x*, *y*′)) = |*y*−*y*′|.

For a (tedious) direct proof of Theorem B.19, see Rudin (1986), Thm. 2.14. Alternatively, Theorem B.19 may be derived from Choquet theory, as mentioned in the main text, or from the Daniell–Stone construction of measures from positive functionals in a more general setting, see e.g. Bogachev, 2007, §7.8 or Dudley, 1989, §4.5. For a proof of Theorem B.22 see Malliavin (1995), Thm. 5.3.8.

The theory of finitely additive measures is exhaustively discussed in Rao & Rao (1983); for a summary see Luxemburg (1991). The notion of a semiring of subsets of *X* goes back to von Neumann (1950). See also Loya (2008), including a detailed proof that Step(*X*,R) is a (commutative) algebra.

## §B.6. *L<sup>p</sup>* spaces

A nice result "taming" *L<sup>p</sup>*(*X*,Σ,μ) is *Lusin's Theorem*, assuming μ is regular:

Theorem B.165. *Let* 1 ≤ *p* < ∞*. If the support of f* ∈ *L<sup>p</sup>*(*X*) *has finite measure, then for any* ε > 0 *there exists g* ∈ *C<sub>c</sub>*(*X*) *such that* μ({*x* ∈ *X* | *f*(*x*) ≠ *g*(*x*)}) < ε*.*

## §B.7. Morphisms and isomorphisms of Banach spaces

The *Baire Category Theorem* states that a *complete* metric space cannot be a countable union of nowhere dense sets (where a set in a topological space is called *nowhere dense* if its closure has empty interior, i.e., does not contain a non-empty open set). In other words, if (*M*,*d*) is complete and *M* = ∪*nMn* with each *Mn* closed, then there is at least one *n* ∈ N for which *Mn* contains an open ball.

#### §B.9. Duality

The idea of writing (B.136) as lim<sub>*U*</sub> *f* has the following origin.

1. Let *f* : *X* → *K* be any function between any pair of sets, and let *F* be a filter on *X*. Then *f*<sub>∗</sub>*F*, which consists of all *B* ⊂ *K* for which *f*<sup>−1</sup>(*B*) ∈ *F*, is a filter on *K*, called the *push-forward* of *F* by *f*. Moreover, if *U* is an ultrafilter on *X*, then *f*<sub>∗</sub>*U* is an ultrafilter on *K*. This gives a map

$$f\_\* : \operatorname{Ultra}(X) \to \operatorname{Ultra}(K). \tag{B.608}$$

If we equip Ultra(*X*) with the topology generated by all sets of the form

$$U\_A = \{ U \in \beta X \mid A \in U \},\tag{B.609}$$

where *A* ⊂ *X*, as in the main text, and likewise Ultra(*K*), then *f*<sub>∗</sub> is continuous. If *X* is discrete, then Ultra(*X*) = β*X*, but not otherwise.


$$\lim\_{F} f = z \ (z \in K). \tag{B.610}$$

4. As for sequences, it can be shown that filters on Hausdorff spaces have *at most* one limit, and that ultrafilters on compact spaces have *at least* one limit. Consequently, ultrafilters on compact Hausdorff spaces *K* have *exactly* one limit, i.e., converge to a unique point. This gives a continuous map

$$\text{lim}: \text{Ultra}(K) \to K. \tag{B.611}$$

5. It follows that if *X* is any set (seen as a discrete topological space), *K* is a compact Hausdorff space, *f* : *X* → *K* is some function, and *U* is an ultrafilter on *X*, then *f*<sub>∗</sub>*U* has a unique limit *z* ∈ *K*, written lim<sub>*U*</sub> *f* = *z* or lim *f*<sub>∗</sub>*U* = *z*, or β*f*(*U*) = *z*, since the latter notation gives the extension β*f* in the diagram (B.135). Thus β*f* = lim ∘ *f*<sub>∗</sub>, as in the diagram that combines (B.608) and (B.611), viz.

$$\beta X = \text{Ultra}(X) \xrightarrow{f\_\*} \text{Ultra}(K) \xrightarrow{\text{lim}} K. \tag{B.612}$$
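On a finite set every ultrafilter is principal, so the push-forward construction in the list above can be checked by brute force. The sketch below is a toy illustration under that finiteness assumption; the sets `X` and `K`, the map `f`, and all function names are hypothetical choices made here. It pushes forward the principal ultrafilter at a point *x* and confirms that the result is an ultrafilter on *K*, namely the principal one at *f*(*x*), so that in this case β*f* sends it to the point *f*(*x*).

```python
from itertools import combinations

def powerset(s):
    """All subsets of the finite set s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def principal_ultrafilter(x, space):
    """The ultrafilter of all subsets of `space` that contain the point x."""
    return {A for A in powerset(space) if x in A}

def preimage(f, B, X):
    """f^{-1}(B) for a map f given as a dict on X."""
    return frozenset(x for x in X if f[x] in B)

def push_forward(f, F, X, K):
    """f_*F: all B in K whose preimage under f belongs to the filter F."""
    return {B for B in powerset(K) if preimage(f, B, X) in F}

def is_ultrafilter(U, K):
    """Filter axioms plus: for every B, exactly one of B and K \\ B lies in U."""
    K = frozenset(K)
    subsets = powerset(K)
    if frozenset() in U or K not in U:
        return False
    upward = all(C in U for B in U for C in subsets if B <= C)
    meets = all((B & C) in U for B in U for C in U)
    dichotomy = all((B in U) != ((K - B) in U) for B in subsets)
    return upward and meets and dichotomy

# Hypothetical finite example.
X = {0, 1, 2, 3}
K = {"a", "b", "c"}
f = {0: "a", 1: "a", 2: "b", 3: "c"}

checks = []
for x in sorted(X):
    pushed = push_forward(f, principal_ultrafilter(x, X), X, K)
    checks.append(is_ultrafilter(pushed, K)
                  and pushed == principal_ultrafilter(f[x], K))
print(checks)
```

The interesting content of the construction lies in the non-principal ultrafilters on infinite sets, which no finite computation can exhibit; the sketch only makes the definitions concrete.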

## §B.11. Choquet's Theorem

Our proof of Choquet's Theorem was adapted from Simon (2011) and Ebbesen (2012). For an extensive treatment of the surrounding *Choquet Theory* see e.g. Alfsen (1970), Bratteli & Robinson (1987), or Phelps (2001). For the Schläfli classification see Coxeter (1948).

## §B.12. A précis of infinite-dimensional Hilbert space

To prove separability of *H* = *L*<sup>2</sup>(R<sup>*d*</sup>), note that a dense subset is given by the set of all functions of the form 1<sub>*B*<sup>*d*</sup><sub>*n*</sub></sub> *p*, where *n* ∈ N, *B<sup>d</sup><sub>r</sub>* = {*x* ∈ R<sup>*d*</sup> | ‖*x*‖ ≤ *r*} is the *d*-ball of radius *r*, and *p* is some polynomial on R<sup>*d*</sup> with rational coefficients. Alternatively, take the complex rational linear span of all functions of the form 1<sub>*A*</sub>, where *A* ⊂ R<sup>*d*</sup> is a rectangle with rational coefficients (proving density in either case requires some measure theory). The latter construction has the advantage over the former that it can be generalized to Hilbert spaces *H* = *L*<sup>2</sup>(*X*) for which the underlying measure space (*X*,Σ,μ) satisfies the condition that the space of sets *A* ∈ Σ with μ(*A*) < ∞ is separable in the metric *d*(*F*,*G*) = μ(*F*Δ*G*), where *F*Δ*G* = (*F* ∩ *G<sup>c</sup>*) ∪ (*F<sup>c</sup>* ∩ *G*) is the symmetric difference. Indeed, *L*<sup>2</sup>(*X*) is separable iff this condition is satisfied.

This class includes the important case where the underlying topological space *X* is *Polish* (i.e., homeomorphic to a complete separable metric space), Σ consists of the associated Borel sets, and μ is a σ-finite regular measure. If, furthermore, μ is finite, then Lemma B.121 (in its original form for Polish spaces) applies. As in the proof of Theorem B.118, this induces Hilbert space isomorphisms like (in the second case) *L*<sup>2</sup>(*X*) ≅ *L*<sup>2</sup>(0,1), which do not require a choice of basis. See Royden (1988), Thm. 15.5.16 and Prop. 15.5.12, and Halmos (1974), p. 177.

#### §B.14. Basic spectral theory

Our terminology "*continuous spectrum*" σ<sub>*c*</sub>(*a*) for the complement of the point spectrum σ<sub>*p*</sub>(*a*) is not standard; many authors reserve the former term for the complement of σ<sub>*p*</sub>(*a*) *as well as* the so-called *residual spectrum* σ<sub>*r*</sub>(*a*), which is defined as the set of those λ ∈ σ(*a*) for which λ ∉ σ<sub>*p*</sub>(*a*) and ran(*a*−λ)<sup>−</sup> ≠ *H*. However, for self-adjoint operators *a* (which is all we need in this book, and in quantum mechanics), it follows from e.g. Theorem B.93 that σ<sub>*r*</sub>(*a*) = ∅, so that at least for *a*<sup>∗</sup> = *a* "our" continuous spectrum σ<sub>*c*</sub>(*a*) matches with the usual terminology.

The proof of (B.258) in any Banach algebra *A* with unit 1*<sup>A</sup>* is as follows. We first show that the sum is a Cauchy sequence. Indeed, for *n* > *m* one has

$$\left\| \sum\_{k=0}^{n} a^k - \sum\_{k=0}^{m} a^k \right\| = \left\| \sum\_{k=m+1}^{n} a^k \right\| \le \sum\_{k=m+1}^{n} \|a^k\| \le \sum\_{k=m+1}^{n} \|a\|^k. \tag{B.613}$$

For *n*,*m* → ∞ this converges to 0 by the theory of the geometric series. Since *A* is complete, the Cauchy sequence ∑<sup>*n*</sup><sub>*k*=0</sub> *a<sup>k</sup>* converges for *n* → ∞. Now compute

$$\sum\_{k=0}^{n} a^k (1\_A - a) = \sum\_{k=0}^{n} (a^k - a^{k+1}) = 1\_A - a^{n+1}.\tag{B.614}$$

Hence


$$\left\|1\_A - \sum\_{k=0}^n a^k (1\_A - a) \right\| = \left\| a^{n+1} \right\| \le \left\| a \right\|^{n+1},\tag{B.615}$$

which converges to zero when *n* → ∞, as ‖*a*‖ < 1 by assumption. Thus

$$\lim\_{n \to \infty} \sum\_{k=0}^{n} a^k (1\_A - a) = 1\_A. \tag{B.616}$$

By a similar argument,

$$\lim\_{n \to \infty} (1\_A - a) \sum\_{k=0}^n a^k = 1\_A,\tag{B.617}$$

so that, by continuity of multiplication in a Banach algebra, one finally has

$$\lim\_{n \to \infty} \sum\_{k=0}^{n} a^k = (1\_A - a)^{-1}. \tag{B.618}$$
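The argument can be illustrated numerically in the Banach algebra of 2 × 2 real matrices. The sketch below is a toy under stated assumptions: the matrix `a`, the use of the maximum-row-sum norm as the submultiplicative norm, and all helper names are choices made here, picked so that ‖*a*‖ = 0.5 < 1. It checks that the residual 1<sub>*A*</sub> − ∑ *a<sup>k</sup>*(1<sub>*A*</sub> − *a*) obeys the bound in (B.615).

```python
def mat_mul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(x, y):
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def mat_sub(x, y):
    return [[p - q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def row_sum_norm(x):
    """Maximum absolute row sum, a submultiplicative norm with ||1|| = 1."""
    return max(sum(abs(v) for v in row) for row in x)

def neumann_partial_sum(a, n):
    """Return sum_{k=0}^{n} a^k."""
    size = len(a)
    total = identity(size)
    power = identity(size)
    for _ in range(n):
        power = mat_mul(power, a)   # power is now the next power of a
        total = mat_add(total, power)
    return total

# Hypothetical example with ||a|| = 0.5 in the row-sum norm.
a = [[0.2, 0.3], [0.1, -0.4]]
one = identity(2)
n = 30

partial = neumann_partial_sum(a, n)
residual = mat_sub(one, mat_mul(partial, mat_sub(one, a)))  # equals a^(n+1)
bound = row_sum_norm(a) ** (n + 1)                          # right side of (B.615)
print(row_sum_norm(residual) <= bound + 1e-12)
```

Any other submultiplicative matrix norm with ‖*a*‖ < 1 would serve equally well; the row-sum norm is used only because it needs no eigenvalue computation.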

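To see that the closure *a*<sup>−</sup> of a closable operator *a* is indeed closed (!), suppose *f<sub>n</sub>* → *f* and *a f<sub>n</sub>* → *g*, with (*f<sub>n</sub>*) in *D*(*a*<sup>−</sup>) (where, by slight abuse of notation, *a f<sub>n</sub>* denotes *a*<sup>−</sup>*f<sub>n</sub>*). Since *f<sub>n</sub>* ∈ *D*(*a*<sup>−</sup>) for fixed *n*, there exists (*f*<sub>*m*,*n*</sub>) in *D*(*a*) such that lim<sub>*m*</sub> *f*<sub>*m*,*n*</sub> = *f<sub>n</sub>* and lim<sub>*m*</sub> *a f*<sub>*m*,*n*</sub> ≡ *g<sub>n</sub>* exists. Then clearly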

$$\lim\_{m,n} f\_{m,n} = f,\tag{B.619}$$

and we claim that

$$\lim\_{m,n} af\_{m,n} = g.\tag{B.620}$$

Namely, ‖*a f*<sub>*m*,*n*</sub> − *g*‖ ≤ ‖*a f*<sub>*m*,*n*</sub> − *a f<sub>n</sub>*‖ + ‖*a f<sub>n</sub>* − *g*‖. For ε > 0, take *n* so that the second term is < ε/2. For that *n*, the vectors *a*(*f*<sub>*m*,*n*</sub> − *f<sub>n</sub>*) converge, as *m* → ∞, since *a f*<sub>*m*,*n*</sub> → *g<sub>n</sub>* and *a f<sub>n</sub>* is independent of *m*. Also, recall that *f*<sub>*m*,*n*</sub> − *f<sub>n</sub>* → 0 as *m* → ∞. By assumption, *a* is closable, hence by definition one must have *a*(*f*<sub>*m*,*n*</sub> − *f<sub>n</sub>*) → 0 in *m*. Hence we may find *m* so that ‖*a f*<sub>*m*,*n*</sub> − *a f<sub>n</sub>*‖ < ε/2, so that ‖*a f*<sub>*m*,*n*</sub> − *g*‖ < ε, and (B.620) follows. Hence *f* ∈ *D*(*a*<sup>−</sup>). Finally, since *a*<sup>−</sup>*f* = lim<sub>*m*,*n*</sub> *a f*<sub>*m*,*n*</sub> one has *a*<sup>−</sup>*f* = *g* by (B.620), or *a*<sup>−</sup>*f* = lim<sub>*n*</sub> *a f<sub>n</sub>* by definition of *g*. Thus *a*<sup>−</sup> is closed.

#### §B.15. The spectral theorem

By (B.319), von Neumann algebras like *W*∗(*a*) are complete under strong convergence of *nets* (rather than merely *sequences*), and if some net is monotone increasing (or decreasing) and bounded, the strong limit equals the supremum (or infimum), as in Proposition B.98. This yields operatorial versions of (B.40) - (B.44):

$$e\_U = \sup \{ f(a) \mid f \in C\_c(U), 0 \le f \le 1\_{\sigma(a)} \};\tag{B.621}$$

$$e\_K = \inf\{f(a) \mid f \in C(\sigma(a)), 0 \le f \le 1\_{\sigma(a)}, f\_{|K} = 1\_K\};\tag{B.622}$$

$$e\_A = \inf \{ e\_U \mid U \supseteq A, U \in \mathcal{O}(\sigma(a)) \}; \tag{B.623}$$

$$e\_A = \sup \{ e\_K \mid K \subseteq A, K \in \mathcal{K}(\sigma(a)) \}, \tag{B.624}$$

where *U* ∈ O(σ(*a*)) is open, *K* ∈ K (σ(*a*)) is compact, and *A* ⊂ σ(*a*) is Borel.


## §B.16. Abelian ∗-algebras in *B*(*H*)

For an alternative proof of Proposition B.106, one observes that

$$\Psi \to \int\_0^1 f \, \Psi = \int\_0^1 b \, \Psi = \langle \sqrt{|\Psi|}, b\,\Psi/\sqrt{|\Psi|} \rangle \tag{B.625}$$

defines a *bounded* functional on *L*<sup>2</sup>(0,1) *seen as a dense subspace of L*<sup>1</sup>(0,1), and uses the duality *L*<sup>1</sup>(0,1)<sup>∗</sup> ≅ *L*<sup>∞</sup>(0,1). Indeed, using Cauchy–Schwarz, one has

$$\left| \int\_{0}^{1} f \, \Psi \right| = \left| \langle \sqrt{|\Psi|}, b\Psi/\sqrt{|\Psi|} \rangle \right| \le \|b\| \, \|\sqrt{|\Psi|}\|\_{2} \, \|\Psi/\sqrt{|\Psi|}\|\_{2} = \|b\| \, \|\Psi\|\_{1}. \tag{B.626}$$

## §B.17. Classification of maximal abelian ∗-algebras in *B*(*H*)

Theorem B.118 goes back to von Neumann (1931); for the details of the second proof see Kadison & Ringrose (1986), §9.4, or, very lucidly, Stevens (2016).

#### §B.20. The trace

The trace is often neglected in functional analysis books, except when these tend to quantum mechanics (Reed & Simon, 1972) or to operator algebras (Pedersen, 1989). Eqs. (B.476) - (B.477) and (B.496) reflect the function space dualities

$$\ell\_0(\mathbb{N})^\* \cong \ell^1(\mathbb{N});\tag{B.627}$$

$$\ell^1(\mathbb{N})^\* \cong \ell^\infty(\mathbb{N});\tag{B.628}$$

$$\ell^2(\mathbb{N})^\* \cong \ell^2(\mathbb{N}).\tag{B.629}$$

Similar to the ℓ<sup>*p*</sup>-spaces, one has Banach spaces *B<sub>p</sub>*(*H*) residing in *B*<sub>0</sub>(*H*) for each 1 ≤ *p* < ∞, called *Schatten–von Neumann ideals*, see e.g. Simon (2005).

#### §B.21. Spectral theory for unbounded self-adjoint operators

Our approach to unbounded operators via the bounded transform combines ideas from Kaufman (1978), Woronowicz (1991), Woronowicz & Napiórkowski (1992), Schmüdgen (2012), and Koliha (2014). The proof of Theorem B.159 via Lemma B.160 (due to Nelson, 1959) was suggested to the author by Nigel Higson. The last part of §B.21 was inspired by Lemma 5.2.8 in Pedersen (1989), in which we have simply replaced the Cayley transform by the bounded transform.

The idea of affiliating closed operators to von Neumann algebras goes back to von Neumann; our brief treatment is hopefully more appealing than the elaborate constructions in Kadison & Ringrose (1983), §5.6. A number of details were supplied in the M.Sc. Thesis of Christian Budde (2015); see also Budde & Landsman (2016).

For general C\*-algebras *A*, the multiplier algebra consists of all maps *m* : *A* → *A* for which there exists an adjoint *n* ≡ *m*<sup>∗</sup> : *A* → *A* such that *b*∗*m*(*a*) = *n*(*b*)∗*a*. Such maps are automatically linear and bounded, and *M*(*A*) is a C\*-algebra itself as a subalgebra of the Banach space *B*(*A*) of all bounded linear maps on *A*, enriched with the adjoint *m*<sup>∗</sup> = *n*. See, e.g., Lance (1995), or §C.10 below. For commutative C\*-algebras this reduces to the definition in the main text, which dates from Wang (1961). For unbounded multipliers see Woronowicz (1991) and Lance (1995); Woods (1979) treats the bounded case.

## Appendix C Operator algebras

This appendix provides a short course in operator algebras, building on the previous appendix. Indeed, there is surprisingly little algebra in the subject (so that there are hardly any prerequisites in that direction), and quite a lot of functional analysis, involving both operators on Hilbert space and more general Banach space theory.

Traditionally, the field of operator algebras has had two branches: C\*-algebras and von Neumann algebras. Although historically speaking the latter (invented by von Neumann in 1930) preceded the former (introduced by Gelfand and Naimark in 1943), the logical order of presentation is the opposite, since von Neumann algebras turned out to be special cases of C\*-algebras (with additional structure). Furthermore, for reasons in the foundations of quantum mechanics (as explained in the main text), beside von Neumann algebras we will discuss a few lesser known special cases of C\*-algebras, such as *scattered* C\*-algebras and AW\*-algebras.

#### C.1 Basic definitions and examples


$$\|ab\| \le \|a\|\,\|b\| \quad (a, b \in A). \tag{C.1}$$


$$\|a^\*a\| = \|a\|^2 \quad (a \in A).\tag{C.2}$$

With the same proof as (A.22), these axioms imply

$$\|a^\*\| = \|a\|. \tag{C.3}$$


The three main examples (at least for a first orientation) are:


$$\varphi(ab) = \varphi(a)\varphi(b);\tag{C.4}$$

$$\varphi(a^\*) = \varphi(a)^\*.\tag{C.5}$$

*2. An* isomorphism *between two C\*-algebras is an invertible homomorphism. If A and B are isomorphic as C\*-algebras in this sense, we write A* ≅ *B.*

It follows from linear algebra that the set-theoretic inverse of an invertible linear map ϕ : *A* → *B* is automatically linear. It is similarly easy to show that the inverse of an invertible homomorphism is itself a homomorphism, but it is a deeper fact about C\*-algebras that an isomorphism is automatically isometric (and hence has an isometric inverse); see Theorem C.62. Furthermore, if *B* = C, then the property ϕ(*a*∗) = ϕ(*a*)∗ follows from the other conditions on a homomorphism.

The following notion, originally inspired by quantum mechanics (and turned into mathematics by von Neumann), gives a geometric flavor to operator algebras.

Definition C.3. *A* state *on a C\*-algebra A is a bounded linear map* ω : *A* → C *that satisfies:*


If *A* has a unit, the definition of a state considerably simplifies.

Lemma C.4. *Let A be a C\*-algebra with unit and let* ω : *A* → C *be a linear map. Then* ω *is positive iff it is bounded and satisfies* ‖ω‖ = ω(1<sub>*A*</sub>)*.*

The proof requires some positivity theory in C\*-algebras, so we postpone it to §C.7, but as of now, we immediately infer that in the unital case we have:

Proposition C.5. *A linear map* ω : *A* → C *on a unital C\*-algebra is a state iff* ω *is positive and satisfies* ω(1<sub>*A*</sub>) = 1*, and hence iff* ω *is bounded with* ‖ω‖ = ω(1<sub>*A*</sub>) = 1*.*

Using the Banach–Alaoglu Theorem B.48, this implies that the *state space S*(*A*) of a unital C\*-algebra *A*, i.e., the set of all states on *A*, is a compact convex subset of *A*<sup>∗</sup> in its *w*<sup>∗</sup>-topology. Defining the *pure state space P*(*A*) of *A* as the extreme boundary ∂<sub>*e*</sub>*S*(*A*), the Krein–Milman Theorem B.50 almost immediately implies:

Theorem C.6. *Let A be a C\*-algebra with unit, having state space S*(*A*) *and pure state space P*(*A*) = ∂<sub>*e*</sub>*S*(*A*)*. Then P*(*A*) ≠ ∅ *and S*(*A*) = co(*P*(*A*))<sup>−</sup>*.*

In words, C\*-algebras have sufficiently many pure states to approximate general states arbitrarily well, at least in the *w*∗-topology (of "expectation values").

The only complication in applying Theorem B.50 to *K* = *S*(*A*) ⊂ *A*<sup>∗</sup> is that *A* is a complex Banach space, but the situation may be reduced to the real Banach space

$$A\_{\rm sa} = \{ a \in A \mid a^\* = a \}. \tag{C.6}$$

Lemma C.7. *Let A be a C\*-algebra with unit. If* ω ∈ *S*(*A*)*, then* ω(*a*<sup>∗</sup>) = $\overline{\omega(a)}$*.*

*Proof.* Using Definition C.3.2 and eq. (C.2), for any *a*<sup>∗</sup> = *a* and *t* ∈ R we have

$$|\omega(a+it)|^2 \le \|a+it\|^2 = \|(a-it)(a+it)\| = \|a^2+t^2\| \le \|a\|^2+t^2.\tag{C.7}$$

Writing ω(*a*) = α + *i*β, where α, β ∈ R, and using ω(1<sub>*A*</sub>) = 1, so that ω(*a* + *it*) = α + *i*(β + *t*), this gives α<sup>2</sup> + β<sup>2</sup> + 2β*t* ≤ ‖*a*‖<sup>2</sup> for all *t* ∈ R, which forces β = 0. This proves the claim for self-adjoint *a*. For the general case, one uses the following decomposition of *a* as a sum of two self-adjoint operators:

$$a = b + ic \quad (b^\* = b,\ c^\* = c); \tag{C.8}$$

$$b = \frac{1}{2}(a + a^\*), \ c = -\frac{1}{2}i(a - a^\*). \tag{C.9}$$

Consequently, we may restrict a state ω ∈ *S*(*A*) to a real-linear functional

$$\omega\_{\mathbb{R}} = \omega\_{|A\_{\rm sa}} : A\_{\rm sa} \to \mathbb{R} \tag{C.10}$$

that satisfies ω(1<sub>*A*</sub>) = 1 and ω(*a*<sup>2</sup>) ≥ 0 for any *a* ∈ *A*<sub>sa</sub>, where we used Theorem C.52 below to reformulate the positivity condition on states in terms of self-adjoint operators alone. Conversely, we may extend a state ω<sub>R</sub> on *A*<sub>sa</sub> to a state ω on *A* by

$$\omega(a) = \omega\_{\mathbb{R}}(b) + i \omega\_{\mathbb{R}}(c),\tag{C.11}$$

assuming (C.8) - (C.9). We then have ‖ω‖ = ‖ω<sub>R</sub>‖ = 1, since obviously ‖ω<sub>R</sub>‖ ≤ ‖ω‖ = 1 (since its sup-norm is computed on fewer operators), but also ω(1<sub>*A*</sub>) = 1. Thus we may regard *S*(*A*) as a compact convex set in the real Banach space *A*<sup>∗</sup><sub>sa</sub> rather than in the complex Banach space *A*<sup>∗</sup>, and Theorem B.50 applies. Alternatively, one could have extended the latter to the complex case, which is possible with a similar (lack of) effort as in the procedure above.

#### C.2 Gelfand isomorphism

The example *A* = *C*<sub>0</sub>(*X*) of a commutative C\*-algebra given in the previous section is more than that; as proved in the very first (1943) paper on C\*-algebras by Gelfand and Naimark (despite whom one often speaks of the *Gelfand* isomorphism), it is generic.

Theorem C.8. *Every commutative C\*-algebra A is isomorphic to C*0(*X*) *for some locally compact Hausdorff space X, which is unique up to homeomorphism.*

The proof is technically intricate at points, but the main idea is quite simple:


$$\hat{a}(\omega) = \omega(a) \quad (a \in A, \ \omega \in \Sigma(A)).\tag{C.12}$$


This picture becomes even more compelling from the following observation:

Lemma C.9. *For any (i.e. not necessarily commutative) C\*-algebra A we have* Σ(*A*) ⊂ *A*∗*. Furthermore, for any* ω ∈ Σ(*A*)*,*

$$\|\omega\| = 1,\tag{C.13}$$

*and if A has a unit,* 1<sub>*A*</sub>*, then also*

$$\omega(1\_A) = 1.\tag{C.14}$$

In other words, multiplicative linear functionals on *A* are automatically continuous (recall that *A*<sup>∗</sup> is the Banach space of continuous linear maps from *A* to C, see §B.9).

Throughout the rest of this section we restrict all proofs to the unital case; the general case may be handled by the technique of unitization to be discussed in §C.6.

*Proof.* Let ω ∈ Σ(*A*). By multiplicativity, ker(ω) is a two-sided ideal in *A*. Trivially, for any *a* ∈ *A*, we have *a* − ω(*a*) · 1<sub>*A*</sub> ∈ ker(ω). If this element were invertible, then ker(ω) would contain the unit 1<sub>*A*</sub> and hence would coincide with *A*, contradicting the definition of Σ(*A*) (which requires ω to be nonzero). Hence ω(*a*) ∈ σ(*a*). By the spectral radius formula (B.255) we have |ω(*a*)| ≤ ‖*a*‖, whence ω ∈ *A*<sup>∗</sup>.

Furthermore, ω(1<sub>*A*</sub>)<sup>2</sup> = ω(1<sub>*A*</sub>), whence ω(1<sub>*A*</sub>) = 1 or 0, the latter being excluded since it would imply that ω(*a*) = 0 for all *a* ∈ *A*. This gives (C.14) (which also follows from Lemma C.4, given Lemma C.11 below), which in turn gives (C.13). □

The Gelfand topology on Σ(*A*) coincides with the weak<sup>∗</sup> topology inherited from *A*<sup>∗</sup>, which is simply the topology of pointwise convergence (i.e. ω<sub>λ</sub> → ω iff ω<sub>λ</sub>(*a*) → ω(*a*) for each *a* ∈ *A*), and the Gelfand transform *a* ↦ *â* is (by abuse of notation) the image of *a* in *A*<sup>∗∗</sup> under the canonical injection *A* → *A*<sup>∗∗</sup> appearing in Proposition B.44, restricted (as a function on *A*<sup>∗</sup>) to the subset Σ(*A*) ⊂ *A*<sup>∗</sup>. From this perspective, continuity of *â* immediately follows from Proposition B.46.

This picture of the Gelfand topology also has a technical advantage, for we infer:

Lemma C.10. *If A is unital, then its Gelfand spectrum* Σ(*A*) *is compact Hausdorff.*

*Proof.* By Lemma C.9, Σ(*A*) lies in the unit ball of *A*<sup>∗</sup>, which by the Banach–Alaoglu Theorem is compact in its weak<sup>∗</sup> topology. So we are ready if we show that Σ(*A*) is a weak<sup>∗</sup>-closed subset of *A*<sup>∗</sup>, which is obvious from its definition: if ω<sub>λ</sub> → ω, then for any *a*, *b* ∈ *A* we obviously have

$$\omega(ab) = \lim\_{\lambda} \omega\_{\lambda}(ab) = \lim\_{\lambda} \omega\_{\lambda}(a)\omega\_{\lambda}(b) = \omega(a)\omega(b). \tag{C.15}$$

We now show that the Hausdorff property of Σ(*A*) is inherited from *A*<sup>∗</sup>. A subbasis of its weak<sup>∗</sup> topology is given by sets of the form

$$U\_a^\varepsilon(\varphi) = \{ \rho \in A^\* \mid |\rho(a) - \varphi(a)| < \varepsilon \},\tag{C.16}$$

where *a* ∈ *A*, ϕ ∈ *A*<sup>∗</sup>, and ε > 0. Replacing ρ ∈ *A*<sup>∗</sup> by ρ ∈ Σ(*A*) we thus obtain a subbasis of the Gelfand topology. If ω and ω′ are distinct points in Σ(*A*), there exists *a* ∈ *A* such that ω(*a*) ≠ ω′(*a*). Taking some 0 < ε < |ω(*a*) − ω′(*a*)|/2, the two points in question are separated by the opens *U*<sup>ε</sup><sub>*a*</sub>(ω) and *U*<sup>ε</sup><sub>*a*</sub>(ω′). □

It is immediate from the definition of Σ(*A*) that *a* → *a*ˆ is an algebra homomorphism, since we have

$$\widehat{ab}(\omega) = \omega(ab) = \omega(a)\omega(b) = \hat{a}(\omega)\hat{b}(\omega) = (\hat{a}\cdot\hat{b})(\omega).\tag{C.17}$$
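To see the homomorphism property (C.17) at work in a concrete case, one may take the commutative algebra of complex functions on the cyclic group Z<sub>*n*</sub>, with convolution as product. This example is our own choice and is not taken from the text, and all names in the sketch are hypothetical. Its multiplicative functionals are the *n* discrete Fourier characters, so the Gelfand transform reduces to the discrete Fourier transform. The code checks multiplicativity and compatibility with the involution numerically; norms play no role in this check.

```python
import cmath

def convolve(a, b):
    """Product in the group algebra of Z_n: (ab)(j) = sum_k a(k) b(j - k)."""
    n = len(a)
    return [sum(a[k] * b[(j - k) % n] for k in range(n)) for j in range(n)]

def involution(a):
    """Adjoint in the group algebra: a*(j) = conj(a(-j))."""
    n = len(a)
    return [complex(a[(-j) % n]).conjugate() for j in range(n)]

def character(k, n):
    """The functional omega_k(a) = sum_j a(j) exp(-2 pi i j k / n)."""
    def omega(a):
        return sum(a[j] * cmath.exp(-2j * cmath.pi * j * k / n)
                   for j in range(n))
    return omega

def gelfand_transform(a):
    """Values of a-hat on the n characters omega_0, ..., omega_{n-1}."""
    n = len(a)
    return [character(k, n)(a) for k in range(n)]

# Hypothetical test elements of the algebra for n = 4.
a = [1 + 2j, 0.5, -1j, 3.0]
b = [2.0, -1 + 1j, 0.25, 1j]

a_hat = gelfand_transform(a)
b_hat = gelfand_transform(b)
ab_hat = gelfand_transform(convolve(a, b))
a_star_hat = gelfand_transform(involution(a))

# (C.17): the transform of a product is the pointwise product of transforms.
multiplicative = all(abs(x - y * z) < 1e-9
                     for x, y, z in zip(ab_hat, a_hat, b_hat))
# The transform of the adjoint is the complex conjugate of the transform.
star_preserving = all(abs(x - y.conjugate()) < 1e-9
                      for x, y in zip(a_star_hat, a_hat))
print(multiplicative, star_preserving)
```

Here the spectrum Σ(*A*) is the finite set of *n* characters, so *C*(Σ(*A*)) is simply C<sup>*n*</sup> with pointwise operations.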

The fact that the Gelfand transform preserves the involution follows from:

Lemma C.11. *If* ω ∈ Σ(*A*)*, then* ω(*a*<sup>∗</sup>) = $\overline{\omega(a)}$*, and hence* $\widehat{a^\*}$ = (*â*)<sup>∗</sup>*.*

*Proof.* Using (C.14) and (C.2), the proof is the same as for Lemma C.7. □

The hard part of the proof of Theorem C.8 is isometricity of the Gelfand transform:

$$\|\hat{a}\|\_{\infty} = \|a\|. \tag{C.18}$$

As always, isometricity obviously implies *injectivity*. Surprisingly, using the Stone–Weierstrass Theorem B.51, in this case isometricity also yields *surjectivity* of the map *a* ↦ *â*. Namely, if we take *X* = Σ(*A*), and *B* to be the image *Â* of *A* under the Gelfand transform, then the conditions on *B* in Theorem B.51 are easily verified. Assuming (C.18), this image is obviously closed, so that *Â* = *C*(Σ(*A*)). With injectivity also implied by (C.18), it follows that the Gelfand transform is an isomorphism.

It remains to prove (C.18), which conceptually is a conjunction of two equalities:

$$\|\hat{a}\|\_{\infty} = r(a);\tag{C.19}$$

$$\|a\| = r(a) \ (a^\* = a),\tag{C.20}$$

where *r*(*a*) = sup{|λ|, λ ∈ σ(*a*)} is the spectral radius of *a*, see Theorem B.84. These immediately yield (C.18) for self-adjoint *a*, from which the general case follows from (C.2), noting that *a*<sup>∗</sup>*a* is self-adjoint for any *a*: assuming (C.19) - (C.20) as well as the homomorphism property of the Gelfand transform, we compute

$$\|\hat{a}\|\_{\infty}^{2} = \|\hat{a}^\*\hat{a}\|\_{\infty} = \|\widehat{a^\*a}\|\_{\infty} = \|a^\*a\| = \|a\|^2. \tag{C.21}$$

Since (C.20) just repeats (B.257), we already know it is true for general C\*-algebras (so far, with unit). As we shall now show, (C.19) holds in any commutative Banach algebra with unit. The key is the following lemma.

Lemma C.12. *Let A be a commutative Banach algebra with unit and let a* ∈ *A. For any* λ ∈ σ(*a*) *there is an element* ω ∈ Σ(*A*) *such that* λ = ω(*a*)*.*

Granted this, and using the proof of Lemma C.9 as well as (B.253), we obtain

$$
\sigma(a) = \sigma(\hat{a}),
\tag{C.22}
$$

for any *a* ∈ *A*. Given (B.254), this yields (C.19) and hence the Gelfand isomorphism.

There are two approaches to our crucial Lemma C.12, each having its own merits. The first and best known proof, going back to Gelfand himself, relies on the theory of (maximal) ideals in Banach algebras. It is based on the following identification:

Proposition C.13. *Let A be a commutative Banach algebra with unit. There is a bijective correspondence between* Σ(*A*) *and the set* M(*A*) *of maximal ideals in A,*

$$\omega \leftrightarrow \ker(\omega).\tag{C.23}$$

This will be proved in §C.8 below, which also contains the relevant background.

It implies Lemma C.12, as follows: if λ ∈ σ(*a*), then by definition *a*−λ is not invertible in *A*, so that *J* = {(*a*−λ)*b* | *b* ∈ *A*} is a proper ideal in *A*. By Zorn's Lemma (or Hausdorff's Maximality Theorem), applied to the partially ordered set of all proper ideals in *A* that contain *J* (ordered by inclusion), *J* is contained in some maximal ideal, so that *J* ⊆ ker(ω) for some ω ∈ Σ(*A*). Since *a*−λ ∈ *J* (take *b* = 1<sub>*A*</sub>), from (C.14) we obtain ω(*a*) = λ. Note the non-constructive nature of this argument!

The other line of proof, due to Kadison, uses a different characterization of Σ(*A*):

Proposition C.14. *Let A be a commutative C\*-algebra with unit. Then the Gelfand spectrum* Σ(*A*) *coincides with the pure state space P*(*A*)*.*

Recall Definition 1.10 and Theorem C.6; the pure state space *P*(*A*) = ∂*eS*(*A*) of a C\*-algebra *A* is defined as the boundary of the state space of *A*. The argument that instantly delivers Lemma C.12 from Proposition C.14, then, is as follows:

Proposition C.15. *Let A be a C\*-algebra with unit. For any normal element a* ∈ *A (i.e., aa*<sup>∗</sup> = *a*∗*a) and* λ ∈ σ(*a*)*, there is a pure state* ω ∈ *P*(*A*) *such that* ω(*a*) = λ*.*

The proof of both results uses some positivity theory for C\*-algebras, which is systematically developed in §C.7 below. Here, we just need that *a* ∈ *A* is positive, written *a* ≥ 0, iff *a* = *b*∗*b* for some *b* ∈ *A*, iff *a* is self-adjoint with σ(*a*) ⊂ [0,∞).

We write *a* ≥ *b* or *b* ≤ *a* if *a*−*b* is positive. Also, a linear functional ω : *A* → C is called positive iff ω(*a*) ≥ 0 for all *a* ≥ 0, and we write ω ≥ ϕ or ϕ ≤ ω if ω −ϕ ≥ 0.

Let us note that the proofs of these results in §C.7 use some Gelfand theory, but this use is limited to Theorem C.25, which could have been proved à la Theorem C.24, whose proof *derives* the Gelfand isomorphism in the special case at hand. Therefore, the use of Propositions C.14 and C.15 in the proof of (C.18) and hence of Theorem C.8 does not render this line of proof of the latter circular.

In particular, the proof of Proposition C.14 relies on:

Lemma C.16. *If a*<sup>∗</sup> = *a* ∈ *A there is a number t* ≥ 0 *such that t* ±*a* ≥ 0*.*

*Proof.* Since σ(*a*) ⊂ ℝ is compact (see Corollary C.27 and Theorem B.84), we have σ(*a*) ⊆ [−*t*,*t*] for some *t* ≥ 0. It is clear from the definition of σ(*a*) that σ(*t* ± *a*) = *t* ± σ(*a*), which yields the lemma by the criterion for positivity just stated. □

We now prove Proposition C.14.

*Proof.* It is clear from Lemma C.11 and eq. (C.14) that ω ∈ Σ(*A*) is a state. To show that ω is pure, we use the fact that for any state ω ∈ *S*(*A*), the expression

$$
\langle b, a \rangle = \omega(b^* a) \tag{C.24}
$$

defines a (positive semi-definite) hermitian form on *A*; the easy proof again uses Lemma C.11. Applying Cauchy–Schwarz with *b* = 1<sub>*A*</sub> and using 1<sub>*A*</sub><sup>∗</sup> = 1<sub>*A*</sub> = 1<sub>*A*</sub><sup>2</sup> gives

$$|\omega(a)|^2 \le \omega(a^*a). \tag{C.25}$$

Now suppose that ω = λω<sub>1</sub> + (1−λ)ω<sub>2</sub> with ω<sub>*i*</sub> ∈ *S*(*A*) and λ ∈ (0,1). Applying (C.25) (in the opposite direction) to ω<sub>1</sub> and ω<sub>2</sub> gives

$$\omega(a^*a) \ge \lambda |\omega_1(a)|^2 + (1-\lambda)|\omega_2(a)|^2. \tag{C.26}$$

On the other hand, multiplicativity of ω gives

$$\omega(a^*a) = \lambda^2|\omega_1(a)|^2 + \lambda(1-\lambda)\left(\omega_1(a)\overline{\omega_2(a)} + \omega_2(a)\overline{\omega_1(a)}\right) + (1-\lambda)^2|\omega_2(a)|^2.$$

Subtracting this from (C.26) gives the inequality 0 ≥ λ(1−λ)|ω<sub>1</sub>(*a*)−ω<sub>2</sub>(*a*)|<sup>2</sup> for each *a* ∈ *A*, so that ω<sub>1</sub> = ω<sub>2</sub>, and hence ω is pure by definition. This shows that Σ(*A*) ⊆ *P*(*A*).
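The subtraction can be spelled out as follows; this is only a worked version of the step just described, with the abbreviations *z*<sub>*i*</sub> = ω<sub>*i*</sub>(*a*) introduced here for readability:

$$
\begin{aligned}
\lambda|z_1|^2 + (1-\lambda)|z_2|^2 - |\lambda z_1 + (1-\lambda) z_2|^2
&= \lambda(1-\lambda)|z_1|^2 + \lambda(1-\lambda)|z_2|^2 - \lambda(1-\lambda)\left(z_1\overline{z_2} + z_2\overline{z_1}\right) \\
&= \lambda(1-\lambda)\,|z_1 - z_2|^2 .
\end{aligned}
$$

By (C.26) the first two terms on the left are bounded above by ω(*a*∗*a*), whereas multiplicativity identifies the subtracted term with ω(*a*∗*a*) itself, so the left-hand side is at most 0; since λ(1−λ) > 0, this forces *z*<sub>1</sub> = *z*<sub>2</sub>.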

To prove the converse inclusion, we need another lemma.

Lemma C.17. *Let* ω ∈ *P*(*A*) *be a pure state on A. If* τ : *A* → C *is a linear functional such that* 0 ≤ τ ≤ ω*, then we can find a scalar s* ∈ [0,1] *such that* τ = *s*ω*.*

*Proof.* We assume τ ≠ 0 and τ ≠ ω (otherwise the claim is trivially true). By Lemma C.16, this implies τ(1<sub>*A*</sub>) ≠ 0 and τ(1<sub>*A*</sub>) ≠ 1. For if τ(1<sub>*A*</sub>) = 0, then for *a*<sup>∗</sup> = *a* we find *t* as in Lemma C.16, so that *t* ± *a* ≥ 0 and hence 0 ≤ τ(*t* ± *a*) = ±τ(*a*). Hence τ(*a*) = 0 on each self-adjoint *a*, which forces τ = 0 by the usual decomposition (C.8). If τ(1<sub>*A*</sub>) = 1, we apply a similar argument to the positive functional ω − τ. Therefore, *t* = 1 − τ(1<sub>*A*</sub>) satisfies *t* ∈ (0,1), and defining ω<sub>1</sub> = (ω − τ)/*t* and ω<sub>2</sub> = τ/τ(1<sub>*A*</sub>) we obtain a decomposition ω = *t*ω<sub>1</sub> + (1 − *t*)ω<sub>2</sub>. Since ω is pure, this gives ω<sub>1</sub> = ω<sub>2</sub> = ω and hence τ = τ(1<sub>*A*</sub>)ω. Clearly, 0 ≤ τ ≤ ω enforces 0 ≤ τ(1<sub>*A*</sub>) ≤ 1, so the claim follows with *s* = τ(1<sub>*A*</sub>). □

We now prove that ω ∈ *P*(*A*) is multiplicative on arbitrary *a* ∈ *A*, and *b* ∈ *A* such that (for the moment) 0 ≤ *b* ≤ 1<sub>*A*</sub>. Define ω<sub>*b*</sub> : *A* → ℂ by ω<sub>*b*</sub>(*a*) = ω(*ab*). Then 0 ≤ ω<sub>*b*</sub> ≤ ω: taking *b* = *c*∗*c*, the first inequality 0 ≤ ω<sub>*b*</sub> follows from

$$\omega_b(a^*a) = \omega(c^*ca^*a) = \omega((ac)^*ac) \ge 0, \tag{C.27}$$

since *A* is abelian, and the second is analogous, using the fact that 0 ≤ *b* ≤ 1<sub>*A*</sub> implies 0 ≤ 1<sub>*A*</sub> − *b* ≤ 1<sub>*A*</sub>. Therefore, Lemma C.17 gives ω<sub>*b*</sub> = *s*ω with *s* = ω<sub>*b*</sub>(1<sub>*A*</sub>) = ω(*b*), i.e., ω(*ab*) = ω(*a*)ω(*b*).

For general 0 ≠ *b* ≥ 0, we rewrite *b* as *b* = ‖*b*‖ · (*b*/‖*b*‖), and use linearity of ω and the previous result to obtain multiplicativity. For general self-adjoint *b* we use Lemma C.53, and finally we use (C.8). □
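As a small illustration of Proposition C.14 (an example chosen here, not taken from the text), let *A* = *C*({1,2}) ≅ ℂ<sup>2</sup> be the algebra of functions on a two-point space. Every state then has the form ω<sub>*p*</sub>(*f*) = *p f*(1) + (1−*p*) *f*(2) with *p* ∈ [0,1], and for real-valued *f* one finds

$$
\omega_p(f^2) - \omega_p(f)^2 = p(1-p)\,\bigl(f(1) - f(2)\bigr)^2 .
$$

Hence ω<sub>*p*</sub> is multiplicative iff *p*(1−*p*) = 0, i.e., iff ω<sub>*p*</sub> is one of the two evaluations ev<sub>1</sub>, ev<sub>2</sub>; these are exactly the extreme points of the segment {ω<sub>*p*</sub> | *p* ∈ [0,1]}, so that Σ(*A*) = *P*(*A*) in this case.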

At last, we are now in a position to prove Proposition C.15, so let *a* ∈ *A* be normal.

*Proof.* Let *C*∗(*a*) be the commutative C\*-algebra generated by *a* (and hence *a*∗) and 1<sub>*A*</sub> within *A*; as in Theorem C.25 below, this is the norm-closure of all polynomials in *a* and *a*∗, and *C*(σ(*a*)) ≅ *C*∗(*a*) via the map *f*(λ, λ̄) ↦ *f*(*a*, *a*∗). Using Proposition C.14, define a pure state ω<sub>λ</sub> on *C*∗(*a*) by linear and multiplicative extension of ω<sub>λ</sub>(1<sub>*A*</sub>) = 1, ω<sub>λ</sub>(*a*) = λ, and ω<sub>λ</sub>(*a*∗) = λ̄, i.e., ω<sub>λ</sub>(*f*(*a*, *a*∗)) = *f*(λ, λ̄).

Since ‖ω<sub>λ</sub>‖ = 1, Hahn–Banach (Corollary B.41, with *V* = *A* and *W* = *C*∗(*a*)) yields a linear extension ω′<sub>λ</sub> : *A* → ℂ of ω<sub>λ</sub>, which is in fact a state by Lemma C.4. To show that ω′<sub>λ</sub> may be chosen to be pure also on *A*, let *S*<sub>λ</sub>(*A*) ⊂ *S*(*A*) be the set of all states on *A* that extend ω<sub>λ</sub>. This is a nonempty weak∗-closed and hence weak∗-compact convex subset of *S*(*A*), which by the Krein–Milman Theorem B.50 has nonempty boundary ∂<sub>*e*</sub>*S*<sub>λ</sub>(*A*). It is easy to show that ∂<sub>*e*</sub>*S*<sub>λ</sub>(*A*) ⊂ ∂<sub>*e*</sub>*S*(*A*) = *P*(*A*): for ω ∈ ∂<sub>*e*</sub>*S*<sub>λ</sub>(*A*), suppose ω = *t*ω<sub>1</sub> + (1−*t*)ω<sub>2</sub>, with *t* ∈ (0,1) and ω<sub>*i*</sub> ∈ *S*(*A*). Since ω|*C*∗(*a*) = ω<sub>λ</sub> is pure, we have ω<sub>1</sub>|*C*∗(*a*) = ω<sub>2</sub>|*C*∗(*a*) = ω<sub>λ</sub>, or ω<sub>*i*</sub> ∈ *S*<sub>λ</sub>(*A*). But ω was assumed pure in *S*<sub>λ</sub>(*A*), so that ω<sub>1</sub> = ω<sub>2</sub> = ω, i.e., ω ∈ ∂<sub>*e*</sub>*S*(*A*). Hence if we choose ω′<sub>λ</sub> ∈ ∂<sub>*e*</sub>*S*<sub>λ</sub>(*A*), then the extension ω′<sub>λ</sub> of ω<sub>λ</sub> is also pure on *A*. □

The following ingredients are still missing from the proof of Theorem C.8:


We start with the first issue, which we fill in more broadly than needed for the proof of Theorem C.8, namely, as part of a broader picture called *Gelfand duality* (which will fall into place if one uses the language of category theory, see Appendix E).

#### C.3 Gelfand duality

Theorem C.8 is a consequence of the following two propositions.

Proposition C.18. *Let A and B be unital commutative C\*-algebras. Then*

$$
\varphi = \alpha^*, \tag{C.28}
$$

*where* α∗(ω) = ω ◦α*, establishes a bijective correspondence between unital homomorphisms* α : *A* → *B and continuous maps* ϕ : Σ(*B*) → Σ(*A*)*.*

*In particular,* Σ(*A*) *and* Σ(*B*) *are homeomorphic iff A and B are isomorphic.*

*Proof.* Since α(*ab*) = α(*a*)α(*b*), it is clear that α∗(ω) ∈ Σ(*A*) whenever ω ∈ Σ(*B*). Conversely, denoting the pertinent Gelfand transforms by *G*<sub>*A*</sub> : *A* → *C*(Σ(*A*)) and *G*<sub>*B*</sub> : *B* → *C*(Σ(*B*)), given ϕ : Σ(*B*) → Σ(*A*), we define α : *A* → *B* by

$$
\alpha = G_B^{-1} \circ \varphi^* \circ G_A, \tag{C.29}
$$

where ϕ<sup>∗</sup> : *C*(Σ(*A*)) → *C*(Σ(*B*)) is the pullback of ϕ (i.e., ϕ∗(*f*) = *f* ◦ϕ).

It is easy to verify that given ϕ, the map α defined in (C.29) returns ϕ through (C.28), whereas given α, the map ϕ defined in (C.28) returns α through (C.29). □

Proposition C.19. *For any compact Hausdorff space X, the evaluation map*

$$\mathrm{ev}: X \to \Sigma(C(X)); \tag{C.30}$$

$$\mathrm{ev}_x(f) = f(x), \tag{C.31}$$

*is a homeomorphism, so that*

$$
\Sigma(\mathcal{C}(X)) \cong X.\tag{C.32}
$$

*Proof.* Injectivity of ev immediately follows from Urysohn's lemma (which applies because a compact Hausdorff space is normal), which implies that *C*(*X*) separates points on *X* (i.e., for all *x* ≠ *y* there is an *f* ∈ *C*(*X*) for which *f*(*x*) ≠ *f*(*y*)).

To prove surjectivity, suppose there is ω ∈ Σ(*C*(*X*)) such that ω ≠ ev<sub>*x*</sub> for all *x* ∈ *X*. Now ker(ω) = ker(ev<sub>*x*</sub>) would imply ω = ev<sub>*x*</sub> (because ω(*f*) = λ then implies *f* − λ · 1<sub>*X*</sub> ∈ ker(ω), and hence *f*(*x*) = λ, and *vice versa*), so ker(ω) ≠ ker(ev<sub>*x*</sub>). Since ev<sub>*x*</sub> ∈ Σ(*C*(*X*)), and ω ∈ Σ(*C*(*X*)) by assumption, by Proposition C.13 both kernels are maximal ideals in *C*(*X*), and hence ker(ω) ⊂ ker(ev<sub>*x*</sub>) is impossible (and so is the opposite inclusion). Therefore, for each *x* there is a function *f*<sub>*x*</sub> ∈ ker(ω) for which *f*<sub>*x*</sub>(*x*) ≠ 0 (for otherwise *f*(*x*) = 0 for all *f* ∈ ker(ω), so that ker(ω) ⊆ ker(ev<sub>*x*</sub>)). Redefining *f*<sub>*x*</sub> by a phase if necessary, we may assume that *f*<sub>*x*</sub>(*x*) > 0, and taking the real part of *f*<sub>*x*</sub> if necessary, we may also assume that *f*<sub>*x*</sub> is real-valued.

For each *x*, the set *U*<sub>*x*</sub> where *f*<sub>*x*</sub> > 0 is open, because *f*<sub>*x*</sub> is continuous. This gives a covering {*U*<sub>*x*</sub>}<sub>*x*∈*X*</sub> of *X*, which by compactness has a finite subcovering {*U*<sub>*x*<sub>*n*</sub></sub>}<sub>*n*=1,...,*N*</sub>. Then define the function

$$f = \sum_{n=1}^{N} f_{x_n}, \tag{C.33}$$

which is strictly positive by construction, so that it is invertible. But ker(ω) is an ideal, so that, with all *f*<sub>*x*<sub>*n*</sub></sub> ∈ ker(ω) (since all *f*<sub>*x*</sub> ∈ ker(ω)) also *f* ∈ ker(ω). But an ideal containing an invertible element must contain 1<sub>*X*</sub> and hence coincides with *C*(*X*), contradicting the fact that ker(ω) was maximal. Hence ev is surjective.

Finally, to prove that ev is a homeomorphism, we equip *X* with the topology induced by ev, in which the open sets are of the form ev<sup>−1</sup>(*U*), with *U* open in Σ(*C*(*X*)) in the Gelfand topology. We claim that this new topology on *X* is weaker than the original one (this terminology includes the possibility that the two topologies in question coincide). Namely, for *f* ∈ *C*(*X*) one has *f̂* ◦ ev = *f*. Therefore, since the Gelfand topology on Σ(*C*(*X*)) is the weakest topology for which all Gelfand transforms *f̂* are continuous, the new topology on *X* is the weakest topology for which all *f* are continuous. But each *f* was already continuous with respect to the given topology, so the claim follows. Without proof we now state a result from topology:

Lemma C.20. *If a set X is Hausdorff in some topology* O1(*X*) *and compact in a topology* O2(*X*)*, and if* O1(*X*) ⊆ O2(*X*)*, then* O1(*X*) = O2(*X*)*.*

Since *X* is in fact compact and Hausdorff in both topologies, we conclude from this lemma that the new topology on *X* must coincide with the original one. □
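For a finite discrete space the bijectivity of ev can be checked by brute force. The following is a minimal sketch under our own conventions (the function names `characters` and `evaluation` are hypothetical and not part of the text): a linear functional on *C*(*X*) ≅ ℂ<sup>*n*</sup> is fixed by its values *w*<sub>*i*</sub> on the indicator functions *e*<sub>*i*</sub>, and since *e*<sub>*i*</sub><sup>2</sup> = *e*<sub>*i*</sub>, multiplicativity forces *w*<sub>*i*</sub> ∈ {0,1}, so only finitely many candidates need testing.

```python
from itertools import product


def characters(n):
    """Nonzero multiplicative linear functionals on C(X) for X = {0, ..., n-1}.

    A functional is encoded by the tuple w of its values on the indicator
    functions e_i. Because e_i * e_j = delta_ij * e_i, multiplicativity on the
    basis (and hence, by bilinearity, everywhere) means
    w[i] * w[j] == w[i] if i == j else 0.
    """
    found = []
    for w in product((0, 1), repeat=n):
        multiplicative = all(
            w[i] * w[j] == (w[i] if i == j else 0)
            for i in range(n)
            for j in range(n)
        )
        if multiplicative and any(w):  # exclude the zero functional
            found.append(w)
    return found


def evaluation(x, n):
    """The evaluation functional ev_x of (C.31), in the same encoding."""
    return tuple(1 if i == x else 0 for i in range(n))


n = 4
all_evaluations = [evaluation(x, n) for x in range(n)]
print(sorted(characters(n)) == sorted(all_evaluations))
```

In this toy case the characters found are precisely the *n* evaluations, in line with (C.32); the topological part of Proposition C.19 is of course invisible for a finite discrete space.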

Uniqueness of the Gelfand spectrum up to homeomorphism follows from Propositions C.18 and C.19: if *A* is a unital commutative C\*-algebra for which *A* ∼= *C*(*X*) as well as *A* ∼= *C*(*Y*), then applying Σ and using (C.32) makes *X* and *Y* both homeomorphic to Σ(*A*), and hence to each other.

With minor changes, the proof of Proposition C.19 also applies to "well-behaved" manifolds, by which we mean *second countable smooth locally compact Hausdorff manifolds*. These are the ones encountered in physics (especially in classical mechanics); we need this for Theorem 3.10 in the main text. Such manifolds admit partitions of unity subordinate to any given cover (*U*<sub>λ</sub>) that are locally finite as well as countable, i.e., sequences of smooth functions χ<sub>*n*</sub> : *X* → [0,1] such that:


Furthermore, Σ(*C*<sup>∞</sup>(*X*)) is defined as for any complex associative algebra *A*, i.e., as the set of nonzero multiplicative linear maps ω : *C*<sup>∞</sup>(*X*) → ℂ.

Proposition C.21. *For any second countable smooth locally compact Hausdorff manifold X, the evaluation map* ev : *X* → Σ(*C*<sup>∞</sup>(*X*)) *in* (C.31) *is a bijection.*

*Proof.* Since *X* is not necessarily compact, we cannot use Urysohn's Lemma directly to prove that *C*<sup>∞</sup>(*X*) separates points of *X* (so that ev is injective), but this time, if *U* ⊆ *X* is open and *F* ⊂ *U* is closed, there exists a smooth function χ : *X* → [0,1] such that χ = 1 on *F* and χ = 0 on *X* ∖ *U*. Indeed, {*U*, *X* ∖ *F*} is an open cover of *X*, and if (χ<sub>*U*</sub>, χ<sub>*X*∖*F*</sub>) is a partition of unity subordinate to this cover, χ = χ<sub>*U*</sub> will do. Now for *x* ≠ *y*, take *F* = {*x*} and use the Hausdorff property to separate (*x*, *y*) by disjoint open sets (*U*, *V*), and we have χ(*x*) = 1 whilst χ(*y*) = 0.

The proof of surjectivity is the same as for *C*(*X*), including the proof that ker(ω) is a maximal ideal in *C*∞(*X*), until the point (C.33) is reached. Here compactness is no longer available, so that we need to replace (C.33) by the expression

$$f = \sum_{n} c_{n} \chi_{n} f_{x_{n}}, \tag{C.34}$$

where (χ<sub>*n*</sub>) is a smooth partition of unity subordinate to the cover (*U*<sub>*x*</sub>), for each *n* ∈ ℕ, *f*<sub>*x*<sub>*n*</sub></sub> is picked by no. 3 in the list of properties of a partition of unity listed above, and the coefficients *c*<sub>*n*</sub> are chosen so that 0 < *c*<sub>*n*</sub> < (*n*<sup>2</sup>‖χ<sub>*n*</sub> *f*<sub>*x*<sub>*n*</sub></sub>‖<sub>∞</sub>)<sup>−1</sup> (note that χ<sub>*n*</sub> and hence χ<sub>*n*</sub> *f*<sub>*x*<sub>*n*</sub></sub> has compact support and is continuous, so that it is bounded). Since ∑<sub>*n*</sub>(1/*n*<sup>2</sup>) < ∞, the insertion of the *c*<sub>*n*</sub> makes *f* bounded and the sum (C.34) uniformly convergent, which is necessary to pull ω through the sum so as to prove that *f* ∈ ker(ω), as follows. Since the sup-norm is not defined on all of *C*<sup>∞</sup>(*X*), we need a little argument here. Take *t* > ‖*f*‖<sub>∞</sub>, so that *t* · 1<sub>*X*</sub> ± *f* nowhere vanishes and hence is invertible, so that ω(*t* · 1<sub>*X*</sub> ± *f*) = *t* ± ω(*f*) ≠ 0 by multiplicativity of ω, i.e., ±ω(*f*) ≠ *t*. Since *f* and hence ω(*f*) is real, this gives |ω(*f*)| ≤ ‖*f*‖<sub>∞</sub>. Since ω(*f*<sub>*x*<sub>*n*</sub></sub>) = 0, and similarly for each finite sum in (C.34), we finally obtain

$$|\omega(f)| = \left| \omega\left(f - \sum_{n=1}^{N} c_n \chi_n f_{x_n}\right) \right| \le \left\| f - \sum_{n=1}^{N} c_n \chi_n f_{x_n} \right\|_{\infty}, \tag{C.35}$$

so letting *N* → ∞ gives ω(*f*) = 0, or *f* ∈ ker(ω). Since *f* is invertible, this implies 1<sub>*X*</sub> ∈ ker(ω) and hence ker(ω) = *C*<sup>∞</sup>(*X*), contradicting ω ≠ 0. □
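The uniform convergence used in the last step can be made explicit. By the choice of the coefficients, each term satisfies ‖*c*<sub>*n*</sub>χ<sub>*n*</sub>*f*<sub>*x*<sub>*n*</sub></sub>‖<sub>∞</sub> < 1/*n*<sup>2</sup>, so a sketch of the tail estimate (using the elementary comparison 1/*n*<sup>2</sup> < 1/(*n*(*n*−1)), which is our addition) reads

$$
\left\| f - \sum_{n=1}^{N} c_n \chi_n f_{x_n} \right\|_{\infty}
\le \sum_{n=N+1}^{\infty} \| c_n \chi_n f_{x_n} \|_{\infty}
< \sum_{n=N+1}^{\infty} \frac{1}{n^2}
< \sum_{n=N+1}^{\infty} \left( \frac{1}{n-1} - \frac{1}{n} \right)
= \frac{1}{N},
$$

which tends to 0 as *N* → ∞, so that the right-hand side of (C.35) indeed vanishes in the limit.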

Corollary C.22. *Let X and Y be compact Hausdorff spaces. Then* α(*f*) = *f* ◦ϕ*, i.e.,*

$$
\alpha = \varphi^*,
\tag{\text{C.36}}
$$

*establishes a canonical bijective correspondence between unital homomorphisms* α : *C*(*Y*) → *C*(*X*) *(as C\*-algebras) and continuous maps* ϕ : *X* → *Y . In particular, C*(*X*) *and C*(*Y*) *are isomorphic iff X and Y are homeomorphic.*

*Likewise, if X and Y are second countable smooth locally compact Hausdorff manifolds, eq.* (C.36) *gives a canonical bijective correspondence between homomorphisms* α : *C*<sup>∞</sup>(*Y*) → *C*<sup>∞</sup>(*X*) *(as commutative algebras) and smooth maps* ϕ : *X* → *Y. In particular, C*<sup>∞</sup>(*X*) *and C*<sup>∞</sup>(*Y*) *are isomorphic iff X and Y are diffeomorphic.*

*Proof.* The passage from ϕ to α is obvious. We write ev<sub>*X*</sub> : *X* → Σ(*C*(*X*)) and ev<sub>*Y*</sub> : *Y* → Σ(*C*(*Y*)) for the bijections previously just called ev. Since these maps are invertible by the previous proposition, we may define a map ϕ : *X* → *Y* by

$$\varphi = \mathrm{ev}_{Y}^{-1} \circ \alpha^{*} \circ \mathrm{ev}_{X}, \tag{C.37}$$

where α<sup>∗</sup> : Σ(*C*(*X*)) → Σ(*C*(*Y*)) is defined by α∗(ω) = ω ◦α; this lies in Σ(*C*(*Y*)), because α is linear and α(*f g*) = α(*f*)α(*g*). Eq. (C.36) then holds by construction.
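For finite discrete spaces, where every map is continuous, the correspondence (C.36) - (C.37) can be tried out directly. The sketch below is illustrative only: functions are stored as dictionaries, and the names `pullback`, `ev`, `pointwise_product` and `recover_map`, as well as the sample spaces and the map `phi`, are our own hypothetical choices.

```python
X = ["x0", "x1", "x2"]
Y = ["y0", "y1"]
phi = {"x0": "y0", "x1": "y1", "x2": "y1"}  # sample map phi : X -> Y


def pullback(phi):
    """alpha = phi^* : C(Y) -> C(X), alpha(f) = f o phi, cf. (C.36)."""
    return lambda f: {x: f[phi[x]] for x in phi}


def ev(point):
    """Evaluation functional ev_point(f) = f(point), cf. (C.31)."""
    return lambda f: f[point]


def pointwise_product(f, g):
    return {k: f[k] * g[k] for k in f}


def recover_map(alpha, X, Y):
    """Rebuild phi from alpha, cf. (C.37).

    ev_x o alpha is the evaluation at exactly one y in Y; that y is detected
    by testing ev_x o alpha on the indicator function of each point of Y.
    """
    recovered = {}
    for x in X:
        for y in Y:
            indicator = {z: (1 if z == y else 0) for z in Y}
            if ev(x)(alpha(indicator)) == 1:
                recovered[x] = y
    return recovered


alpha = pullback(phi)
f = {"y0": 2 + 1j, "y1": -3}
g = {"y0": 0.5, "y1": 4j}

is_unital = alpha({y: 1 for y in Y}) == {x: 1 for x in X}
is_multiplicative = alpha(pointwise_product(f, g)) == pointwise_product(alpha(f), alpha(g))
print(is_unital, is_multiplicative, recover_map(alpha, X, Y) == phi)
```

The last check reflects the fact that ev<sub>*x*</sub> ◦ α = ev<sub>ϕ(*x*)</sub>, which is what (C.37) expresses.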


We now state *Gelfand duality*, explaining its categorical interpretation in §E.1.

	- *If* ϕ *is the identity, then so is C*(ϕ)*.*
	- *If* ψ : *Y* → *Z is another continuous map, then C*(ϕ ◦ψ) = *C*(ψ) ◦*C*(ϕ)*.*
	- *If* α *is the identity, then so is* Σ(α)*.*
	- *If* β : *B* →*C is another unital homomorphism, then* Σ(β ◦α) = Σ(α) ◦Σ(β)*.*

$$\mathrm{ev}_{X}: X \xrightarrow{\cong} \Sigma(C(X)); \tag{C.38}$$

$$G_A: A \xrightarrow{\cong} C(\Sigma(A)), \tag{C.39}$$

*with the following "naturality" properties:*

• *If* Σ ◦*C*(ϕ) : Σ(*C*(*X*)) → Σ(*C*(*Y*)) *is the map induced by* ϕ : *X* → *Y , then*

$$
\Sigma \circ C(\varphi) \circ \mathrm{ev}_X = \mathrm{ev}_Y \circ \varphi; \tag{C.40}
$$

• *If C*◦Σ(α) : *C*(Σ(*A*)) → *C*(Σ(*B*)) *is the map induced by* α : *A* → *B, then*

$$C \circ \Sigma(\alpha) \circ G_A = G_B \circ \alpha. \tag{C.41}$$

*Proof.* The proof is an assembly of previous results and routine verifications. □

In the language of category theory, Theorem C.23 states that the categories CH of compact Hausdorff spaces (with continuous functions as arrows) and CCA<sup>1</sup> of commutative unital C\*-algebras (with unital homomorphisms as arrows, cf. Definition C.2) are dual (i.e., contravariantly equivalent). In particular, we have an adjunction between the functors *C* : CH → CCA<sup>1</sup> and Σ : CCA<sup>1</sup> → CH.

#### C.4 Gelfand isomorphism and spectral theory

As an example of Gelfand's theory, Theorem 4.3 may be reformulated as follows:

Theorem C.24. *Let H be a Hilbert space, and let a* = *a*<sup>∗</sup> ∈ *B*(*H*)sa*, with associated (commutative) C\*-algebra C*∗(*a*) *generated by a and* 1*H. The Gelfand spectrum* Σ(*C*∗(*a*)) *of C*∗(*a*) *is homeomorphic to* σ(*a*)*, under the mutually inverse maps*

$$
\Sigma(C^*(a)) \xrightarrow{\cong} \sigma(a), \;\; \omega \mapsto \omega(a); \tag{C.42}
$$

$$
\sigma(a) \xrightarrow{\cong} \Sigma(C^*(a)), \;\; \lambda \mapsto \omega_{\lambda} : f(a) \mapsto f(\lambda). \tag{C.43}
$$

*In particular, the image of the map* ω ↦ ω(*a*) *from* Σ(*C*∗(*a*)) *to* ℂ *is* σ(*a*)*, and the isomorphism C*∗(*a*) → *C*(σ(*a*))*, f*(*a*) ↦ *f, of Theorem B.94 is obtained by composing the Gelfand transform f*(*a*) ↦ *f*(*a*)ˆ *from C*∗(*a*) *to C*(Σ(*C*∗(*a*))) *with the isomorphism C*(Σ(*C*∗(*a*))) ≅ *C*(σ(*a*)) *obtained by pulling back the map* (C.43)*.*

*Proof.* First, we note that the map (C.43) is well defined. Indeed, it follows from (B.289) that the map ω<sub>λ</sub> : *C*∗(*a*) → ℂ is linear for any λ ∈ σ(*a*), whilst the following computation, which uses (B.290), implies that ω<sub>λ</sub> is multiplicative:

$$\omega_{\lambda}(f(a)g(a)) = \omega_{\lambda}((fg)(a)) = (fg)(\lambda) = f(\lambda)g(\lambda) = \omega_{\lambda}(f(a))\,\omega_{\lambda}(g(a)). \tag{C.44}$$

Injectivity of the map λ ↦ ω<sub>λ</sub> holds because σ(*a*) is Hausdorff, so that *f*(λ′) = *f*(λ) for each *f* ∈ *C*(σ(*a*)) implies λ′ = λ. Surjectivity follows from (B.253), since

$$\sigma_{C^*(a)}(f(a)) = \sigma_{C(\sigma(a))}(f) = \text{Ran}(f), \tag{C.45}$$

where we used invariance of the spectrum under isomorphisms. Consider the function *f*(*x*) = *x*, so that *f*(*a*) = *a*. It follows from (C.43) that ω<sub>λ</sub>(*a*) = λ. Conversely, using the same function *f*, for given ω ∈ Σ(*C*∗(*a*)) we find ω<sub>ω(*a*)</sub> = ω, so that the maps in (C.42) - (C.43) are mutually inverse. It is clear from (C.42) - (C.43) that ω<sub>λ<sub>*i*</sub></sub> → ω<sub>λ</sub> in the Gelfand topology on Σ(*C*∗(*a*)) (which is the topology of pointwise convergence) iff *f*(λ<sub>*i*</sub>) → *f*(λ) for each *f* ∈ *C*(σ(*a*)), which is the case iff λ<sub>*i*</sub> → λ on σ(*a*). Hence both of our maps Σ(*C*∗(*a*)) ↔ σ(*a*) are continuous.

The final claim is a definition chase, using the computation

$$\widehat{f(a)}(\omega_{\lambda}) = \omega_{\lambda}(f(a)) = f(\lambda).$$

If dim(*H*) < ∞, one may replace this proof by using the fact that σ(*a*) consists of the eigenvalues of *a*. If *p* is a polynomial, then ω ∈ Σ(*C*∗(*a*)) must satisfy ω(*p*(*a*)) = *p*(ω(*a*)). The characteristic polynomial *p*<sub>*c*</sub> of *a*, i.e., *p*<sub>*c*</sub>(*x*) = ∏<sub>*i*=1</sub><sup>*n*</sup>(λ<sub>*i*</sub> − *x*), where the λ<sub>*i*</sub> are the *n* = dim(*H*) eigenvalues of *a* (including repetitions), satisfies *p*<sub>*c*</sub>(*a*) = 0, so that ω(*p*<sub>*c*</sub>(*a*)) = 0, i.e., ∏<sub>*i*=1</sub><sup>*n*</sup>(λ<sub>*i*</sub> − ω(*a*)) = 0, and hence ω(*a*) = λ<sub>*i*</sub> for some *i*, or ω(*a*) ∈ σ(*a*). Thus (C.42) is well defined. In the opposite direction, eqs. (A.53) - (A.55) show that (C.43) is also well defined, in that indeed ω<sub>λ</sub> ∈ Σ(*C*∗(*a*)).
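The finite-dimensional argument can be tried on a concrete matrix. The following sketch is an illustration under our own assumptions: the 2×2 matrix `a` is an arbitrary example with eigenvalues 1 and 3, and the helper names are hypothetical. It uses that *C*∗(*a*) is here the span of 1 and *a*, so that a unital linear functional is fixed by λ = ω(*a*), and multiplicativity reduces to ω(*a*<sup>2</sup>) = ω(*a*)<sup>2</sup>.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


IDENTITY = [[1, 0], [0, 1]]
a = [[2, 1], [1, 2]]  # illustrative self-adjoint matrix, eigenvalues 1 and 3

trace = a[0][0] + a[1][1]
det = a[0][0] * a[1][1] - a[0][1] * a[1][0]

# Cayley-Hamilton for 2x2 matrices: a^2 = trace * a - det * 1, i.e. p_c(a) = 0.
a_squared = matmul(a, a)
cayley_hamilton_holds = all(
    a_squared[i][j] == trace * a[i][j] - det * IDENTITY[i][j]
    for i in range(2)
    for j in range(2)
)


def admissible_values(trace, det, candidates):
    """Values lam = omega(a) compatible with omega(a^2) = omega(a)^2.

    On span{1, a} a unital linear functional reads omega(x*1 + y*a) = x + y*lam,
    hence omega(a^2) = trace*lam - det, which multiplicativity equates to lam^2.
    """
    return [lam for lam in candidates if lam * lam == trace * lam - det]


spectrum_from_characters = admissible_values(trace, det, range(-10, 11))
print(cayley_hamilton_holds, spectrum_from_characters)  # True [1, 3]
```

The admissible values are the roots of the characteristic polynomial, so that ω(*a*) ∈ σ(*a*) as claimed; searching over a small integer range is merely a convenience for this particular example.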

The construction of *C*∗(*a*) as a C\*-algebra within *B*(*H*) may trivially be generalized to arbitrary unital C\*-algebras *A*, i.e., if *a* ∈ *A*, we define *C*∗(*a*) as the C\*-algebra generated (within *A*) by *a* and the unit 1<sub>*A*</sub>. If *a* = *a*∗, then *C*∗(*a*) still equals the norm-closure of the algebra of all polynomials in *a*, and hence *C*∗(*a*) is once again commutative. Defining the spectrum σ(*a*) as in Definition B.81, we then have the following generalization of Theorem C.24:

Theorem C.25. *Let A be a unital C\*-algebra and let a*<sup>∗</sup> = *a* ∈ *A. Then*

$$
\Sigma(C^*(a)) \cong \sigma(a), \ \omega \leftrightarrow \omega(a); \tag{C.46}
$$

$$C^*(a) \cong C(\sigma(a)), \ f(a) \leftrightarrow f, \tag{C.47}$$

*as spaces and as (commutative) C\*-algebras, respectively. Under the Gelfand isomorphism* (C.47)*, the Gelfand transform â of a* ∈ *C*∗(*a*) *is the identity* id<sub>σ(*a*)</sub> : λ ↦ λ*, whereas the Gelfand transform* 1̂<sub>*A*</sub> *of* 1<sub>*A*</sub> ∈ *C*∗(*a*) *is the unit* 1<sub>σ(*a*)</sub> : λ ↦ 1*.*

This *continuous functional calculus* may be proved in exactly the same way as Theorems B.94 and C.24, with *B*(*H*) replaced by *A*. However, these proofs did not invoke Gelfand's Theorem (but rather derived it in the special case at hand), so it may give additional insight into the situation if we reprove Theorem C.25 from Theorem C.8.

*Proof.* We now *assume* the isomorphism *C*∗(*a*) ≅ *C*(Σ(*C*∗(*a*))) via the Gelfand transform. According to (C.22) and (B.253), which imply σ(*â*) = ran(*â*), the function *â* : Σ(*C*∗(*a*)) → ℂ is surjective onto the spectrum σ(*a*) ⊂ ℂ. We now prove injectivity. If ω<sub>1</sub>, ω<sub>2</sub> ∈ Σ(*C*∗(*a*)) and ω<sub>1</sub>(*a*) = ω<sub>2</sub>(*a*), then, for all *n* ∈ ℕ, we have

$$
\omega_1(a^n) = \omega_1(a)^n = \omega_2(a)^n = \omega_2(a^n). \tag{C.48}
$$

Since also ω<sub>1</sub>(1<sub>*A*</sub>) = ω<sub>2</sub>(1<sub>*A*</sub>) = 1, we conclude by linearity that ω<sub>1</sub> = ω<sub>2</sub> on all polynomials in *a*. By continuity (cf. Lemma C.9) this implies that ω<sub>1</sub> = ω<sub>2</sub>, since by definition the linear span of all polynomials is dense in *C*∗(*a*). Using (C.12), we have therefore proved that *â*(ω<sub>1</sub>) = *â*(ω<sub>2</sub>) implies ω<sub>1</sub> = ω<sub>2</sub>, i.e., *â* is injective.

Since *â* ∈ *C*(Σ(*C*∗(*a*))) by Theorem C.8, *â* is continuous. To prove continuity of the inverse, recall that *â* : Σ(*C*∗(*a*)) → σ(*a*) is the map ω ↦ ω(*a*), so that for λ ∈ σ(*a*), the functional *â*<sup>−1</sup>(λ) ∈ Σ(*C*∗(*a*)) maps *a* to λ. By multiplicativity, *â*<sup>−1</sup>(λ) then maps *a*<sup>*n*</sup> to λ<sup>*n*</sup>. Hence by linearity and (C.14), for polynomials *p* in *a* one has

$$\hat{a}^{-1}(\lambda) : p(a) \mapsto p(\lambda). \tag{C.49}$$

Since polynomials are continuous, if λ*<sup>n</sup>* → λ in σ(*a*), then *p*(λ*n*) → *p*(λ), so

$$(\hat{a}^{-1}(\lambda_n))(p(a)) \to (\hat{a}^{-1}(\lambda))(p(a)). \tag{C.50}$$

Since such polynomials *p*(*a*) are dense in *C*∗(*a*) by definition, and functionals in Σ(*C*∗(*a*)), being continuous, are therefore determined by their values on polynomials, we conclude that *â*<sup>−1</sup>(λ<sub>*n*</sub>) → *â*<sup>−1</sup>(λ) pointwise. Since the Gelfand topology is the topology of pointwise convergence, we conclude that *â*<sup>−1</sup> is continuous, so that *â* is a homeomorphism. This proves (C.46).

Finally, for compact Hausdorff spaces *X* and *Y*, a homeomorphism ϕ : *X* → *Y* induces an isomorphism ϕ<sup>∗</sup> : *C*(*Y*) → *C*(*X*) of C\*-algebras, where ϕ<sup>∗</sup>(*f*) = *f* ◦ϕ (cf. §C.3). Theorem C.8 and (C.46) give (C.47). Unfolding the latter isomorphism gives

$$C^*(a) \stackrel{\text{GT}}{\longrightarrow} C(\Sigma(C^*(a))) \stackrel{(\hat{a}^{-1})^*}{\longrightarrow} C(\sigma(a)), \tag{C.51}$$

where GT is the Gelfand transform and (*â*<sup>−1</sup>)<sup>∗</sup> is the pullback of the homeomorphism *â*<sup>−1</sup> : σ(*a*) → Σ(*C*∗(*a*)), as in ϕ<sup>∗</sup> above. Following these arrows and using (C.49), one obtains the last claim. □

Corollary C.26. *Let A be a unital C\*-algebra and let a*<sup>∗</sup> = *a* ∈ *A, with spectrum* σ(*a*)*. For each f* ∈ *C*(σ(*a*))*, there is an operator f*(*a*) ∈ *A, which is the obvious expression when f is a polynomial (and in general is given via the uniform approximation of f by polynomials), such that*

$$\|f(a)\| = \|f\|_{\infty}; \tag{C.52}$$

$$
\sigma(f(a)) = f(\sigma(a)). \tag{C.53}
$$

*Eq.* (C.53) *is called the* spectral mapping property*. Furthermore, the norm and spectrum of a as an element of A coincide with the norm and spectrum of a in C*∗(*a*)*.*

*Proof.* We write (C.51) in the opposite direction, i.e.,

$$C(\sigma(a)) \stackrel{(\omega\mapsto\omega(a))^{*}}{\longrightarrow} C(\Sigma(C^{*}(a))) \stackrel{\hat{b}\mapsto b}{\longrightarrow} C^{*}(a). \tag{C.54}$$

Indeed, if *f̃* ∈ *C*(Σ(*C*∗(*a*))) is the image of *f* ∈ *C*(σ(*a*)) under the first arrow, then *f̃*(ω) = *f*(ω(*a*)), and the second arrow says that *f*(*a*)ˆ = *f̃*. Together these give *f*(ω(*a*)) = ω(*f*(*a*)), which by multiplicativity, linearity, and (C.14), is the case for polynomials *f* = *p*; the general case follows from the polynomial case by continuity.

Eq. (C.52) follows from (C.18) and the fact that also the first arrow in (C.54) is an isometry, and (C.53) follows from (C.22), with *a* replaced by *f*(*a*).

To close, take *f* = id<sub>σ(*a*)</sub>; then (C.52) gives ‖*a*‖<sub>*A*</sub> = *r*(*a*), cf. (B.257), whilst (C.18) gives ‖*a*‖<sub>*C*∗(*a*)</sub> = *r*(*a*), too. Finally, (C.47) and (B.253) show that the spectrum of *a* in *C*∗(*a*) is σ(*a*), which by definition is its spectrum in *A*. □

Corollary C.27. *If a*<sup>∗</sup> = *a, then* σ(*a*) ⊂ R*.*

By Corollary C.26, we may take the spectrum of *a* in *C*∗(*a*). By Lemma C.11, the Gelfand transform *â* is real-valued. Then use the last part of Theorem C.25. □

Corollary C.28. *The norm in a C\*-algebra is unique (given all other structure).*

Using (B.257) for *a* = *a*∗, and then (C.2), for arbitrary *a* ∈ *A* we find

$$\|a\| = \sqrt{r(a^*a)}. \tag{C.55}$$

Since the spectrum (and hence the spectral radius *r*) is determined by the algebraic structure, (C.55) shows that the norm is determined by the algebraic structure. □
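A standard small example (our choice, not from the text) shows why (C.55) involves *a*∗*a* rather than *a* itself. In the C\*-algebra of complex 2×2 matrices, take

$$
a = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
a^* a = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.
$$

Since *a*<sup>2</sup> = 0 one has σ(*a*) = {0} and hence *r*(*a*) = 0, although *a* ≠ 0; so the equality of norm and spectral radius is special to self-adjoint (more generally, normal) elements. On the other hand σ(*a*∗*a*) = {0,1}, so that *r*(*a*∗*a*) = 1 and (C.55) gives ‖*a*‖ = 1, which is indeed the operator norm of *a*.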

#### C.5 C\*-algebras without unit: general theory

In classical physics, non-compact phase spaces are described by commutative C\*-algebras *without unit*. Proper ideals in C\*-algebras necessarily lack a unit, too. To set the stage, we first assume that *A* is a Banach algebra, and form the vector space

$$
\dot{A} = A \oplus \mathbb{C},
\tag{C.56}
$$

and turn this into an algebra in the obvious way, i.e., by means of

$$(a + \lambda \cdot 1_{\dot{A}})(b + \mu \cdot 1_{\dot{A}}) = ab + \lambda b + \mu a + \lambda \mu \cdot 1_{\dot{A}}, \tag{C.57}$$

where we have written *a* + λ · 1<sub>*Ȧ*</sub> for (*a*,λ), etc. This turns the number 1 in ℂ into a unit 1<sub>*Ȧ*</sub> for *Ȧ*, and this is the point: *Ȧ* is unital, even if *A* lacks a unit. Defining

$$\|a + \lambda \cdot 1_{\dot{A}}\| = \|a\| + |\lambda|, \tag{C.58}$$

we also have a norm on *Ȧ*, with ‖1<sub>*Ȧ*</sub>‖ = 1. Using (C.1), (C.57), and (C.58), we have

$$\begin{aligned} \|(a+\lambda \cdot 1_{\dot{A}})(b+\mu \cdot 1_{\dot{A}})\| &\leq \|a\|\,\|b\| + |\lambda|\,\|b\| + |\mu|\,\|a\| + |\lambda|\,|\mu| \\ &= \|a+\lambda \cdot 1_{\dot{A}}\|\,\|b+\mu \cdot 1_{\dot{A}}\|, \end{aligned}$$

so that *Ȧ* is a *Banach algebra with unit*. Since by (C.58) the norm of *a* ∈ *A* in *A* coincides with the norm of *a* + 0 · 1<sub>*Ȧ*</sub> in *A* ⊕ ℂ, we have shown the following:

Proposition C.29. *For every Banach algebra A (with or without unit) there exists a unital Banach algebra Ȧ, called the* unitization *of A, and an isometric (hence injective) morphism A* → *Ȧ, such that Ȧ*/*A* ≅ ℂ*.*

If *A* is a C\*-algebra, (C.58) fails to be a C\*-norm with respect to the involution

$$(a + \lambda \cdot 1_{\dot{A}})^* = a^* + \overline{\lambda} \cdot 1_{\dot{A}}, \tag{C.59}$$

since (C.2) is not satisfied. Instead, the correct norm in which *A* ⊕ C is a unital C\*-algebra is the one borrowed from *B*(*A*), i.e., the Banach space of bounded linear maps from *A* to *A* (regarded as a Banach space), relying on an embedding *A* ⊂ *B*(*A*):

Proposition C.30. *Let A be a C\*-algebra (with or without unit).*

*1. The map L* : *A* → *B*(*A*)*, a* ↦ *L*<sub>*a*</sub>*, given by*

$$L_a(b) = ab \tag{C.60}$$

*establishes an isometric isomorphism between A and L*(*A*) ⊂ *B*(*A*)*.*

*2. When A has no unit, define a norm on Ȧ* = *A* ⊕ ℂ *by*

$$\|a + \lambda \cdot 1_{\dot{A}}\| = \|L_a + \lambda \cdot 1_{B(A)}\|, \tag{C.61}$$

*where the right-hand side uses the operator norm in B*(*A*)*. With the operations (C.57) and (C.59), the norm (C.61) turns Ȧ into a C\*-algebra with unit.*

*Proof.* By (C.1) we have ‖*L*<sub>*a*</sub>*b*‖ = ‖*ab*‖ ≤ ‖*a*‖ ‖*b*‖ for all *b*, so that ‖*L*<sub>*a*</sub>‖ ≤ ‖*a*‖. On the other hand, using (C.2) and (A.22), assuming *a* ≠ 0, we can write

$$\|a\| = \|aa^*\|/\|a\| = \left\|L_a \frac{a^*}{\|a\|}\right\| \le \|L_a\|. \tag{C.62}$$

Hence

$$\|L_a\| = \|a\|. \tag{C.63}$$

Being isometric, the map *L* must be injective; it is clearly a homomorphism, so that we have proved the first claim of the proposition.

It is clear from (C.57) and (C.59) that the map *a* + λ · 1*A*˙ → *La* + λ · 1*B*(*H*) is a homomorphism. Hence the norm (C.61) satisfies (C.1), for this is satisfied in the Banach algebra *B*(*A*). In order to prove that the norm (C.61) satisfies (C.2), we note that if an involution on a Banach algebra *<sup>A</sup>* satisfies *a*<sup>2</sup> ≤ *a*∗*a*, then *<sup>A</sup>* is a C\*-algebra, because substituting *<sup>a</sup> <sup>a</sup>*<sup>∗</sup> gives *a*∗<sup>2</sup> ≤ *aa*∗≤*aa*∗, i.e., *a*∗≤*a*, so that *a*∗*a*≤*a*<sup>2</sup> and hence *a*<sup>2</sup> <sup>=</sup> *a*∗*a*.

Thus it suffices to show that for each *a* ∈ *A* and λ ∈ C we have

$$\|\|L\_a + \lambda \cdot 1\_{\dot{A}}\|\|^2 \le \|(L\_a + \lambda \cdot 1\_{\dot{A}})^\*(L\_a + \lambda \cdot 1\_{\dot{A}})\|\|.\tag{C.64}$$

To prove (C.64), we note that by definition of the norm in *B*(*A*), for given *T* ∈ *<sup>B</sup>*(*A*) and <sup>ε</sup> <sup>&</sup>gt; 0, there exists a *<sup>b</sup>* <sup>∈</sup> *<sup>A</sup>*, with *b* <sup>=</sup> 1, such that *T*<sup>2</sup> <sup>−</sup><sup>ε</sup> ≤ *T*(*b*)2. Applying this with *T* = *La* +λ · 1*A*˙, we infer that for every ε > 0 one has

$$\|\|L\_a + \lambda \cdot 1\_{\dot{A}}\|\|^2 - \varepsilon \le \|(L\_a + \lambda \cdot 1\_{\dot{A}})b\|\|^2 = \|ab + \lambda b\|\|^2 = \|(ab + \lambda b)^\*(ab + \lambda b)\|\|.$$

Here we used (C.2) in *A*. Using (C.60), the right-hand side may be rearranged as

$$\|L\_{b^\*}L\_{a^\* + \overline{\lambda} \cdot 1\_{\dot{A}}}L\_{a + \lambda \cdot 1\_{\dot{A}}}b\| \le \|L\_{b^\*}\| \, \|(L\_a + \lambda \cdot 1\_{\dot{A}})^\*(L\_a + \lambda \cdot 1\_{\dot{A}})\| \, \|b\|.\tag{C.65}$$

Since ‖*L*<sub>*b*∗</sub>‖ = ‖*b*<sup>∗</sup>‖ = ‖*b*‖ = 1 by (C.63) and (A.22), and ‖*b*‖ = 1 also in the last term, the inequality (C.64) follows by letting ε → 0. □

Hence the C\*-algebraic version of Theorem C.29, slightly supplemented, is:

Theorem C.31. *For every C\*-algebra A, there is a unique unital C\*-algebra Ȧ and an isometric (hence injective) morphism A → Ȧ, such that Ȧ/A ≅ ℂ. Moreover, any homomorphism α : A → B extends to a* unital *homomorphism α̇ : Ȧ → Ḃ by*

$$
\dot{\alpha}(a+\lambda\cdot 1\_{\dot{A}}) = \alpha(a) + \lambda\cdot 1\_{\dot{B}}.\tag{C.66}
$$

*Proof.* Uniqueness of *Ȧ* follows from Corollary C.28; the rest is obvious. □

This is very important, if only for the following reason:

Definition C.32. *Let A be a C\*-algebra without unit. Then the spectrum σ(a) of any a ∈ A consists of all λ ∈ ℂ for which the operator a − λ is* not *invertible in Ȧ.*

Proposition C.33. *If A has no unit, then* 0 ∈ σ(*a*) *for any a* ∈ *A.*

*Proof.* If 0 ∉ σ(*a*), i.e., if *a* were invertible in *Ȧ*, then *a*<sup>−1</sup> = *b* + μ · 1<sub>*Ȧ*</sub>, for some *b* ∈ *A* and μ ∈ ℂ. Then 1<sub>*Ȧ*</sub> = *aa*<sup>−1</sup> = *ab* + μ*a* ∈ *A*. This is a contradiction. □

The spectral theory of compact operators provides a nice illustration of this proposition: see Theorem B.136.4. At the commutative end of the operator-algebraic world, we have the obvious fact that if *X* is not compact, no *f* ∈ *C*0(*X*) is invertible.

The construction of *Ȧ* through (C.56), (C.57), (C.59), and (C.61) also works *verbatim* if *A* already has a unit 1<sub>*A*</sub>, in which case the spectrum σ(*a*) of *a* ∈ *A* may be compared with the spectrum σ(*ȧ*) of its image *ȧ* ≡ (*a*, 0) in *Ȧ*.

Lemma C.34. *Let A be a C\*-algebra with unit, embedded in Ȧ. For any a ∈ A, the spectrum σ(a) in A is related to the spectrum σ(ȧ) of its image ȧ ≡ (a, 0) in Ȧ by*

$$
\sigma(\dot{a}) = \sigma(a) \cup \{0\}. \tag{C.67}
$$

This will be important for the proof of the fundamental Theorem C.62 below.

*Proof.* Suppose 0 ≠ *z* ∈ ρ(*a*), so that *b* ≡ (*a* − *z* · 1<sub>*A*</sub>)<sup>−1</sup> exists and satisfies

$$ab - zb = ba - zb = 1\_A. \tag{C.68}$$

Then *b*′ = *b* + *z*<sup>−1</sup> · (1<sub>*A*</sub> − 1<sub>*Ȧ*</sub>) satisfies *ab*′ − *zb*′ = *b*′*a* − *zb*′ = 1<sub>*Ȧ*</sub>, so that *b*′ = (*ȧ* − *z* · 1<sub>*Ȧ*</sub>)<sup>−1</sup> exists in *Ȧ*, and hence *z* ∈ ρ(*ȧ*). Conversely, if 0 ≠ *z* ∈ ρ(*ȧ*) with corresponding *b*′ as before, then we first form *b*″ = *b*′ − *z*<sup>−1</sup> · (1<sub>*A*</sub> − 1<sub>*Ȧ*</sub>), which satisfies (C.68) but may not lie in *A*. If *b*″ = *b* + β · 1<sub>*Ȧ*</sub>, where *b* ∈ *A* and β ∈ ℂ, this is remedied by redefining *b*‴ = *b*″ + β · (1<sub>*A*</sub> − 1<sub>*Ȧ*</sub>), which lies in *A* and is inverse to *a* − *z* · 1<sub>*A*</sub>. Furthermore, by the proof of Proposition C.33 with *a* ⇝ *ȧ*, we always have 0 ∈ σ(*ȧ*). If 0 ∈ σ(*a*), then the above argument gives σ(*ȧ*) = σ(*a*), which is a special case of (C.67). If 0 ∉ σ(*a*), then (C.67) follows as it stands. □
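Lemma C.34 can be checked by hand in the smallest unital case *A* = ℂ. The following Python sketch is an illustration only: it assumes that the product on *A* ⊕ ℂ is the usual unitization product (*a*, λ)(*b*, μ) = (*ab* + λ*b* + μ*a*, λμ), and the class name `Unitized` and its methods are hypothetical helpers introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unitized:
    """Element a + lam * 1_dot of the unitization of A = C, stored as (a, lam)."""
    a: complex
    lam: complex

    def __mul__(self, other):
        # Assumed unitization product: (a, l)(b, m) = (ab + l*b + m*a, l*m).
        return Unitized(self.a * other.a + self.lam * other.a + other.lam * self.a,
                        self.lam * other.lam)

    def minus_scalar(self, z):
        """Return self - z * 1_dot, where 1_dot = (0, 1) is the new unit."""
        return Unitized(self.a, self.lam - z)

    def is_invertible(self, tol=1e-12):
        # (a, lam) -> (a + lam, lam) identifies the unitization with C (+) C
        # (componentwise product), so invertibility means both entries are nonzero.
        return abs(self.a + self.lam) > tol and abs(self.lam) > tol


def in_spectrum(x, z):
    """True when x - z * 1_dot fails to be invertible in the unitization."""
    return not x.minus_scalar(z).is_invertible()


a = 2 + 1j                 # element of A = C; its spectrum in A is {a}
a_dot = Unitized(a, 0)     # its image (a, 0) in the unitization

# Resolvent of a in A at z = 1, and the corrected element b' from the proof.
z = 1
b = 1 / (a - z)
b_prime = Unitized(b + 1 / z, -1 / z)       # b + z^{-1} (1_A - 1_dot)
product = a_dot.minus_scalar(z) * b_prime   # should be the unit (0, 1)
```

In this model `in_spectrum(a_dot, z)` holds exactly for *z* = *a* and *z* = 0, in line with (C.67), and `product` reproduces the unit (0, 1) of the unitization up to rounding.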

To close this section, we introduce the technique of approximate units, which will play a decisive role in the theory of ideals in C\*-algebras (see §C.9). Let us first give an example. For any noncompact space *X*, the C\*-algebra *C*<sub>0</sub>(*X*) has no unit (the unit would be 1<sub>*X*</sub>, which does not vanish at infinity because it is constant). There is a certain substitute for the absentee unit, though. Take *X* = ℝ for simplicity, and pick a sequence of functions 1<sub>*n*</sub> ∈ *C*<sub>0</sub>(ℝ), *n* ∈ ℕ, that take the value 1 on [−*n*, *n*] and vanish for |*x*| > *n* + 1. It is clear that one does not have 1<sub>*n*</sub> → 1<sub>ℝ</sub> in the sup-norm, but instead one has lim<sub>*n*→∞</sub> ‖1<sub>*n*</sub> *f* − *f*‖<sub>∞</sub> = 0 for all *f* ∈ *C*<sub>0</sub>(ℝ). More generally, one puts:

Definition C.35. *An* approximate unit *in a non-unital C\*-algebra A indexed by some directed set Λ is a family {1<sub>λ</sub>}<sub>λ∈Λ</sub> of selfadjoint elements of A, such that*

$$\|\mathbf{1}\_{\lambda}\| \le 1,\tag{\text{C.69}}$$

*and, for each a* ∈ *A,*

$$\lim\_{\lambda \to \infty} ||1\_{\lambda}a - a|| = \lim\_{\lambda \to \infty} ||a1\_{\lambda} - a|| = 0. \tag{C.70}$$

Here the limit is meant in the sense of convergence of the nets λ ↦ ‖1<sub>λ</sub>*a* − *a*‖ and λ ↦ ‖*a*1<sub>λ</sub> − *a*‖ in ℝ indexed by Λ (i.e., for each open neighbourhood *U* of 0 in ℝ there is some λ<sub>*U*</sub> ∈ Λ such that ‖1<sub>λ</sub>*a* − *a*‖ ∈ *U* for all λ ≥ λ<sub>*U*</sub>, etc.).
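The example *X* = ℝ that motivates Definition C.35 can be explored numerically. The sketch below is only an illustration: the piecewise-linear choice of 1<sub>*n*</sub>, the sample function *f*(*x*) = 1/(1 + *x*<sup>2</sup>), and the grid-based `sup_norm` are assumptions made here, and a sampled maximum only approximates a sup-norm.

```python
def bump(n, x):
    """Continuous 1_n: equals 1 on [-n, n], vanishes for |x| >= n + 1, linear in between."""
    return min(1.0, max(0.0, n + 1 - abs(x)))


def f(x):
    """Sample element of C_0(R)."""
    return 1.0 / (1.0 + x * x)


def sup_norm(g, radius=200.0, steps=40001):
    """Approximate the sup-norm of g by sampling a uniform grid on [-radius, radius]."""
    h = 2 * radius / (steps - 1)
    return max(abs(g(-radius + k * h)) for k in range(steps))


ns = (1, 5, 25, 100)
# ||1_n f - f||_inf shrinks as n grows ...
errors = [sup_norm(lambda x, n=n: bump(n, x) * f(x) - f(x)) for n in ns]
# ... whereas ||1_n - 1||_inf stays equal to 1 (the grid reaches beyond |x| = n + 1).
gaps = [sup_norm(lambda x, n=n: bump(n, x) - 1.0) for n in ns]
```

Since 1<sub>*n*</sub> *f* − *f* vanishes on [−*n*, *n*], each entry of `errors` is bounded by 1/(1 + *n*<sup>2</sup>), while every entry of `gaps` equals 1.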

Proposition C.36. *Every non-unital C\*-algebra A has an approximate unit {1<sub>λ</sub>}<sub>λ∈Λ</sub>. When A is separable, one may choose the directed set Λ countable (i.e. Λ = ℕ).*

*Proof.* One takes Λ to be the set of all finite subsets of *A* (or, if *A* is separable, of a countable dense subset of *A*), partially ordered by inclusion. Hence λ ∈ Λ is of the form λ = {*a*<sub>1</sub>, ..., *a<sub>n</sub>*}, from which we build the element *b*<sub>λ</sub> = ∑<sub>*i*</sub> *a<sub>i</sub>*<sup>∗</sup>*a<sub>i</sub>*. Clearly *b*<sub>λ</sub> is selfadjoint, and according to Theorem C.52 and Proposition C.51 one has σ(*b*<sub>λ</sub>) ⊂ ℝ<sup>+</sup>, so that *n*<sup>−1</sup>1<sub>*Ȧ*</sub> + *b*<sub>λ</sub> is invertible in the unitization *Ȧ* of *A*. Take

$$1\_{\lambda} = b\_{\lambda} \left( n^{-1} 1\_{\dot{A}} + b\_{\lambda} \right)^{-1}. \tag{C.71}$$

Since *b*<sub>λ</sub><sup>∗</sup> = *b*<sub>λ</sub> and *b*<sub>λ</sub> commutes with functions of itself like (*n*<sup>−1</sup>1<sub>*Ȧ*</sub> + *b*<sub>λ</sub>)<sup>−1</sup>, one has 1<sub>λ</sub><sup>∗</sup> = 1<sub>λ</sub>. Although (*n*<sup>−1</sup>1<sub>*Ȧ*</sub> + *b*<sub>λ</sub>)<sup>−1</sup> is computed in *Ȧ*, so that it is of the form *c* + μ1<sub>*Ȧ*</sub> (for some *c* ∈ *A* and μ ∈ ℂ), one has 1<sub>λ</sub> = *b*<sub>λ</sub>*c* + μ*b*<sub>λ</sub>, which lies in *A*. Using the continuous functional calculus (i.e. Theorem C.25) with *f*(*t*) = *t*/(*n*<sup>−1</sup> + *t*) on *b*<sub>λ</sub>, one sees from (C.53) and the positivity of *b*<sub>λ</sub> that σ(1<sub>λ</sub>) ⊂ [0,1]. This implies (C.69) because of (B.257). Putting *c<sub>i</sub>* = 1<sub>λ</sub>*a<sub>i</sub>* − *a<sub>i</sub>*, a simple computation shows that

$$\sum\_{i} c\_{i} c\_{i}^{\*} = n^{-2} b\_{\lambda} (n^{-1} 1\_{\dot{A}} + b\_{\lambda})^{-2}. \tag{C.72}$$

We now apply (C.52) with *a* ⇝ *b*<sub>λ</sub> and *f*(*t*) = *n*<sup>−2</sup>*t*(*n*<sup>−1</sup> + *t*)<sup>−2</sup>. Since *f* ≥ 0, and *f* assumes its maximum at *t* = 1/*n*, one has sup<sub>*t*∈ℝ<sup>+</sup></sub> |*f*(*t*)| = 1/4*n*. As σ(*b*<sub>λ</sub>) ⊂ ℝ<sup>+</sup>, it follows that ‖*f*‖<sub>∞</sub> ≤ 1/4*n*. Therefore, by (C.52) we have

$$||n^{-2}b\_{\lambda} \left(n^{-1}\mathbf{1}\_{\dot{A}} + b\_{\lambda}\right)^{-2}|| \leq 1/4n,\tag{C.73}$$

so that ‖∑<sub>*i*</sub> *c<sub>i</sub>c<sub>i</sub>*<sup>∗</sup>‖ ≤ 1/4*n* by (C.72). By Lemma C.37 below this implies that ‖*c<sub>i</sub>c<sub>i</sub>*<sup>∗</sup>‖ ≤ 1/4*n* for each *i* = 1, ..., *n*. Since any *a* ∈ *A* sits in some directed subset of Λ with *n* → ∞, eq. (C.2) implies

$$\lim\_{\lambda \to \infty} \left\| 1\_{\lambda}a - a \right\|^{2} = \lim\_{\lambda \to \infty} \left\| (1\_{\lambda}a - a)^{\*}(1\_{\lambda}a - a) \right\| = \lim\_{\lambda \to \infty} \left\| c\_{i}^{\*}c\_{i} \right\| = 0. \tag{C.74}$$

The other equality in (C.70) follows analogously. □

In this proof we used the following lemma.

Lemma C.37. *If a, b ∈ A<sup>+</sup> and ‖a + b‖ ≤ k, then ‖a‖ ≤ k.*

*Proof.* We first pass to the unitization *Ȧ* of *A*. By (C.83) we have *a* + *b* ≤ *k*1<sub>*Ȧ*</sub>, hence 0 ≤ *a* ≤ *k*1<sub>*Ȧ*</sub> − *b* by linearity of ≤ (see Proposition C.51 below), which also implies that *k*1<sub>*Ȧ*</sub> − *b* ≤ *k*1<sub>*Ȧ*</sub>, as 0 ≤ *b*. Hence, using −*k*1<sub>*Ȧ*</sub> ≤ 0 (since *k* ≥ 0), we obtain −*k*1<sub>*Ȧ*</sub> ≤ *a* ≤ *k*1<sub>*Ȧ*</sub>, from which ‖*a*‖ ≤ *k* by (C.84). □
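The computation behind (C.71) to (C.73) in the proof of Proposition C.36 can be checked numerically in a commutative toy model. The sketch below is an illustration under stated assumptions: elements are vectors in ℂ<sup>*m*</sup> with pointwise operations (so the order of the factors in each product is immaterial, and the model happens to be unital), the helper name `approximate_unit_data` is hypothetical, and the random data are arbitrary.

```python
import random


def approximate_unit_data(elements):
    """For lambda = {a_1, ..., a_n} (vectors in C^m, pointwise product) return
    1_lambda from (C.71) and both sides of (C.72), all evaluated pointwise."""
    n, m = len(elements), len(elements[0])
    b = [sum(abs(a[x]) ** 2 for a in elements) for x in range(m)]      # b_lambda
    one = [b[x] / (1.0 / n + b[x]) for x in range(m)]                   # (C.71)
    lhs = [sum(abs(one[x] * a[x] - a[x]) ** 2 for a in elements)        # sum_i c_i c_i^*
           for x in range(m)]
    rhs = [b[x] / (n * n * (1.0 / n + b[x]) ** 2) for x in range(m)]    # (C.72), right side
    return one, lhs, rhs


def f(n, t):
    """The function f(t) = n^{-2} t (n^{-1} + t)^{-2} used before (C.73)."""
    return t / (n * n * (1.0 / n + t) ** 2)


random.seed(0)
n, m = 4, 50
elements = [[complex(random.uniform(-3, 3), random.uniform(-3, 3)) for _ in range(m)]
            for _ in range(n)]
one, lhs, rhs = approximate_unit_data(elements)
bound = 1.0 / (4 * n)    # the bound 1/4n in (C.73)
```

Here `lhs` and `rhs` agree pointwise, every entry of `one` lies in [0, 1], and `max(lhs)` stays below `bound`, which `f` attains at *t* = 1/*n*.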

#### C.6 C\*-algebras without unit: commutative case

We still owe the reader a proof of Theorems C.8 and C.23 for the nonunital case.

In the commutative case, the unitization procedure has a simple topological meaning, which illustrates the general principle that the use of commutative C\*-algebras often allows one to trade topological properties for algebraic ones.

The *one-point compactification Ẋ* of a non-compact locally compact topological space *X* is the set *Ẋ* = *X* ∪ {∞}, topologized by the open sets in *X* plus those subsets of *X* ∪ {∞} whose complement is compact in *X*. The injection *i* : *X* → *Ẋ* is continuous, and any continuous function *f* ∈ *C*<sub>0</sub>(*X*) extends uniquely to a function *ḟ* ∈ *C*(*Ẋ*) satisfying *ḟ*(∞) = 0. The space *Ẋ* is the solution (unique up to homeomorphism) of a universal problem: if ϕ : *X* → *Y* is a map between locally compact Hausdorff spaces such that *Y*∖ϕ(*X*) is a point and ϕ is a homeomorphism onto its image, then there is a unique homeomorphism ψ : *Ẋ* → *Y* such that ϕ = ψ ◦ *i*. All this is true even when *X* is compact, in which case ∞ is an isolated point of *Ẋ*.

The unitization of *C*0(*X*) corresponds to the one-point compactification of *X*:

Lemma C.38. *Let X be a locally compact Hausdorff space. Then Ċ<sub>0</sub>(X) ≅ C(Ẋ).*

*Proof.* The map *c<sub>X</sub>* : *Ċ*<sub>0</sub>(*X*) → *C*(*Ẋ*) given by *c<sub>X</sub>*(*f* + λ · 1<sub>*Ȧ*</sub>) = *ḟ* + λ · 1<sub>*Ẋ*</sub> is obviously an injective homomorphism. To prove surjectivity, note that any *f* ∈ *C*(*Ẋ*) assumes the form *f* = *ḟ* + *f*(∞) · 1<sub>*Ẋ*</sub>, where *ḟ* = *f* − *f*(∞) · 1<sub>*Ẋ*</sub> is such that *ḟ*|<sub>*X*</sub> ∈ *C*<sub>0</sub>(*X*). Thus our map is an algebraic isomorphism, which by Theorem C.62 is also isometric. □
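The surjectivity step, and the fact that the map respects products, can be illustrated for *X* = ℝ. This is a minimal sketch, assuming the usual unitization product (*f*<sub>0</sub>, λ)(*g*<sub>0</sub>, μ) = (*f*<sub>0</sub>*g*<sub>0</sub> + λ*g*<sub>0</sub> + μ*f*<sub>0</sub>, λμ); the sample functions and the helper `split` are hypothetical and chosen only for this example.

```python
def split(f, value_at_infinity):
    """Write f in C(X_dot) as f0 + lam * 1 with f0 vanishing at infinity."""
    lam = value_at_infinity
    return (lambda x: f(x) - lam), lam


def f(x):
    return (2 * x * x + 1) / (x * x + 1)    # tends to 2 as |x| -> infinity


def g(x):
    return 3 - 1 / (1 + abs(x))             # tends to 3 as |x| -> infinity


f0, lam = split(f, 2.0)
g0, mu = split(g, 3.0)


def product_c0_part(x):
    """C_0-component of (f0, lam)(g0, mu) under the assumed unitization product."""
    return f0(x) * g0(x) + lam * g0(x) + mu * f0(x)


samples = [-1e6, -10.0, -1.0, 0.0, 0.5, 3.0, 1e6]
# Pointwise product fg versus the unitization product, reassembled as a function.
mismatch = max(abs(f(x) * g(x) - (product_c0_part(x) + lam * mu)) for x in samples)
tails = (abs(f0(1e6)), abs(g0(1e6)))        # both C_0-parts are small far out
```

The vanishing `mismatch` reflects that pointwise multiplication in *C*(*Ẋ*) corresponds to the unitization product on pairs (*f*<sub>0</sub>, λ).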

Lemma C.39. *Let A be a commutative C\*-algebra, with unitization Ȧ. Then the following map s<sub>A</sub> : Σ̇(A) → Σ(Ȧ) between their Gelfand spectra is a homeomorphism:*

*1. Each ω ∈ Σ(A) extends to a character ω̇ ≡ s<sub>A</sub>(ω) on Ȧ by*

$$
\dot{\omega}(a + \lambda 1\_{\dot{A}}) = \omega(a) + \lambda.\tag{C.75}
$$

*2. The following functional ω<sub>∞</sub> ≡ s<sub>A</sub>(∞) on Ȧ is a character of Ȧ:*

$$\omega\_{\infty}(a+\lambda 1\_{\dot{A}})=\lambda.\tag{C.76}$$

*3. There are no other characters on Ȧ (i.e. except ω<sub>∞</sub> and ω̇, where ω ∈ Σ(A)).*

*Proof.* Only the third part is nontrivial: any ω′ ∈ Σ(*Ȧ*) restricts to *A*; if this restriction is zero, then ω′ = ω<sub>∞</sub>, and if not, we have ω′ = ω̇ with ω = ω′|<sub>*A*</sub> ∈ Σ(*A*). □

We are now in a position to prove Theorem C.8 also in the nonunital case. Applying the unital case of Theorem C.8 to *A*˙ and using Lemma C.39, one finds

$$A \oplus \mathbb{C} = \dot{A} \cong C(\Sigma(\dot{A})) \cong C(\dot{\Sigma}(A)) \cong \dot{C}\_0(\Sigma(A)) = C\_0(\Sigma(A)) \oplus \mathbb{C}.\tag{C.77}$$

Keeping track of all isomorphisms, the initial ℂ is duly mapped to the final ℂ (as befits an isomorphism of unital C\*-algebras), and *A* is mapped to *C*<sub>0</sub>(Σ(*A*)). □

Next, we return to Theorem C.23. If *X* fails to be compact, the difficulty arises that a map ϕ : *X* → *Y* does not, in general, pull back to a morphism ϕ<sup>∗</sup> : *C*0(*Y*) → *C*0(*X*). For example, with *Y* equal to a point, any *f* ∈ *C*(*Y*) ∼= C pulls back to a constant function on *X*, which does not vanish at infinity. Hence some restriction is necessary on the class of allowed maps between locally compact Hausdorff spaces.

Definition C.40. *A map ϕ : X → Y between locally compact Hausdorff spaces is* proper *when ϕ<sup>−1</sup>(K) is compact for any compact set K ⊂ Y.*

Without proof (since this is basic topology), we list some properties of proper maps.

Lemma C.41. *Let* ϕ : *X* → *Y be a map between locally compact Hausdorff spaces.*


The algebraic (or "noncommutative") counterpart of a proper map is as follows.

Definition C.42. *A homomorphism α : A → B between C\*-algebras is called* nondegenerate *when α(A)B<sup>−</sup> = B, in other words, if α(A)B (i.e., the linear span of all expressions of the form α(a)b, a ∈ A, b ∈ B) is dense in B.*

For example, any unital homomorphism between unital C\*-algebras is trivially nondegenerate, and conversely, a nondegenerate homomorphism α : *A* → *B* between unital C\*-algebras is automatically unital. To see this, note that by (C.4) - (C.5), *e* = α(1<sub>*A*</sub>) is a projection in *B* (i.e., *e*<sup>2</sup> = *e*<sup>∗</sup> = *e*), so that α(*A*)*B* ⊆ *eB*. Since *B* = *eB* ⊕ (1<sub>*B*</sub> − *e*)*B* as a vector space, α(*A*)*B* and hence *eB* can only be dense in *B* when *e* = 1<sub>*B*</sub>. Similarly, using an approximate unit in *B* it is easy to show that nondegenerate homomorphisms *A* → *B* cannot exist if *A* is unital but *B* is not.

This is a "noncommutative" version of the third part of Lemma C.41 above.

Lemma C.43. *Let* ϕ : *X* →*Y be a continuous proper map between locally compact Hausdorff spaces. If f* ∈*C*0(*Y*)*, then f* ◦ϕ ∈*C*0(*X*)*, and the corresponding pullback* ϕ<sup>∗</sup> : *C*0(*Y*) → *C*0(*X*) *is a nondegenerate homomorphism of C\*-algebras.*

*Proof.* Let *f* ∈ *C*<sub>0</sub>(*Y*) and ε > 0, giving a compact *K* ⊂ *Y* such that |*f*(*y*)| < ε for each *y* ∉ *K*. Then *K*′ = ϕ<sup>−1</sup>(*K*) ⊂ *X* is compact, and |ϕ<sup>∗</sup>*f*(*x*)| < ε for each *x* ∉ *K*′.

For nondegeneracy, take *g* ∈ *C*<sub>0</sub>(*X*) and ε > 0; these yield a compact set *L* ⊂ *X* such that |*g*(*x*)| < ε for each *x* ∉ *L*. Then ϕ(*L*) ⊂ *Y* is compact, so Urysohn gives us *f* ∈ *C<sub>c</sub>*(*Y*) with 0 ≤ *f*(*y*) ≤ 1 for each *y* ∈ *Y* and *f*(*y*) = 1 for each *y* ∈ ϕ(*L*). Then:

$$\| (\phi^\* f) \cdot g - g \|\_{\infty} = \sup\_{x \notin L} \{ |f(\phi(x)) g(x) - g(x)| \} < 2\varepsilon.$$
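Why properness matters for the pullback can be seen numerically. In the following sketch (an illustration; the two maps and the sample *f* are chosen here and do not come from the text), ϕ(*x*) = *x*<sup>2</sup> is proper as a map ℝ → ℝ, whereas arctan is not, since the preimage of the compact set [−2, 2] under arctan is all of ℝ.

```python
import math


def f(y):
    """Sample element of C_0(R)."""
    return 1.0 / (1.0 + y * y)


def pullback(phi):
    """Return the function phi^* f = f o phi."""
    return lambda x: f(phi(x))


def tail(g, radius):
    """Largest sampled value of |g(x)| over a few points with |x| >= radius."""
    points = [s * radius * k for s in (-1, 1) for k in (1, 2, 10, 100)]
    return max(abs(g(x)) for x in points)


proper_pullback = pullback(lambda x: x * x)    # x -> x^2 is proper
improper_pullback = pullback(math.atan)        # arctan: R -> R is not proper

radii = (10.0, 100.0, 1000.0)
proper_tails = [tail(proper_pullback, r) for r in radii]
improper_tails = [tail(improper_pullback, r) for r in radii]
limit = f(math.pi / 2)    # value approached by f(arctan(x)) as x -> infinity
```

The sampled tails of the proper pullback shrink towards 0, while those of the improper one stay near `limit` > 0, so that *f* ◦ arctan does not vanish at infinity.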

The (commutative) C\*-algebraic counterpart of this lemma is as follows:

Lemma C.44. *Let* α : *A* → *B be a nondegenerate homomorphism between commutative C\*-algebras. If* ω ∈ Σ(*B*)*, then* ω ◦ α ∈ Σ(*A*)*, and the ensuing pullback* α<sup>∗</sup> : Σ(*B*) → Σ(*A*) *is a continuous proper map between the two Gelfand spectra.*

*Proof.* Multiplicativity of ω ◦ α is clear, as α is a homomorphism. If ω ◦ α were identically zero, then (since ω is not), α(*a*) = 0 for each *a* ∈ *A*, which contradicts the assumption that α be nondegenerate. Continuity of α∗ follows from the fact that the Gelfand topology is the topology of pointwise convergence. Finally, in the present context, properness of α∗ is most appropriately derived as follows:


This suggests the following generalization of Theorem C.23:


$$\mathrm{ev}\_{X}: X \xrightarrow{\cong} \Sigma(C\_{0}(X));\tag{C.78}$$

$$G\_{A} : A \xrightarrow{\cong} C\_0(\Sigma(A)),\tag{C.79}$$

*with similar naturalness properties as the corresponding maps in Theorem C.23.*

Categorically speaking, Theorem C.23 thus expanded states that *the category* LCHp *of locally compact Hausdorff spaces and proper continuous maps is dual to the category* CCAn *of commutative C\*-algebras and nondegenerate homomorphisms*.

*Proof.* Parts 1 and 2 are Lemmas C.43 and C.44, respectively; correct composition of the maps in question is easily checked (as simply as in the unital case).

Eq. (C.79) has already been proved, cf. (C.77). Similarly, using Proposition C.19 (with *X* ⇝ *Ẋ*) and Lemma C.39 (with *A* ⇝ *C*<sub>0</sub>(*X*)), we have

$$X \cup \{\infty\} = \dot{X} \cong \Sigma(C(\dot{X})) \cong \Sigma(\dot{C}\_0(X)) \cong \dot{\Sigma}(C\_0(X)) = \Sigma(C\_0(X)) \cup \{\omega\_{\infty}\}. \tag{C.80}$$

Keeping track of the isomorphisms in question, it is easily verified that *X* and ∞ are mapped to Σ(*C*0(*X*)) and ω∞, respectively, and this proves (C.78).

Naturality follows from the unital case (Theorem C.23) and the following lemma:

Lemma C.46. *1. Let* α : *A* → *B be a nondegenerate homomorphism between commutative C\*-algebras. Then the following diagram commutes:*

$$
\begin{array}{ccc}
\dot{\Sigma}(B) & \xrightarrow{s\_B} & \Sigma(\dot{B}) \\
\Big\downarrow{\scriptstyle (\alpha^\*)^{\cdot}} & & \Big\downarrow{\scriptstyle (\dot{\alpha})^\*} \\
\dot{\Sigma}(A) & \xrightarrow{s\_A} & \Sigma(\dot{A}),
\end{array}
$$

*where s<sub>A</sub> and s<sub>B</sub> are defined in Lemma C.39, α̇ is defined in* (C.66)*, and (α<sup>∗</sup>)<sup>·</sup> ≡ ϕ̇ for ϕ = α<sup>∗</sup> : Σ(B) → Σ(A), where the dot is defined as in Lemma C.41.4.*

*2. Let* ϕ : *X* → *Y be a proper continuous map between locally compact Hausdorff spaces. Then the following diagram commutes:*

$$
\begin{array}{ccc}
\dot{C}\_{0}(Y) & \xrightarrow{c\_{Y}} & C(\dot{Y}) \\
\Big\downarrow{\scriptstyle (\phi^\*)^{\cdot}} & & \Big\downarrow{\scriptstyle (\dot{\phi})^\*} \\
\dot{C}\_{0}(X) & \xrightarrow{c\_{X}} & C(\dot{X}),
\end{array}
$$

*where c<sub>X</sub> and c<sub>Y</sub> are defined in the proof of Lemma C.38, (ϕ<sup>∗</sup>)<sup>·</sup> ≡ α̇ for α = ϕ<sup>∗</sup> : C<sub>0</sub>(Y) → C<sub>0</sub>(X) defined by* (C.66)*, and ϕ̇ : Ẋ → Ẏ is defined in Lemma C.41.4.*

The proof is a diagram chase, but let us note that in clause 1 the role of nondegeneracy is to ensure that α∗ (and hence (α∗)· ) is *defined* in the first place (cf. Lemma C.44). Similarly, in clause 2, the properness assumption on ϕ ensures that ϕ∗ (and hence (ϕ∗)· ) is *defined*. Once defined, commutativity of these diagrams is obvious.

Finally, the property that LCHp is indeed a category is trivial (as the identity maps id : *X* → *X* are proper), but the corresponding fact for CCAn is not, for we need to show that the identity arrows id : *A* → *A* are nondegenerate. This comes down to the property that *A*<sup>2</sup> = *A* · *A* is dense in *A*. In fact, the situation is even better:

Lemma C.47. *In any C\*-algebra A one has A<sup>2</sup> = A (and hence A<sup>n</sup> = A, n ∈ ℕ).*

*Proof.* We prove that any self-adjoint *a* ∈ *A* takes the form

$$a = a\_1 a\_2,\tag{C.81}$$

for suitable *a*<sub>1</sub>, *a*<sub>2</sub> ∈ *A*. Since the linear span of such *a* is *A*, this proves the lemma.

We assume *A* has no unit, for otherwise the claim is trivial. We then embed *A* ⊂ *Ȧ* and, for *a*<sup>∗</sup> = *a* ∈ *A*, consider *C*<sup>∗</sup>(*a*) ⊂ *Ȧ*. We factor the identity function *t* ↦ *t* on σ(*a*) ⊂ ℝ as *t* = *f*<sub>1</sub>(*t*)*f*<sub>2</sub>(*t*) for some *f<sub>i</sub>* ∈ *C*(σ(*a*)), so that by Corollary C.26, we have (C.81) for *a<sub>i</sub>* = *f<sub>i</sub>*(*a*) ∈ *C*<sup>∗</sup>(*a*). By the properties of the map *f* ↦ *f*(*a*) mentioned in Corollary C.26, including the fact that it sends 1<sub>σ(*a*)</sub> to 1<sub>*Ȧ*</sub>, it follows that if *f*(*a*) = *b* + μ · 1<sub>*Ȧ*</sub> for some *b* ∈ *A* and μ ∈ ℂ, then *f*(0) = μ; note that 0 ∈ σ(*a*) by Proposition C.33. Consequently, imposing the additional condition *f<sub>i</sub>*(0) = 0 enforces *a<sub>i</sub>* ∈ *A*. □
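The proof only needs the existence of a factorization *t* = *f*<sub>1</sub>(*t*)*f*<sub>2</sub>(*t*) with *f<sub>i</sub>*(0) = 0. One admissible choice, made here purely for illustration, is *f*<sub>1</sub>(*t*) = sign(*t*)√|*t*| and *f*<sub>2</sub>(*t*) = √|*t*|. The Python sketch below checks it on a grid and applies it in the commutative model *A* = *C*<sub>0</sub>(ℝ), where *f<sub>i</sub>*(*a*) is assumed to be the composition *f<sub>i</sub>* ◦ *a*; the sample element `a` is hypothetical.

```python
import math


def f1(t):
    """sign(t) * sqrt(|t|): continuous on R with f1(0) = 0."""
    return math.copysign(math.sqrt(abs(t)), t)


def f2(t):
    """sqrt(|t|): continuous on R with f2(0) = 0."""
    return math.sqrt(abs(t))


def a(x):
    """A real-valued (hence selfadjoint) sample element of C_0(R)."""
    return x * math.exp(-x * x)


def a1(x):
    return f1(a(x))


def a2(x):
    return f2(a(x))


grid = [k / 100.0 for k in range(-500, 501)]
factor_error = max(abs(f1(t) * f2(t) - t) for t in grid)        # t = f1(t) f2(t)
product_error = max(abs(a1(x) * a2(x) - a(x)) for x in grid)    # a = a1 a2, cf. (C.81)
```

Both factors `a1` and `a2` again vanish at infinity, which is the commutative shadow of the condition *f<sub>i</sub>*(0) = 0 forcing *a<sub>i</sub>* ∈ *A*.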

Corollary C.48. *Each nondegenerate homomorphism* α : *C*0(*Y*) → *C*0(*X*) *is induced by a proper continuous map* ϕ : *X* → *Y via* α = ϕ∗*.*

*Proof.* Given (C.78), the proof is the same as for the compact case, cf. Corollary C.22. In particular, ϕ is given by (C.37), which map is proper because α<sup>∗</sup> is proper by Lemma C.44 and ev<sub>*X*</sub> and ev<sub>*Y*</sub> are homeomorphisms. □

#### C.7 Positivity in C\*-algebras

We now turn to the important notion of *positivity*. First, we give two examples:


These examples are not as dissimilar as they might appear at first sight: *a* ∈ *B*(*H*) is positive iff its Gelfand transform id<sub>σ(*a*)</sub> = *â* is positive as a function in *C*(σ(*a*)); cf. Theorem C.24. Hence we have a notion of positivity for certain concrete C\*-algebras, which we would like to generalize to arbitrary abstract C\*-algebras.

Definition C.49. *An element a of a C\*-algebra A is called* positive *when a = a<sup>∗</sup> and its spectrum is positive; i.e., σ(a) ⊂ ℝ<sup>+</sup>. We write a ≥ 0 when a is positive, and A<sup>+</sup> for the set of all positive elements in A.*

The basic structure of *A*<sup>+</sup> is captured by the following definition.

Definition C.50. *A* convex cone *in a real vector space V is a subset V<sup>+</sup> such that:*

*1. If v ∈ V<sup>+</sup> and t ∈ ℝ<sup>+</sup>, then tv ∈ V<sup>+</sup>.*

*2. If v, w ∈ V<sup>+</sup>, then v + w ∈ V<sup>+</sup>.*

*3. V<sup>+</sup> ∩ −V<sup>+</sup> = {0}.*

*A* linear partial ordering *in V is a partial ordering ≤ in which v ≤ w implies tv ≤ tw for all t ∈ ℝ<sup>+</sup>, as well as v + u ≤ w + u for all u ∈ V.*

These structures are equivalent: A convex cone *V*<sup>+</sup> ⊂ *V* defines a linear partial ordering ≤ by *v* ≤ *w* if *w* − *v* ∈ *V*<sup>+</sup>, and conversely, ≤ yields *V*<sup>+</sup> = {*v* ∈ *V* | 0 ≤ *v*}.

Proposition C.51. *The set A<sup>+</sup> of all positive elements of a C\*-algebra A is a convex cone in the real vector space A*<sub>sa</sub>*, see* (C.6)*.*

*Proof.* Let *a* ∈ *A*<sup>+</sup>. Property 1 follows from σ(*ta*) = *t*σ(*a*), which is a special case of (B.270). Since σ(*a*) ⊆ [0, *r*(*a*)], we have |*c* − λ| ≤ *c* for all λ ∈ σ(*a*) and all *c* ≥ *r*(*a*). Hence sup<sub>λ∈σ(*a*)</sub> |*c* · 1<sub>σ(*a*)</sub> − *â*(λ)| ≤ *c* by (C.22) and Theorem C.24, i.e., ‖*c* · 1<sub>σ(*a*)</sub> − *â*‖<sub>∞</sub> ≤ *c*. Gelfand transforming back to *C*<sup>∗</sup>(*a*), by (C.18) this implies

$$\|c \cdot 1\_A - a\| \le c,\tag{C.82}$$

for all *c* ≥ ‖*a*‖. Inverting this, one sees that if (C.82) holds for some *c* ≥ ‖*a*‖, then σ(*a*) ⊂ ℝ<sup>+</sup>. Use this with *a* ⇝ *a* + *b* and *c* = ‖*a*‖ + ‖*b*‖, so *c* ≥ ‖*a* + *b*‖. Then

$$\|c \cdot 1\_A - (a+b)\| \le \|(\|a\| - a)\| + \|(\|b\| - b)\| \le c,$$

where in the last step we used the previous paragraph for *a* ∈ *A*<sup>+</sup> and *b* ∈ *A*<sup>+</sup> separately. As for *a*, this inequality implies *a* + *b* ∈ *A*<sup>+</sup>. Finally, when *a* ∈ *A*<sup>+</sup> and *a* ∈ −*A*<sup>+</sup> it must be that σ(*a*) = {0}, hence *a* = 0 by (B.257) and Definition A.1. □

For example, when *a* = *a*<sup>∗</sup> one checks the validity of the important inequalities

$$-||a|| \cdot 1\_A \le a \le ||a|| \cdot 1\_A,\tag{C.83}$$

by taking the Gelfand transform of *C*∗(*a*). This also yields the implication

$$-b \le a \le b \implies \|a\| \le \|b\|,\tag{C.84}$$

because the antecedent and (C.83) with *a* ⇝ *b* yield −‖*b*‖ · 1<sub>*A*</sub> ≤ *a* ≤ ‖*b*‖ · 1<sub>*A*</sub>, so that σ(*a*) ⊆ [−‖*b*‖, ‖*b*‖], hence ‖*a*‖ ≤ ‖*b*‖ by (B.257) and (B.254).
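The norm criterion (C.82) for positivity can be tried out on real symmetric 2×2 matrices, where spectra are available in closed form. This is a sketch for illustration only: the restriction to 2×2 matrices, the helper names, and the sample matrices are assumptions of this example.

```python
import math


def eigenvalues(m):
    """(smallest, largest) eigenvalue of a real symmetric 2x2 matrix ((p, q), (q, r))."""
    (p, q), (_, r) = m
    mean, radius = (p + r) / 2.0, math.hypot((p - r) / 2.0, q)
    return mean - radius, mean + radius


def norm(m):
    """Operator norm of a symmetric matrix: the largest |eigenvalue|."""
    low, high = eigenvalues(m)
    return max(abs(low), abs(high))


def add(m, k):
    return tuple(tuple(x + y for x, y in zip(rm, rk)) for rm, rk in zip(m, k))


def scalar_minus(c, m):
    """Return c * 1 - m."""
    (p, q), (s, r) = m
    return ((c - p, -q), (-s, c - r))


def is_positive(m, tol=1e-12):
    """Definition C.49 for symmetric matrices: spectrum contained in [0, infinity)."""
    return eigenvalues(m)[0] >= -tol


def satisfies_c82(m, c, tol=1e-12):
    """The criterion ||c * 1 - m|| <= c from (C.82)."""
    return norm(scalar_minus(c, m)) <= c + tol


a = ((2.0, 1.0), (1.0, 1.0))      # eigenvalues (3 -+ sqrt(5)) / 2, both positive
p = ((1.0, -1.0), (-1.0, 1.0))    # eigenvalues 0 and 2
b = ((1.0, 2.0), (2.0, 1.0))      # eigenvalues -1 and 3: not positive
```

With *c* equal to the norm, the criterion holds for `a` and `p` and fails for `b`; for the sum `add(a, p)` it holds with *c* = ‖*a*‖ + ‖*p*‖, which is the step used to show that *A*<sup>+</sup> is closed under addition.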

We now come to the central result in the theory of positivity in C\*-algebras, which generalizes the cases *A* = *B*(*H*) and *A* = *C*0(*X*) discussed at the beginning.

Theorem C.52. *With A<sup>+</sup> = {a ∈ A | a ≥ 0} as in Definition C.49, one has*

$$A^{+} = \{a^{2} \mid a^{\*} = a\} \tag{C.85}$$

$$\phantom{A^{+}} = \{a^{\*}a \mid a \in A\}. \tag{C.86}$$

*Proof.* If σ(*a*) ⊂ ℝ<sup>+</sup> and *a* = *a*<sup>∗</sup>, then √*a* ∈ *A* is defined by Corollary C.26 for *f* = √·, and satisfies (√*a*)<sup>2</sup> = *a*. Hence *A*<sup>+</sup> ⊆ {*a*<sup>2</sup> | *a*<sup>∗</sup> = *a*}. The opposite inclusion follows from (C.53) and Corollary C.27. This proves (C.85).

Towards (C.86), the inclusion *A*<sup>+</sup> ⊆ {*a*<sup>∗</sup>*a* | *a* ∈ *A*} is trivial from (C.85).

Lemma C.53. *Each selfadjoint element a has a unique decomposition*

$$a = a\_+ - a\_-,\tag{C.87}$$

*where a<sub>+</sub>, a<sub>−</sub> ∈ A<sup>+</sup> and a<sub>+</sub>a<sub>−</sub> = 0. Moreover, ‖a<sub>±</sub>‖ ≤ ‖a‖ = max{‖a<sub>+</sub>‖, ‖a<sub>−</sub>‖}.*

*Proof.* Apply Corollary C.26 with *f* = id<sub>σ(*a*)</sub> = *f*<sub>+</sub> − *f*<sub>−</sub>, where id<sub>σ(*a*)</sub>(*t*) = *t* and *f*<sub>±</sub>(*t*) = max{±*t*, 0}. The norm property follows from (C.52). Uniqueness follows from the corresponding property in *C*(σ(*a*)), where it is obvious. □
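For a real symmetric 2×2 matrix the decomposition (C.87) can be computed explicitly. The sketch below is an illustration under assumptions that are not in the text: it uses the identity *f*<sub>±</sub>(*t*) = (|*t*| ± *t*)/2, so that *a*<sub>±</sub> = (|*a*| ± *a*)/2 with |*a*| = √(*a*<sup>2</sup>), together with the closed form for the square root of a positive 2×2 matrix (which requires *a* ≠ 0); all helper names are hypothetical.

```python
import math


def mul(m, k):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(m[i][l] * k[l][j] for l in range(2)) for j in range(2))
                 for i in range(2))


def combine(s, m, t, k):
    """Return s * m + t * k."""
    return tuple(tuple(s * m[i][j] + t * k[i][j] for j in range(2)) for i in range(2))


def eigenvalues(m):
    """(smallest, largest) eigenvalue of a real symmetric 2x2 matrix."""
    mean = (m[0][0] + m[1][1]) / 2.0
    radius = math.hypot((m[0][0] - m[1][1]) / 2.0, m[0][1])
    return mean - radius, mean + radius


def norm(m):
    return max(abs(e) for e in eigenvalues(m))


def absolute_value(m):
    """|m| = sqrt(m^2) for a nonzero symmetric 2x2 matrix, via the closed form
    sqrt(M) = (M + sqrt(det M) * 1) / sqrt(tr M + 2 sqrt(det M)) for positive M."""
    square = mul(m, m)
    root_det = abs(m[0][0] * m[1][1] - m[0][1] * m[1][0])    # sqrt(det(m^2)) = |det m|
    trace_root = math.sqrt(square[0][0] + square[1][1] + 2.0 * root_det)
    identity = ((1.0, 0.0), (0.0, 1.0))
    return combine(1.0 / trace_root, square, root_det / trace_root, identity)


a = ((1.0, 2.0), (2.0, 1.0))               # eigenvalues -1 and 3
abs_a = absolute_value(a)
a_plus = combine(0.5, abs_a, 0.5, a)       # f_+(a), with f_+(t) = (|t| + t) / 2
a_minus = combine(0.5, abs_a, -0.5, a)     # f_-(a), with f_-(t) = (|t| - t) / 2
cross = mul(a_plus, a_minus)               # should vanish
```

For this sample, `a_plus` and `a_minus` are positive with norms 3 and 1, their product `cross` is zero, and the larger of the two norms equals the norm of `a`.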

Apply the lemma to *a* = *b*∗*b* (noting that *a* is selfadjoint). Then

$$(a\_-)^3 = -a\_-(a\_+ - a\_-)a\_- = -a\_-aa\_- = -a\_-b^\*ba\_- = -(ba\_-)^\*ba\_-. \tag{C.88}$$

Since σ(*a*<sub>−</sub>) ⊂ ℝ<sup>+</sup> because *a*<sub>−</sub> is positive, we see from (C.53) with *f*(*t*) = *t*<sup>3</sup> that (*a*<sub>−</sub>)<sup>3</sup> ≥ 0. Hence −(*ba*<sub>−</sub>)<sup>∗</sup>*ba*<sub>−</sub> ≥ 0.

Lemma C.54. *If −c<sup>∗</sup>c ∈ A<sup>+</sup> for some c ∈ A then c = 0.*

*Proof.* We can write *c* = *d* +*ie*, *d* and *e* selfadjoint, so that

$$c^\*c = 2d^2 + 2e^2 - cc^\*.\tag{C.89}$$

Now for any *a*,*b* ∈ *A* one has

$$
\sigma(ab) \cup \{0\} = \sigma(ba) \cup \{0\}.\tag{C.90}
$$

This is because for *z* ≠ 0, invertibility of *ab* − *z* implies invertibility of *ba* − *z*; indeed,

$$(ba - z)^{-1} = z^{-1}(b(ab - z)^{-1}a - 1\_A). \tag{C.91}$$

Applying (C.90) with *a* ⇝ *c* and *b* ⇝ *c*<sup>∗</sup>, it follows that σ(*c*<sup>∗</sup>*c*) ⊂ ℝ<sup>−</sup> implies σ(*cc*<sup>∗</sup>) ⊂ ℝ<sup>−</sup>, hence σ(−*cc*<sup>∗</sup>) ⊂ ℝ<sup>+</sup>. By (C.85) and Proposition C.51 (applied to Definition C.50.2), eq. (C.89) then implies that *c*<sup>∗</sup>*c* ≥ 0, i.e., σ(*c*<sup>∗</sup>*c*) ⊂ ℝ<sup>+</sup>, so that the assumption −*c*<sup>∗</sup>*c* ∈ *A*<sup>+</sup> now yields σ(*c*<sup>∗</sup>*c*) = {0}. Hence *c* = 0 by Proposition C.51 applied to Definition C.50.3. □
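The resolvent formula (C.91) is purely algebraic and can be verified exactly on small matrices. The following sketch uses 2×2 matrices with rational entries; the sample matrices, the value *z* = 2, and the helper names are arbitrary choices made for this illustration.

```python
from fractions import Fraction

IDENTITY = ((Fraction(1), Fraction(0)), (Fraction(0), Fraction(1)))


def matrix(rows):
    return tuple(tuple(Fraction(x) for x in row) for row in rows)


def mul(m, k):
    return tuple(tuple(sum(m[i][l] * k[l][j] for l in range(2)) for j in range(2))
                 for i in range(2))


def combine(s, m, t, k):
    """Return s * m + t * k."""
    return tuple(tuple(s * m[i][j] + t * k[i][j] for j in range(2)) for i in range(2))


def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inverse(m):
    """Inverse of an invertible 2x2 matrix via the adjugate."""
    d = det(m)
    return ((m[1][1] / d, -m[0][1] / d), (-m[1][0] / d, m[0][0] / d))


a = matrix([[1, 2], [0, 0]])
b = matrix([[0, 1], [3, 1]])
z = Fraction(2)

ab, ba = mul(a, b), mul(b, a)
resolvent_ab = inverse(combine(1, ab, -z, IDENTITY))      # (ab - z)^{-1}
# Right-hand side of (C.91): z^{-1} (b (ab - z)^{-1} a - 1).
candidate = combine(1 / z, mul(mul(b, resolvent_ab), a), -1 / z, IDENTITY)
direct = inverse(combine(1, ba, -z, IDENTITY))            # (ba - z)^{-1}
```

Because the arithmetic is exact, `candidate == direct` holds as an equality of matrices and not merely up to rounding. For square matrices *ab* and *ba* also share trace and determinant, consistent with (C.90).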

By this lemma, the last claim preceding it implies *ba*<sub>−</sub> = 0. As

$$(a\_{-})^3 = -(ba\_{-})^\*ba\_{-} = 0,\tag{C.92}$$

we see that (*a*<sub>−</sub>)<sup>3</sup> = 0, and finally *a*<sub>−</sub> = 0 by Corollary C.26 with *f*(*t*) = *t*<sup>1/3</sup>. Hence *b*<sup>∗</sup>*b* = *a*<sub>+</sub> ∈ *A*<sup>+</sup>. Thus {*a*<sup>∗</sup>*a* | *a* ∈ *A*} ⊆ *A*<sup>+</sup>, which ends the proof of Theorem C.52. □

An important consequence of (C.86) is the fact that inequalities *a*<sub>1</sub> ≤ *a*<sub>2</sub> for selfadjoint *a*<sub>1</sub>, *a*<sub>2</sub> are stable under conjugation by arbitrary elements *b* ∈ *A*, so that *a*<sub>1</sub> ≤ *a*<sub>2</sub> implies *b*<sup>∗</sup>*a*<sub>1</sub>*b* ≤ *b*<sup>∗</sup>*a*<sub>2</sub>*b*. This is because *a*<sub>1</sub> ≤ *a*<sub>2</sub> is the same as *a*<sub>2</sub> − *a*<sub>1</sub> ≥ 0, and hence by (C.86) there is an *a*<sub>3</sub> ∈ *A* such that *a*<sub>2</sub> − *a*<sub>1</sub> = *a*<sub>3</sub><sup>∗</sup>*a*<sub>3</sub>. But (*a*<sub>3</sub>*b*)<sup>∗</sup>*a*<sub>3</sub>*b* ≥ 0, i.e., *b*<sup>∗</sup>*a*<sub>1</sub>*b* ≤ *b*<sup>∗</sup>*a*<sub>2</sub>*b*. For example, replace *a* in (C.83) by *a*<sup>∗</sup>*a*, and use (C.2), yielding *a*<sup>∗</sup>*a* ≤ ‖*a*‖<sup>2</sup>1<sub>*A*</sub>. Applying the above principle gives the operator inequality

$$b^\*a^\*ab \le ||a||^2b^\*b \ (a,b \in A). \tag{C.93}$$

We note that the definition of a state implies that if *a* ≤ *b*, then ω(*a*) ≤ ω(*b*), so that

$$\omega(b^\*a^\*ab) \le \|a\|^2\omega(b^\*b),\tag{C.94}$$

from (C.93). This is a key lemma for the GNS-construction (cf. Theorem C.88).
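The inequalities (C.93) and (C.94) can be tested on real 2×2 matrices, where the adjoint is the transpose. The sketch below is illustrative only: the matrices `a` and `b`, the unit vector defining the vector state, and all helper names are assumptions of this example.

```python
import math


def transpose(m):
    return tuple(tuple(m[j][i] for j in range(2)) for i in range(2))


def mul(m, k):
    return tuple(tuple(sum(m[i][l] * k[l][j] for l in range(2)) for j in range(2))
                 for i in range(2))


def scale(s, m):
    return tuple(tuple(s * x for x in row) for row in m)


def subtract(m, k):
    return tuple(tuple(x - y for x, y in zip(rm, rk)) for rm, rk in zip(m, k))


def eigenvalues(m):
    """(smallest, largest) eigenvalue of a real symmetric 2x2 matrix."""
    mean = (m[0][0] + m[1][1]) / 2.0
    radius = math.hypot((m[0][0] - m[1][1]) / 2.0, m[0][1])
    return mean - radius, mean + radius


def vector_state(v):
    """omega(m) = <v, m v> for a unit vector v: a state on the 2x2 matrices."""
    return lambda m: sum(v[i] * m[i][j] * v[j] for i in range(2) for j in range(2))


a = ((1.0, 2.0), (3.0, 4.0))
b = ((0.0, 1.0), (-1.0, 2.0))

norm_a_squared = eigenvalues(mul(transpose(a), a))[1]    # ||a||^2 = ||a^* a||, by (C.2)
ab = mul(a, b)
left = mul(transpose(ab), ab)                             # b^* a^* a b
right = scale(norm_a_squared, mul(transpose(b), b))       # ||a||^2 b^* b
gap = subtract(right, left)                               # positive, by (C.93)

omega = vector_state((0.6, 0.8))
```

The smallest eigenvalue of `gap` is nonnegative up to rounding (in this 2×2 example it is in fact zero), and applying `omega` to both sides gives an instance of (C.94).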

At last, we are also in a position to prove the fundamental Lemma C.4.

*Proof.* If ω is positive and *a*<sup>∗</sup> = *a*, then (C.83) in the form ‖*a*‖ · 1<sub>*A*</sub> ± *a* ≥ 0 gives ω(*a*) ≤ ‖*a*‖ω(1<sub>*A*</sub>), and hence ω(*a*) ∈ ℝ. For general *a* ∈ *A*, eq. (C.8) then implies that ω(*a*<sup>∗</sup>) is the complex conjugate of ω(*a*) (which may alternatively be proved from Lemma C.53). This, in turn, makes the form (C.24) hermitian. Cauchy–Schwarz then gives |ω(*a*)|<sup>2</sup> ≤ ω(*a*<sup>∗</sup>*a*)ω(1<sub>*A*</sub>), as in (C.25). Furthermore, if ‖*a*‖ ≤ 1 then also ‖*a*<sup>∗</sup>*a*‖ ≤ 1 by (C.2), so that (C.83) gives ω(*a*<sup>∗</sup>*a*) ≤ ω(1<sub>*A*</sub>). Combining these inequalities yields |ω(*a*)| ≤ ω(1<sub>*A*</sub>), so ω is bounded with ‖ω‖ ≤ ω(1<sub>*A*</sub>); taking *a* = 1<sub>*A*</sub> gives equality.

Conversely, assume that ‖ω‖ = ω(1<sub>*A*</sub>) = 1. In proving that ω(*a*) ≥ 0 whenever *a* ≥ 0, we may also assume that 0 ≤ *a* ≤ 1<sub>*A*</sub>. Then (C.7) shows that α ≡ ω(*a*) ∈ ℝ. Also, we have σ(*a*) ⊆ [0,1] and hence σ(1<sub>*A*</sub> − *a*) ⊆ [0,1], which in turn implies 0 ≤ (1<sub>*A*</sub> − *a*) ≤ 1<sub>*A*</sub>, and hence ‖1<sub>*A*</sub> − *a*‖ ≤ 1, cf. (C.84). Then

$$1 - \alpha \le |1 - \alpha| = |\omega(1\_A - a)| \le \|\omega\| \, \|1\_A - a\| \le 1,\tag{C.95}$$

whence α ≥ 0, and hence ω(*a*) ≥ 0. -

$$\square$$

#### C.8 Ideals in Banach algebras

This section returns to general Banach algebras. It has two aims: it completes the (first) proof of Theorem C.8, and it prepares for the theory of ideals in C\*-algebras.

Definition C.55. *Let A be a Banach algebra.*


Thus *an ideal is closed by definition*. However, it is useful to know that if we omit the word 'closed' throughout Definition C.55, a *maximal* ideal *J* ⊂ *A* (defined in the purely algebraic sense) is automatically closed. Indeed, note that the closure $\overline{J}$ of *J* cannot be *A*, since *J* does not contain any invertible element of *A* (otherwise it would coincide with *A*), and the set *A*<sup>∗</sup> of all invertible elements in *A* is open (see the proof of Theorem B.84). Since *J* ⊆ $\overline{J}$ ⊂ *A* and *J* is maximal, $\overline{J}$ = *J*.

Furthermore, one often uses the fact that an ideal *J* that contains an invertible element *a* must coincide with *A* (since *a*−1*a* = 1*<sup>A</sup>* must then lie in *J*, whence *J* = *A*).

In the commutative case, left and right ideals are the same as ideals. For example, if *A* =*C*(*X*) for a compact space *X*, then each closed subspace *Y* ⊂ *X* defines an ideal

$$C(X;Y) = \{ f \in C(X) \mid f(x) = 0 \ \forall x \in Y \}. \tag{C.96}$$

Note that *C*(*X*;*Y*) is indeed closed by definition of the sup-norm, and that

$$C(X;Y) \cong C\_0(X \backslash Y). \tag{C.97}$$

Proposition C.83 in §C.11 shows that all ideals in *C*(*X*) are of this form. It is not necessary to assume that *Y* is closed, but this assumption entails no loss of generality, since *C*(*X*;*Y*) = *C*(*X*;$\overline{Y}$), where $\overline{Y}$ is the closure of *Y*. We will see that *C*(*X*;*Y*) is maximal iff *Y* is a point, and that all *maximal* ideals in *C*(*X*) are of this form.

The next proposition is predicated on an elementary Banach space result:

Lemma C.56. *If V is a Banach space and W is a* closed *linear subspace of V , then the vector space quotient V*/*W is a Banach space in the "distance to W " norm*

$$\|\tau(v)\| = \inf\_{w\in W} \|v - w\|,\tag{C.98}$$

*where* τ : *V* → *V*/*W is the canonical projection. Also,* ‖τ(*v*)‖ ≤ ‖*v*‖ *for any v* ∈ *V.*

*Proof.* First, (C.98) is well defined, for if τ(*v*′) = τ(*v*), i.e., *v*′ − *v* = *w*′ ∈ *W*, then

$$\begin{aligned} \|\tau(v')\| &= \inf\{ \|v' - w\|, w \in W \} = \inf\{ \|v' - w - w'\|, w \in W \} \\ &= \inf\{ \|v - w\|, w \in W \} = \|\tau(v)\|. \end{aligned}$$

The axioms for a norm are easily verified, except positive definiteness: we have ‖τ(*v*)‖ = 0 iff inf{‖*v* − *w*‖, *w* ∈ *W*} = 0; hence there must be a sequence (*w*<sub>*n*</sub>) in *W* with ‖*v* − *w*<sub>*n*</sub>‖ → 0, or *w*<sub>*n*</sub> → *v*. Since *W* is closed, *v* ∈ *W*, so that τ(*v*) = 0. For the last claim, eq. (C.98) yields ‖τ(*v*)‖ ≤ ‖*v* − *w*‖ for all *w* ∈ *W*; take *w* = 0.

There seems to be no natural proof of the completeness of *V*/*W*, but here is a trick: for any Cauchy sequence (τ(*v*<sub>*n*</sub>))<sub>*n*</sub> in *V*/*W*, find a subsequence (τ(*v*<sub>*n*<sub>*k*</sub></sub>))<sub>*k*</sub> with ‖τ(*v*<sub>*n*<sub>*k*+1</sub></sub>) − τ(*v*<sub>*n*<sub>*k*</sub></sub>)‖ < 2<sup>−*k*</sup> for all *k*. Using induction in *k*, one finds a sequence (*u*<sub>*k*</sub>) in *V* with τ(*u*<sub>*k*</sub>) = τ(*v*<sub>*n*<sub>*k*</sub></sub>) and ‖*u*<sub>*k*+1</sub> − *u*<sub>*k*</sub>‖ < 2<sup>−*k*</sup>. Hence *u*<sub>*k*</sub> → *u* (since *V* is complete), and hence τ(*v*<sub>*n*<sub>*k*</sub></sub>) → τ(*u*) by continuity of τ. Since a Cauchy sequence with a convergent subsequence converges to the same limit, also τ(*v*<sub>*n*</sub>) → τ(*u*). □
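The induction in the completeness argument can be made explicit. The following is one possible way to carry it out; the auxiliary vectors *w*<sub>*k*</sub> ∈ *W* are names introduced here for illustration, and the starting point is taken to be *u*<sub>1</sub> = *v*<sub>*n*<sub>1</sub></sub>.

```latex
% Inductive step. Suppose u_k \in V with \tau(u_k) = \tau(v_{n_k}) has been chosen. Since
\[
\|\tau(v_{n_{k+1}} - u_k)\| = \|\tau(v_{n_{k+1}}) - \tau(v_{n_k})\| < 2^{-k},
\]
% the infimum in (C.98) provides some w_k \in W with
\[
\|v_{n_{k+1}} - u_k - w_k\| < 2^{-k}.
\]
% Put u_{k+1} := v_{n_{k+1}} - w_k. Then \tau(u_{k+1}) = \tau(v_{n_{k+1}}) and
% \|u_{k+1} - u_k\| < 2^{-k}. For l > k, a telescoping sum gives
\[
\|u_l - u_k\| \le \sum_{m=k}^{l-1} \|u_{m+1} - u_m\| < \sum_{m=k}^{\infty} 2^{-m} = 2^{1-k},
\]
% so (u_k) is a Cauchy sequence in V.
```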

Proposition C.57. *If J is an ideal in a Banach algebra A, then the quotient A*/*J is a Banach algebra with multiplication*

$$
\tau(a)\tau(b) = \tau(ab). \tag{C.99}
$$

*If A is unital and J is proper, A*/*J is unital, with unit* τ(1<sub>*A*</sub>) *satisfying*

$$\|\tau(1\_A)\| = 1.\tag{C.100}$$

*Proof.* As far as the Banach algebra structure is concerned, first note that (C.99) is well defined: when *j*<sub>1</sub>, *j*<sub>2</sub> ∈ *J* one has

$$
\tau(a+j\_1)\tau(b+j\_2) = \tau(ab+aj\_2+j\_1b+j\_1j\_2) = \tau(ab) = \tau(a)\tau(b), \tag{C.101}
$$

since *aj*<sub>2</sub> + *j*<sub>1</sub>*b* + *j*<sub>1</sub>*j*<sub>2</sub> ∈ *J* by definition of an ideal, and τ(*j*) = 0 for all *j* ∈ *J*.

To prove (C.1), observe that, by definition of the infimum, for given *a* ∈ *A*, for each ε > 0 there exists a *j* ∈ *J* such that

$$\|\tau(a)\| + \varepsilon \ge \|a + j\|. \tag{C.102}$$

For if such a *j* did not exist, then ‖τ(*a*)‖ ≤ ‖*a* + *j*‖ − ε for all *j* ∈ *J*, violating (C.98). On the other hand, for any *j* ∈ *J*, it is clear from (C.98) that

$$\|\tau(a)\| = \|\tau(a+j)\| \le \|a+j\|. \tag{C.103}$$

For *a*,*b* ∈ *A*, choose ε > 0 and *j*<sub>1</sub>, *j*<sub>2</sub> ∈ *J* such that (C.102) holds for *a*,*b*, and estimate

$$\begin{aligned} \|\tau(a)\tau(b)\| &= \|\tau(a+j\_1)\tau(b+j\_2)\| = \|\tau((a+j\_1)(b+j\_2))\| \\ &\le \|(a+j\_1)(b+j\_2)\| \le \|a+j\_1\| \, \|b+j\_2\| \\ &\le (\|\tau(a)\| + \varepsilon)(\|\tau(b)\| + \varepsilon). \end{aligned} \tag{C.104}$$

Letting ε → 0 yields

$$\|\tau(a)\tau(b)\| \le \|\tau(a)\| \, \|\tau(b)\|.\tag{C.105}$$

If *A* has a unit, τ(1<sub>*A*</sub>) is a unit in *A*/*J*, cf. (C.99). By (C.103) with *a* = 1<sub>*A*</sub> and *j* = 0 one has ‖τ(1<sub>*A*</sub>)‖ ≤ ‖1<sub>*A*</sub>‖ = 1. On the other hand, from (C.105) and (C.99) with *b* = 1<sub>*A*</sub> and *a* ∈ *A*\*J*, one derives ‖τ(1<sub>*A*</sub>)‖ ≥ 1. Hence (C.100) follows. □

In a C\*-algebra the last step is unnecessary, since a unit necessarily has norm one.

In the commutative case, a nice example (with *X* and *Y* compact, as above), is

$$C(X)/C(X;Y) \cong C(Y),\tag{C.106}$$

as two elements *f*,*g* of *C*(*X*) are identified in *C*(*X*)/*C*(*X*;*Y*) when *f* − *g* ∈ *C*(*X*;*Y*), i.e., when they coincide on *Y*. If one looks at *C*(*X*;*Y*) as the kernel of the restriction map *r*<sub>*Y*</sub> : *C*(*X*) → *C*(*Y*), then ran(*r*<sub>*Y*</sub>) ≅ *C*(*X*)/ker(*r*<sub>*Y*</sub>), which is just (C.106).
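A concrete instance may help to see what the quotient norm (C.98) does in (C.106). The choice *X* = [0,1] and *Y* = {0,1} below is ours, made purely for illustration, and the auxiliary function *h* is likewise introduced only for this sketch.

```latex
% Illustration: X = [0,1], Y = \{0,1\}.
\[
C([0,1];\{0,1\}) = \{ g \in C([0,1]) \mid g(0) = g(1) = 0 \}, \qquad
C([0,1]) / C([0,1];\{0,1\}) \cong C(\{0,1\}) \cong \mathbb{C}^2 ,
\]
% the isomorphism being \tau(f) \mapsto (f(0), f(1)).
% Quotient norm: put m = \max\{|f(0)|, |f(1)|\}. For every g vanishing on Y,
\[
\|f - g\|_\infty \ge \max\{|f(0) - g(0)|, |f(1) - g(1)|\} = m .
\]
% Conversely, if m > 0, let h(x) = f(x) \min\{1, m/|f(x)|\} where f(x) \neq 0,
% and h(x) = 0 where f(x) = 0. Then h is continuous, h = f on Y, and
% \|h\|_\infty \le m, so g = f - h lies in the ideal and \|f - g\|_\infty \le m. Hence
\[
\|\tau(f)\| = \inf_{g} \|f - g\|_\infty = \max\{|f(0)|, |f(1)|\},
\]
% which is the sup-norm of the restriction f|_Y. (If m = 0, then f itself
% lies in the ideal and both sides vanish.)
```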

We now prove Proposition C.13, which we unfold as:


For the first claim, *J*<sub>ω</sub> is an ideal since ω is multiplicative. To prove maximality, suppose *J*<sub>ω</sub> ⊆ *I* ⊂ *A* for some ideal *I*. Then ω(*I*) is an ideal in ℂ, so either ω(*I*) = {0} or ω(*I*) = ℂ. In the former case, *I* = *J*<sub>ω</sub> (since *I* ⊆ ker ω = *J*<sub>ω</sub>), in the latter, *I* = *A* (because for any *a* ∈ *A* there is *b* ∈ *I* such that ω(*a*) = ω(*b*), whence *a* − *b* ∈ ker ω and hence *a* − *b* ∈ *I*, or *a* ∈ *b* + *I* = *I*). Thus *J*<sub>ω</sub> is maximal.

For the second, if ω<sub>1</sub>(*a*) = *c*, then ω<sub>1</sub>(*a* − *c* · 1<sub>*A*</sub>) = 0 by (C.14), so if ker(ω<sub>1</sub>) = ker(ω<sub>2</sub>), then also ω<sub>2</sub>(*a* − *c* · 1<sub>*A*</sub>) = 0 and hence ω<sub>2</sub>(*a*) = *c* = ω<sub>1</sub>(*a*).

Finally, let *J* be maximal. Since *J* ≠ *A*, there is a nonzero *b* ∈ *A*, *b* ∉ *J*. Form

$$J\_b = \{ ba + j \mid a \in A, j \in J \}. \tag{C.107}$$

Since *A* is commutative, *J*<sub>*b*</sub> is an ideal. Taking *a* = 0 gives *J* ⊆ *J*<sub>*b*</sub>. Taking *a* = 1<sub>*A*</sub> and *j* = 0 gives *b* ∈ *J*<sub>*b*</sub>, so that *J*<sub>*b*</sub> ≠ *J*. Hence *J*<sub>*b*</sub> = *A*, as *J* is maximal. In particular, 1<sub>*A*</sub> ∈ *J*<sub>*b*</sub>, so that 1<sub>*A*</sub> = *ba* + *j* for some *a* ∈ *A*, *j* ∈ *J*. Applying τ : *A* → *A*/*J* gives

$$
\tau(1\_A) = 1\_{A/J} = \tau(ba) = \tau(b)\tau(a), \tag{C.108}
$$

because of (C.99) and τ(*J*) = 0. Hence τ(*a*) = τ(*b*)<sup>−1</sup> in *A*/*J*. Since *b* ∉ *J* was arbitrary, i.e., τ(*b*) ≠ 0 was arbitrary, this shows that every nonzero element of *A*/*J* is invertible. At this point it is therefore appropriate to invoke the *Gelfand–Mazur Theorem*:

Theorem C.58. *If every nonzero element of a unital commutative Banach algebra B is invertible (i.e., if B is simple), then B* ≅ ℂ *as Banach algebras.*

*Proof.* Since σ(*b*) ≠ ∅, for each *b* ≠ 0 there is λ ∈ ℂ for which *b* − λ · 1<sub>*B*</sub> is not invertible. Hence *b* − λ · 1<sub>*B*</sub> = 0 by assumption, and *b* ↦ λ is an isomorphism. □

Hence there is an isomorphism ψ : *A*/*J* → C, from which we define ω : *A* → C by ω(*a*) = ψ(τ(*a*)). This map is clearly linear (since τ and ψ are), and nonzero (because ω(1*A*) = 1). Also, ω(*a*)ω(*b*) = ω(*ab*) by (C.99) and the fact that ψ is a homomorphism, so ω ∈ Σ(*A*). Finally, since ker(τ) = *J* and ψ is an isomorphism, *J* = ker(ω). This proves claim 3 above, and therefore Proposition C.13 also follows.

#### C.9 Ideals in C\*-algebras

Definition C.55 *verbatim* applies to C\*-algebras. One would expect that an ideal in a C\*-algebra is required to be selfadjoint by definition, but this is unnecessary:

Proposition C.59. *Let J be an ideal in a C\*-algebra A. If a* ∈ *J then a*<sup>∗</sup> ∈ *J; in other words, every ideal in a C\*-algebra is automatically selfadjoint.*

The proof (which generalizes a similar argument for compact operators, given at the end of §B.18) relies on the theory of approximate units (see §C.5).

*Proof.* Let *J* ⊂ *A* be the given ideal, and put *J*<sup>∗</sup> = {*a*<sup>∗</sup> | *a* ∈ *J*}. Note that *j* ∈ *J* implies *j*<sup>∗</sup>*j* ∈ *J* ∩ *J*<sup>∗</sup>: it lies in *J* because *J* is an ideal, hence a left-ideal, and it lies in *J*<sup>∗</sup> because *J*<sup>∗</sup> is an ideal, hence a right-ideal. With the roles of left and right interchanged, the same argument gives *jj*<sup>∗</sup> ∈ *J* ∩ *J*<sup>∗</sup>. Since *J* is an ideal, *J* ∩ *J*<sup>∗</sup> is a C\*-subalgebra of *A*. Hence by C.36 it has an approximate unit {1<sub>λ</sub>}. Take *j* ∈ *J*. Using (C.2),

$$\begin{split} \|j^\* - j^\* 1\_{\lambda}\|^2 &= \|(j - 1\_{\lambda} j)(j^\* - j^\* 1\_{\lambda})\| \\ &\le \|jj^\* - jj^\* 1\_{\lambda}\| + \|1\_{\lambda}\| \, \|jj^\* - jj^\* 1\_{\lambda}\|, \end{split} \tag{C.109}$$

since 1<sub>λ</sub><sup>∗</sup> = 1<sub>λ</sub>. As we have seen, *j*<sup>∗</sup>*j* and *jj*<sup>∗</sup> lie in *J* ∩ *J*<sup>∗</sup>, so that, also using (C.69), both terms vanish for λ → ∞. Hence lim<sub>λ→∞</sub> ‖*j*<sup>∗</sup> − *j*<sup>∗</sup>1<sub>λ</sub>‖ = 0. But 1<sub>λ</sub> lies in *J* ∩ *J*<sup>∗</sup>, so certainly 1<sub>λ</sub> ∈ *J*, and since *J* is an ideal it must be that *j*<sup>∗</sup>1<sub>λ</sub> ∈ *J* for all λ. Hence *j*<sup>∗</sup> is a norm-limit of elements in *J*; since *J* is closed, it follows that *j*<sup>∗</sup> ∈ *J*. □

We now turn to a C\*-algebraic analogue of Proposition C.57, which is of sufficient importance to promote it to the status of a theorem:

Theorem C.60. *Let J be an ideal in a C\*-algebra A. Then A*/*J is a C\*-algebra with respect to the norm (C.98), the multiplication (C.99), and the involution*

$$
\tau(a)^\* = \tau(a^\*). \tag{C.110}
$$

The proof of this theorem uses approximate units, too. In view of Proposition C.57, all we need to prove to establish Theorem C.60 is the property (C.2). This uses:

Lemma C.61. *Let* {1<sup>λ</sup> } *be an approximate unit for J, and let a* ∈ *A. Then*

$$\|\tau(a)\| = \lim\_{\lambda \to \infty} \|a - a\mathbf{1}\_{\lambda}\|. \tag{C.111}$$

*Proof.* It is obvious from (C.98) that

$$\|a - a\mathbf{1}\_{\lambda}\| \ge \|\tau(a)\|.\tag{C.112}$$

For the opposite inequality, add a unit 1*<sup>A</sup>* to *A* if necessary, pick any *j* ∈ *J*, and write

$$\|a - a\mathbf{1}\_{\lambda}\| = \|(a + j)(1 - \mathbf{1}\_{\lambda}) + j(\mathbf{1}\_{\lambda} - \mathbf{1})\| \le \|a + j\| \|1 - \mathbf{1}\_{\lambda}\| + \|j\mathbf{1}\_{\lambda} - j\|. \tag{C.113}$$

Note that

$$\|1 - 1\_{\lambda}\| \le 1,\tag{C.114}$$

by Definition C.35 and the proof of Proposition C.51. The second term on the righthand side goes to zero for λ → ∞, since *j* ∈ *J*. Hence

$$\lim\_{\lambda \to \infty} \|a - a\mathbf{1}\_{\lambda}\| \le \|a + j\|. \tag{C.115}$$

For each ε > 0 we can choose *j* ∈ *J* so that (C.102) holds. For this specific *j*, we combine (C.112), (C.115), and (C.102) to find

$$\lim\_{\lambda \to \infty} \|a - a\mathbf{1}\_{\lambda}\| - \varepsilon \le \|\tau(a)\| \le \|a - a\mathbf{1}\_{\lambda}\|. \tag{C.116}$$

Letting ε → 0 proves (C.111). □

We now prove (C.2) in *A*/*J*. Successively using (C.111), (C.2) in *Ȧ*, (C.114), (C.111), (C.99), and (C.110), we find

$$\begin{split} \|\tau(a)\|^{2} = \lim\_{\lambda \to \infty} \|a - a\mathbf{1}\_{\lambda}\|^{2} &= \lim\_{\lambda \to \infty} \|(a - a\mathbf{1}\_{\lambda})^{\*}(a - a\mathbf{1}\_{\lambda})\| \\ &= \lim\_{\lambda \to \infty} \|(\mathbf{1}\_{A} - \mathbf{1}\_{\lambda}) a^{\*} a (\mathbf{1}\_{A} - \mathbf{1}\_{\lambda})\| \le \lim\_{\lambda \to \infty} \|\mathbf{1}\_{A} - \mathbf{1}\_{\lambda}\| \, \|a^{\*} a (\mathbf{1}\_{A} - \mathbf{1}\_{\lambda})\| \\ &\le \lim\_{\lambda \to \infty} \|a^{\*} a (\mathbf{1}\_{A} - \mathbf{1}\_{\lambda})\| = \|\tau(a^{\*} a)\| = \|\tau(a^{\*})\tau(a)\| \\ &= \|\tau(a)^{\*}\tau(a)\|. \end{split} \tag{C.117}$$

As in the proof of Proposition C.30, this implies (C.2), and hence Theorem C.60. □
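For convenience, here is a sketch of how an inequality of the type (C.117) yields the full C\*-identity. It is presumably the argument alluded to in the proof of Proposition C.30, which is not reproduced in this section, so it should be read as an assumption about that proof; the abbreviation *x* = τ(*a*) is ours.

```latex
% Write x = \tau(a). From (C.117) and submultiplicativity (C.105):
\[
\|x\|^2 \le \|x^* x\| \le \|x^*\|\,\|x\| \quad\Longrightarrow\quad \|x\| \le \|x^*\| .
\]
% Replacing a by a^* (so that x becomes x^*, by (C.110)) gives \|x^*\| \le \|x\|, hence
\[
\|x^*\| = \|x\| .
\]
% Substituting back:
\[
\|x\|^2 \le \|x^* x\| \le \|x^*\|\,\|x\| = \|x\|^2
\quad\Longrightarrow\quad \|x^* x\| = \|x\|^2 ,
\]
% i.e. the C*-identity in A/J, which is what (C.2) is taken to be here.
```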

We now state and prove the key result about morphisms.

Theorem C.62. *Let* α : *A* → *B be a nonzero homomorphism between C\*-algebras.*


*Proof.* If necessary, we first reduce the proof of the first claim to the case where *A* and *B* have units and α is unital: we do so by replacing *A* and *B* by *Ȧ* and *Ḃ*, respectively (even if *A* and/or *B* was already unital in the first place, but α was not), and replacing α by the homomorphism α̇ : *Ȧ* → *Ḃ* defined in (C.66). If we do so, it follows from Lemma C.34 that in the worst case the spectrum of *a* or α(*a*) is modified by adding 0, which does not change the spectral radius. Therefore, the move from α to α̇ makes no difference to the argument to follow, so we assume that 1<sub>*A*</sub> ∈ *A* and 1<sub>*B*</sub> ∈ *B*, and α(1<sub>*A*</sub>) = 1<sub>*B*</sub>. If *z* ∈ ρ(*a*), so that (*a* − *z*)<sup>−1</sup> exists in *A*, then α(*a* − *z*) is certainly invertible in *B*, for (C.4) implies that (α(*a* − *z*))<sup>−1</sup> = α((*a* − *z*)<sup>−1</sup>). Hence ρ(*a*) ⊆ ρ(α(*a*)), so that

$$
\sigma(\alpha(a)) \subseteq \sigma(a). \tag{C.118}
$$

Replacing *a* by *a*<sup>∗</sup>*a* this gives *r*(α(*a*<sup>∗</sup>*a*)) ≤ *r*(*a*<sup>∗</sup>*a*), and since α(*a*<sup>∗</sup>*a*) = α(*a*)<sup>∗</sup>α(*a*), eq. (C.55) yields ‖α(*a*)‖ ≤ ‖*a*‖, and hence ‖α‖ ≤ 1. This proves continuity of α.
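The chain of (in)equalities behind this norm estimate may be spelled out as follows. Since (C.55) is not restated in this section, the sketch assumes that it is the identity ‖*c*‖ = *r*(*c*) for selfadjoint *c*, relating the norm to the spectral radius.

```latex
% Assumption: (C.55) is used in the form \|c\| = r(c) for selfadjoint c.
\[
\|\alpha(a)\|^2 = \|\alpha(a)^*\alpha(a)\| = r\bigl(\alpha(a^*a)\bigr)
\le r(a^*a) = \|a^*a\| = \|a\|^2 ,
\]
% where the inequality holds because \sigma(\alpha(a^*a)) \subseteq \sigma(a^*a) by (C.118),
% and the outer equalities are the C*-identity (C.2).
```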

Recalling that ideals in C\*-algebras have to be closed by definition, this also implies the second claim of the theorem (whose algebraic content is trivial).

We now prove the third claim of the theorem (which trivially implies the fourth). Assume there is *b* ∈ *A* for which ‖α(*b*)‖ ≠ ‖*b*‖, so that σ(*a*) ≠ σ(α(*a*)) for *a* = *b*<sup>∗</sup>*b* by (C.55). Then (C.118) implies the strict inclusion σ(α(*a*)) ⊂ σ(*a*) (as a closed subset). By Urysohn's lemma, there is a nonzero function *f* ∈ *C*(σ(*a*)) that vanishes on σ(α(*a*)), so that *f*(α(*a*)) = 0 by Corollary C.26. By Lemma C.63 below, this implies α(*f*(*a*)) = 0. If α is injective, this contradicts the property *f*(*a*) ≠ 0, which follows from *f* ≠ 0 and (C.52). Thus α must be isometric.

Combining the second claim with Theorem C.60, we see that *A*/ker(α) is a C\*-algebra. By the theory of vector spaces, we have a vector space isomorphism

$$
\psi: A/\ker(\alpha) \to \alpha(A),
\tag{C.119}
$$

so that

$$
\psi \circ \tau = \alpha.\tag{C.120}
$$

Since α and τ are homomorphisms between C\*-algebras, so is ψ. Since ψ is injective, it is isometric, as we have just shown. Hence ψ(*A*/ker(α)) has closed range in *B*. But ψ(*A*/ker(α)) = α(*A*), so that α has closed range in *B*. Since α is a morphism, its image is a ∗-algebra in *B*, which by the preceding sentence is closed in the norm of *B*. Hence α(*A*), inheriting all operations in *B*, is a C\*-algebra.

Finally, we prove that for the projection τ : *A* → *A*/*J* in the case at hand we have

$$\|\tau\| = 1.\tag{C.121}$$

If *A* has a unit, this follows from Lemma C.56 with (C.100). If not, the argument is similar, using an approximate identity (1<sub>λ</sub>) for *A*: from (C.105) we obtain lim<sub>λ</sub> ‖τ(1<sub>λ</sub>)‖ ≥ 1, which with (C.69) gives sup<sub>λ</sub> ‖τ(1<sub>λ</sub>)‖ = 1. Since ‖τ‖ ≤ 1 from Lemma C.56, this yields (C.121).

Because ψ is an isometry, it then follows from (C.120) that ‖α‖ = 1. □

Here we used a nice property of the continuous functional calculus (Theorem C.25):

Lemma C.63. *If* α : *A* → *B is a morphism, and a* = *a*∗*, then*

$$f(\alpha(a)) = \alpha(f(a)) \quad (f \in C(\sigma(a))).\tag{C.122}$$

Here *f*(*a*) and *f*(α(*a*)) are defined through Theorem C.25, cf. (C.118).

*Proof.* The property is true for polynomials by (C.4), since for those functions, *f*(*a*) and *f*(α(*a*)) have their naive meaning. The general claim follows by continuity. □

Corollary C.64. *Every ideal in a C\*-algebra is the kernel of some homomorphism.*

*Proof.* This follows from Proposition C.59, since *J* is the kernel of τ : *A* → *A*/*J*, where *A*/*J* is a C\*-algebra and τ is a morphism by (C.99) and (C.110). □

#### C.10 Hilbert C\*-modules and multiplier algebras

In §C.5 we explained the *minimal* way of adding a unit to a C\*-algebra that did not have one to begin with (although the procedure even works if it does). There is also a *maximal* way, which embeds a non-unital C\*-algebra in its *multiplier algebra*. In our view, this maximal extension is actually more elegant and useful than the minimal one, although the commutative case might give the opposite impression: here (as we have seen), the minimal extension corresponds to the simple one-point compactification of the Gelfand spectrum, whereas the maximal one extends the latter to its awesome Čech–Stone compactification. In topology one may doubt if the latter is indeed the neater choice, but for many noncommutative C\*-algebras the multiplier algebra comes naturally. For example, the C\*-algebra *B*<sub>0</sub>(*H*) of compact operators on a Hilbert space *H* is thereby turned into the C\*-algebra *B*(*H*) of bounded ones.

There are various ways of defining multiplier algebras. Although not strictly necessary, we offer the powerful entrance provided by Hilbert C\*-modules, which are simultaneous generalizations of C\*-algebras, Hilbert spaces, and vector bundles.

#### Definition C.65. *A* pre-Hilbert C\*-module *over a C\*-algebra A consists of:*

• *A right A-module E, i.e., a complex linear space equipped with a bilinear map E* × *A* → *E, written* (ψ,*a*) ↦ ψ*a (where* ψ ∈ *E and a* ∈ *A) such that*

$$(\psi b)a = \psi (ba). \tag{C.123}$$

• *A map* ⟨ , ⟩<sub>*A*</sub> : *E* × *E* → *A, linear in the second entry (the axioms below implying antilinearity in the first entry) that for all* ψ,ϕ ∈ *E and a* ∈ *A, satisfies*

$$
\langle \psi, \varphi \rangle\_A^\* = \langle \varphi, \psi \rangle\_A; \tag{C.124}
$$

$$
\langle \psi, \varphi a \rangle\_A = \langle \psi, \varphi \rangle\_A a;\tag{C.125}
$$

$$\langle \psi, \psi \rangle\_{A} \ge 0;\tag{C.126}$$

$$
\langle \psi, \psi \rangle\_A = 0 \iff \psi = 0. \tag{C.127}
$$

It is useful to note that (C.124) and (C.125) imply that

$$
\langle \psi a, \varphi \rangle\_A = a^\* \langle \psi, \varphi \rangle\_A. \tag{C.128}
$$

Lemma C.66. *In a pre-Hilbert C\*-module E over a C\*-algebra A one has:*

$$
\langle \psi, \varphi \rangle\_A \langle \varphi, \psi \rangle\_A \le \|\varphi\|^2 \langle \psi, \psi \rangle\_A; \tag{C.129}
$$

$$\| \langle \psi, \varphi \rangle\_{A} \| \leq \|\psi\| \, \|\varphi\|; \tag{C.130}$$

$$\|\psi a\| \le \|\psi\| \, \|a\|, \tag{C.131}$$

*in which the following expression (which duly defines a norm on E) occurs:*

$$\|\psi\| = \| \langle \psi, \psi \rangle\_{A} \|^{1/2}. \tag{C.132}$$

*Proof.* To prove (C.129), we assume ϕ ≠ 0 (otherwise, the claim clearly holds), so that also ‖ϕ‖ > 0 by (C.127) and (C.132). Replacing ϕ by ϕ/‖ϕ‖ if necessary (i.e., if ‖ϕ‖ ≠ 1), it is then enough to show that whenever ‖ϕ‖ = 1, we have

$$
\langle \psi, \varphi \rangle\_A \langle \varphi, \psi \rangle\_A \le \langle \psi, \psi \rangle\_A. \tag{C.133}
$$

To this effect, we substitute ϕ⟨ϕ,ψ⟩<sub>*A*</sub> − ψ for ψ in (C.126) and use (C.128), (C.124), and (C.125), and (C.93), the latter in the form *b*<sup>∗</sup>*cb* ≤ ‖*c*‖*b*<sup>∗</sup>*b* for any *b* and *c* ≥ 0 in *A*. This gives (C.129). Eqs. (C.2), (C.124), and (C.129) then imply (C.130). Eq. (C.131) follows from (C.128), (C.93), (C.84), and (C.2).

Finally, (C.132) defines a norm: scaling is clear, positive definiteness follows from (C.127), and the triangle inequality is easily derived from (C.130). □
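The substitution used to prove (C.133) can be expanded term by term. In the following sketch the abbreviations χ and *b* are ours, and ‖ϕ‖ = 1 is assumed as in the proof.

```latex
% Notation (ours): \chi = \varphi\langle\varphi,\psi\rangle_A - \psi and
% b = \langle\varphi,\psi\rangle_A, so that b^* = \langle\psi,\varphi\rangle_A by (C.124).
\begin{align*}
0 \le \langle\chi,\chi\rangle_A
  &= \langle\varphi b,\varphi b\rangle_A - \langle\varphi b,\psi\rangle_A
     - \langle\psi,\varphi b\rangle_A + \langle\psi,\psi\rangle_A \\
  &= b^*\langle\varphi,\varphi\rangle_A\, b - b^* b - b^* b + \langle\psi,\psi\rangle_A
     && \text{by (C.125), (C.128)} \\
  &\le \|\varphi\|^2\, b^* b - 2\, b^* b + \langle\psi,\psi\rangle_A
     && \text{by } b^*cb \le \|c\|\, b^*b,\ c = \langle\varphi,\varphi\rangle_A \\
  &= -\, b^* b + \langle\psi,\psi\rangle_A
     && \text{since } \|\varphi\| = 1 .
\end{align*}
% Hence b^* b = \langle\psi,\varphi\rangle_A \langle\varphi,\psi\rangle_A
% \le \langle\psi,\psi\rangle_A, which is (C.133).
```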

Corollary C.67. *The inner product on a pre-Hilbert C\*-module is nondegenerate, in that* ψ = 0 *iff* ⟨ψ,ϕ⟩<sub>*B*</sub> = 0 *for all* ϕ ∈ *E.*

*Proof.* It follows from (C.129) that for any ψ ∈ *E*, we have

$$\|\psi\| = \sup\{ \|\langle \psi, \varphi \rangle\_{B}\|, \varphi \in E, \|\varphi\| = 1 \}. \tag{C.134}$$

□

We now come to the main definition.

Definition C.68. *A* Hilbert C\*-module *over A is a pre-Hilbert C\*-module over A that is complete in the norm* (C.132)*. We also say that E is a* Hilbert *A*-module*.*

The three most straightforward examples of this concept, written "*E A*", are:

• C\*-algebras themselves: *E* = *A* with action (*a*,*b*) → *ab* and inner product

$$
\langle a, b \rangle\_{A} = a^\* b. \tag{C.135}
$$

By (C.2), the norm in *E* defined by (C.132) coincides with the original norm.


$$\langle \psi, \varphi \rangle\_{C(X)} = x \mapsto \langle \psi(x), \varphi(x) \rangle\_{E\_x}. \tag{C.136}$$

This implies a norm ‖ψ‖ = sup{‖ψ(*x*)‖<sub>E<sub>*x*</sub></sub>, *x* ∈ *X*}, where ‖*v*‖<sup>2</sup><sub>E<sub>*x*</sub></sub> = ⟨*v*, *v*⟩<sub>E<sub>*x*</sub></sub>.

A Hilbert C\*-module *E A* defines a C\*-algebra *C*∗(*E*,*A*) that consists of all maps *a* : *E* → *E* for which there exists a map *a*<sup>∗</sup> : *E* → *E* such that for all ψ,ϕ ∈ *E*,

$$
\langle \Psi, a\Phi \rangle\_A = \langle a^\* \Psi, \Phi \rangle\_A. \tag{C.137}
$$

Such maps are called *adjointable*. For example, if *E* = *A*, as in the first example above, then any element *a* ∈ *A* defines an adjointable map simply by left multiplication (i.e., *a*(*b*) = *ab*). If *A* has a unit, then this is it, whereas in the nonunital case there are (many) more adjointable maps on *A A*.

We now show that adjointable maps on a Hilbert C\*-module form a C\*-algebra.


$$
\langle a\psi, a\psi\rangle\_A \le \|a\|^2 \langle \psi, \psi\rangle\_A. \tag{C.138}
$$

*Proof.* The property of ℂ-linearity is obvious, whereas *A*-linearity follows from (C.128): this gives ⟨*a*(ψ*b*),ϕ⟩<sub>*A*</sub> = ⟨*a*(ψ)*b*,ϕ⟩<sub>*A*</sub>, upon which Corollary C.67 yields the claim. A similar argument shows that *a*<sup>∗</sup> ∈ *C*<sup>∗</sup>(*E*,*A*) when *a* ∈ *C*<sup>∗</sup>(*E*,*A*).

To prove boundedness, fix ψ ∈ *E* and *a* ∈ *C*<sup>∗</sup>(*E*,*A*), and define *T*<sub>ψ</sub> : *E* → *A* by *T*<sub>ψ</sub>ϕ = ⟨*a*<sup>∗</sup>*a*ψ,ϕ⟩<sub>*A*</sub>. It is clear from (C.130) that ‖*T*<sub>ψ</sub>‖ ≤ ‖*a*<sup>∗</sup>*a*ψ‖, so that *T*<sub>ψ</sub> is bounded. On the other hand, since *a* is adjointable, one has *T*<sub>ψ</sub>ϕ = ⟨ψ,*a*<sup>∗</sup>*a*ϕ⟩<sub>*A*</sub>, so that, using (C.130) once again, one has ‖*T*<sub>ψ</sub>ϕ‖ ≤ ‖*a*<sup>∗</sup>*a*ϕ‖ ‖ψ‖. Since *E* is complete we may apply the Banach–Steinhaus Theorem B.78, which gives

$$\sup\{\|T\_{\psi}\|, \psi \in E, \|\psi\| = 1\} < \infty. \tag{C.139}$$

It then follows from (C.132) that ‖*a*‖ < ∞. Uniqueness and involutivity of the adjoint are proved as for Hilbert spaces; the former follows from (C.127), the latter in addition requires (C.124). The space *C*<sup>∗</sup>(*E*,*A*) is norm-closed, since one easily verifies from (C.137) and (C.132) that if *a*<sub>*n*</sub> → *a*, then *a*<sub>*n*</sub><sup>∗</sup> converges to *a*<sup>∗</sup>. As a norm-closed space of linear maps on a Banach space, *C*<sup>∗</sup>(*E*,*A*) is a Banach algebra, so that it satisfies (C.1). To check (C.2), one infers from (C.132) and the definition (C.137) of the adjoint that ‖*a*‖<sup>2</sup> ≤ ‖*a*<sup>∗</sup>*a*‖; using (C.1) and the argument leading to (A.22), one first obtains ‖*a*<sup>∗</sup>‖ = ‖*a*‖, and subsequently ‖*a*<sup>∗</sup>*a*‖ = ‖*a*‖<sup>2</sup>.

Finally, it follows from (C.126), (C.86), and (C.137) that for fixed ψ ∈ *E*, the map *a* ↦ ⟨ψ,*a*ψ⟩<sub>*A*</sub> from *C*<sup>∗</sup>(*E*,*A*) to *A* is positive. Replacing *a* by *a*<sup>∗</sup>*a* in (C.83) and using (C.2) and (C.137) then leads to (C.138). □

In our first example the C\*-algebra *C*∗(*A*,*A*) is usually called the *multiplier algebra*, denoted by *M*(*A*). If *A* has a unit, then *M*(*A*) = *A*, but in general *M*(*A*) is much larger than *A*, and obviously it always has a unit (given by the unit operator on *A*).

Proposition C.70. *For any commutative C\*-algebra A we have an isomorphism*

$$M(A) \stackrel{\cong}{\rightarrow} C\_b(\Sigma(A));\tag{C.140}$$

$$a \mapsto \hat{a},\tag{C.141}$$

*where, in terms of the Gelfand isomorphism A* ≅ *C*<sub>0</sub>(Σ(*A*))*, f* ↦ *f̂, we have*

$$
\widehat{a(f)} = \hat{a}\hat{f}.\tag{C.142}
$$

*In particular, for any locally compact space X we have an isomorphism*

$$M(C\_0(X)) \cong C\_b(X),\tag{C.143}$$

*where a* ∈ *Cb*(*X*) *simply acts on f* ∈ *C*0(*X*) *by a*(*f*) = *af.*

*Proof.* If *A* is commutative, then by Theorem C.69.1, any *a* ∈ *M*(*A*) satisfies

$$a(fg) = a(f)g = fa(g), \ f, g \in A. \tag{C.144}$$

For any *f*,*g* ∈ *A* and ω ∈ Σ(*A*) such that ω(*f*) ≠ 0 and ω(*g*) ≠ 0, the second equality in (C.144) gives ω(*a*(*f*))/ω(*f*) = ω(*a*(*g*))/ω(*g*). Since ω ≠ 0, there is at least one *f* ∈ *A* for which ω(*f*) ≠ 0, so that the function *â* : Σ(*A*) → ℂ given by

$$\hat{a}(\omega) = \frac{\omega(a(f))}{\omega(f)} = \frac{\widehat{a(f)}(\omega)}{\hat{f}(\omega)},\tag{C.145}$$

is well defined. Thus (C.142) holds by construction. Since *a*(*f*) ∈ *A*, continuity of the Gelfand transform makes *â* continuous. Next, we estimate

$$|\hat{a}(\omega)\hat{f}(\omega)| = |\widehat{a(f)}(\omega)| \le \|\widehat{a(f)}\|\_{\infty} = \|a(f)\| \le \|a\| \, \|f\|,\tag{C.146}$$

where we used (C.145) and isometry of the Gelfand transform, cf. (C.18). Hence

$$|\hat{a}(\omega)| = \left|\frac{\hat{a}(\omega)\hat{f}(\omega)}{\hat{f}(\omega)}\right| \le \frac{\|a\|}{|\hat{f}(\omega)|},\tag{C.147}$$

for any *f* ∈ *A*, and ω ∈ Σ(*A*) for which ω(*f*) ≠ 0 and ‖*f*‖ = 1. For those, we have

$$\begin{aligned} \inf \{ |\hat{f}(\omega)|^{-1} \mid \omega \in \Sigma(A), \omega(f) \neq 0, \|f\| = 1 \} &= \\ \left(\sup \{ |\hat{f}(\omega)| \mid \omega \in \Sigma(A), \omega(f) \neq 0, \|f\| = 1 \}\right)^{-1} &= \|\hat{f}\|\_{\infty}^{-1} = 1, \end{aligned} \tag{C.148}$$

again using ‖*f*‖ = ‖*f̂*‖<sub>∞</sub>. Together with (C.147), this gives |*â*(ω)| ≤ ‖*a*‖, and hence

$$\|\hat{a}\|\_{\infty} \le \|a\|. \tag{C.149}$$

In particular, *â* is bounded, so that the map (C.140)–(C.141) is well defined. This map has an inverse, as clearly any function *â* ∈ *C*<sub>*b*</sub>(Σ(*A*)) defines an element of *M*(*C*<sub>0</sub>(Σ(*A*))) by multiplication, and hence defines an element *a* ∈ *M*(*A*) by the inverse Gelfand transform, cf. (C.142). □
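A simple illustration of (C.143) may be useful here. The choice *X* = ℝ and *a*(*x*) = e<sup>i*x*</sup> is ours; the sketch only checks that this function gives a multiplier of *C*<sub>0</sub>(ℝ) that does not itself lie in *C*<sub>0</sub>(ℝ), using the inner product (C.135).

```latex
% Illustration: X = \mathbb{R}, a(x) = e^{ix}.
% a is bounded and continuous, but |a(x)| = 1 does not vanish at infinity:
\[
a \in C_b(\mathbb{R}), \qquad a \notin C_0(\mathbb{R}).
\]
% For f \in C_0(\mathbb{R}) the product af is continuous with |(af)(x)| = |f(x)| \to 0
% as |x| \to \infty, so multiplication by a maps C_0(\mathbb{R}) to itself:
\[
a(f) = af \in C_0(\mathbb{R}), \qquad \|af\|_\infty = \|f\|_\infty .
\]
% With \langle f, g\rangle = \bar{f} g as in (C.135), the adjoint is multiplication by \bar{a}:
\[
\langle f, a g\rangle = \bar{f}\, a\, g = \overline{\bar{a} f}\, g = \langle \bar{a} f, g\rangle,
\qquad a^* a = a a^* = 1 .
\]
```

In this example the multiplier is even unitary, although the algebra *C*<sub>0</sub>(ℝ) on which it acts has no unit.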

Since an isomorphism of C\*-algebras is isometric, we have ‖*â*‖<sub>∞</sub> = ‖*a*‖. This may also be proved directly from (C.149) and the converse inequality

$$\begin{split} \|a\| = \sup\{\|a(f)\| \mid f \in A, \|f\| = 1\} &= \sup\{\|\widehat{a(f)}\|\_{\infty} \mid f \in A, \|\hat{f}\|\_{\infty} = 1\} \\ &= \sup\{\|\hat{a}\hat{f}\|\_{\infty} \mid f \in A, \|\hat{f}\|\_{\infty} = 1\} \le \|\hat{a}\|\_{\infty} .\end{split} \tag{C.150}$$

Most of this argument also works for the pre-Hilbert *C*0(*X*) module *E* = *Cc*(*X*) (whose completion is *C*0(*X*), of course), except for the inequality (C.149), which relies on boundedness of *a* (cf. Theorem C.69). This is lost if *E* fails to be complete, and we now merely obtain an isomorphism of algebras with involution:

$$M(C\_c(X)) \cong C(X). \tag{C.151}$$

For a slightly different take on this, for a general C\*-algebra *A* we define an *unbounded multiplier* on *A* (seen as a Hilbert *A*-module) as a closed C-linear and *A*-linear map *m* : *D*(*m*) → *A*, where *D*(*m*) is a dense right-ideal in *A* (in the algebraic sense, i.e., by exception we do *not* require an ideal to be closed). In general, the set *UM*(*A*) of all unbounded multipliers on *A* has little algebraic structure (like the set of all closed operators on a Hilbert space), but in the commutative case we have

$$UM(C\_{0}(X)) \cong C(X),\tag{C.152}$$

under the same identification as in (C.143). This means that any unbounded multiplier on *C*0(*X*) takes the form *g* → *f g* for some *f* ∈ *C*(*X*), with domain

$$D(f) = \{ g \in \mathcal{C}\_0(X) \mid fg \in \mathcal{C}\_0(X) \}. \tag{C.153}$$

The argument is the same as in the proof of Proposition C.70 (except for boundedness), adding the fact that *C*<sub>*c*</sub>(*X*) is a core for each *f*, in that its closure (defined as usual by the set of all *g* ∈ *C*<sub>0</sub>(*X*) for which there is a sequence (*g*<sub>*n*</sub>) in *C*<sub>*c*</sub>(*X*) such that *g*<sub>*n*</sub> → *g* and *fg*<sub>*n*</sub> is Cauchy) is given by *D*(*f*); then *fg*<sub>*n*</sub> → *fg* (in the sup-norm).
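To make the domain (C.153) concrete, consider the following example. The choice *X* = ℝ and *f*(*x*) = *x*, as well as the test functions *g*<sub>1</sub> and *g*<sub>2</sub>, are ours and serve only as an illustration.

```latex
% Illustration: X = \mathbb{R}, f(x) = x, so f \in C(\mathbb{R}) is unbounded.
\[
D(f) = \{ g \in C_0(\mathbb{R}) \mid x \mapsto x\, g(x) \in C_0(\mathbb{R}) \}.
\]
% g_1(x) = 1/(1+x^2) lies in D(f), since x g_1(x) = x/(1+x^2) \to 0 as |x| \to \infty.
% g_2(x) = 1/(1+|x|) lies in C_0(\mathbb{R}) but not in D(f), since
\[
\lim_{x \to \pm\infty} x\, g_2(x) = \lim_{x\to\pm\infty} \frac{x}{1+|x|} = \pm 1 \neq 0 .
\]
% D(f) contains C_c(\mathbb{R}) and is therefore dense in C_0(\mathbb{R}).
```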

Let us return to the bounded case, concentrating on the multiplier algebra

$$M(A) = C^\*(A, A). \tag{C.154}$$

Proposition C.71. *There is an inclusion A* → *M*(*A*)*, where A (seen as a subspace of B*(*A*)*) acts on A (seen as a Hilbert A-module) by left multiplication. Moreover, A is an* essential ideal *in M*(*A*)*, in having nonzero intersection with any other nonzero ideal.*

*Proof.* We first note that each map *L*<sub>*a*</sub> : *b* ↦ *ab* (*a*,*b* ∈ *A*) is adjointable, because

$$\langle c, L\_a(b) \rangle\_A = \langle c, ab \rangle\_A = c^\*ab = (a^\*c)^\*b = \langle a^\*c, b \rangle\_A = \langle L\_{a^\*}(c), b \rangle\_A,$$

so that the adjoint of *L*<sub>*a*</sub> is *L*<sub>*a*<sup>∗</sup></sub>. Furthermore, *L*<sub>*a*</sub> = 0 iff *a* = 0, as can be seen by taking an approximate unit in *A*, or from Lemma C.47. Hence *A* ⊂ *M*(*A*), which is a proper inclusion iff *A* has no unit (since *M*(*A*) always has one, i.e. the unit of *B*(*A*)).

Now let *m* ∈ *M*(*A*) and *a* ∈ *A*. Then (*m* ◦ *a*)(*b*) = *m*(*ab*) = *m*(*a*)*b*, since *m* ∈ *C*<sup>∗</sup>(*A*,*A*) is *A*-linear. Hence *ma* ≡ *m* ◦ *a* ∈ *A*, since *m*(*a*) ∈ *A*. Since *am* = (*m*<sup>∗</sup>*a*<sup>∗</sup>)<sup>∗</sup>, this argument shows that also *am* ∈ *A*, making *A* an ideal in *M*(*A*).

To see that this ideal is essential, we note (as a little exercise) that an ideal *J* ⊂ *B* in a C\*-algebra *B* is essential iff *bJ* = 0 (i.e., *bj* = 0 for each *j* ∈ *J* and some *b* ∈ *B*) implies *b* = 0. Again by Lemma C.47, if *b*(*ja*) = 0 for each *j* ∈ *A*, *a* ∈ *A*, and some *b* ∈ *M*(*A*), then *b*(*c*) = 0 for each *c* ∈ *A*, and hence *b* = 0. □

In general, one may compute *M*(*A*) as follows. If *A* and *B* are C\*-algebras and *E* is a Hilbert *A*-module, we say that a homomorphism α : *B* → *C*<sup>∗</sup>(*E*,*A*) is *nondegenerate* if α(*B*)*E*<sup>−</sup> = *E*, that is, if the closed linear span of all vectors of the type α(*b*)ψ, where *b* ∈ *B* and ψ ∈ *E*, equals *E*. It can be shown (from the Cohen–Hewitt factorization theorem) that in this case one needs neither the linear span nor the closure to recover *E*, in that each element of *E* literally factorizes:

$$E = \{ \alpha(b)\psi \mid b \in B, \psi \in E \}. \tag{C.155}$$

Theorem C.72. *Suppose A and B are C\*-algebras, E is a Hilbert A-module, and*

$$\alpha: B \to C^\*(E, A)$$

*is a nondegenerate homomorphism. If B is an ideal in a C\*-algebra C, then* α *has a unique extension to C (which is injective if B is essential in C and* α *is injective).*

*Proof.* The idea is easy: write ϕ ∈ *E* as ϕ = α(*b*)ψ for some *b* ∈ *B* and ψ ∈ *E*, cf. (C.155), and define the desired extension

$$
\tilde{\alpha}: C \to C^\*(E, A) \tag{C.156}
$$

by

$$
\tilde{\alpha}(c)\varphi = \alpha(cb)\Psi,\tag{C.157}
$$

provided this is well defined (in which case α˜ is clearly uniquely determined by α). Adjointability then also follows, since we may define α˜(*c*)<sup>∗</sup> = α˜(*c*∗), and compute

$$
\begin{split}
\langle \tilde{\alpha}(c)^{\*}\alpha(b')\Psi', \alpha(b)\Psi\rangle\_{B} &= \langle \alpha(c^{\*}b')\Psi', \alpha(b)\Psi\rangle\_{B} = \langle \Psi', \alpha(c^{\*}b')^{\*}\alpha(b)\Psi\rangle\_{B} \\ &= \langle \Psi', \alpha(b')^{\*}\alpha(cb)\Psi\rangle\_{B} \\ &= \langle \alpha(b')\Psi', \tilde{\alpha}(c)\alpha(b)\Psi\rangle\_{B}.
\end{split}
\tag{C.158}
$$

Furthermore, it is easy to see that α˜ is a homomorphism. Also, α˜(*c*) = 0 for *c* ∈ *C* implies α(*cb*) = 0 for each *b* ∈ *B*; if α is injective, then *cb* = 0, and if *B* is an essential ideal in *C*, then *c* = 0, so that α˜ is injective.

To show that (C.157) is independent of the representatives *b* and ψ, we estimate

$$\begin{split} \|\tilde{\alpha}(c)\alpha(b)\psi\| &= \lim\_{\lambda} \|\alpha(ce\_{\lambda}b)\psi\| = \lim\_{\lambda} \|\alpha(ce\_{\lambda})\alpha(b)\psi\| \\ &\le \lim\_{\lambda} \|\alpha(ce\_{\lambda})\| \|\alpha(b)\psi\| \le \lim\_{\lambda} \|ce\_{\lambda}\| \|\alpha(b)\psi\| \\ &= \|c\| \|\alpha(b)\psi\|, \end{split} \tag{C.159}$$

where (*e*<sub>λ</sub>) is an approximate unit in *B*. In particular, if α(*b*)ψ = α(*b*′)ψ′, then α˜(*c*)α(*b*)ψ = α˜(*c*)α(*b*′)ψ′. □

This proof also works without (C.155); one then has a finite sum ϕ = ∑<sub>*i*</sub> α(*b*<sub>*i*</sub>)ψ<sub>*i*</sub>, and a computation similar to the previous one shows that α˜(*c*) is bounded on the dense subspace of *E* consisting of such sums.

This theorem (with *B* ⇝ *A* and *E* ⇝ *A*) explains in which sense *M*(*A*) is a *maximal* unitization of *A* (whereas *A*˙ is a *minimal* one): all we need to do is abstractly define a unitization of a non-unital C\*-algebra *A* as a unital C\*-algebra containing *A* as an essential ideal (cf. Proposition C.71). This incorporates both *A*˙ and *M*(*A*), each being distinguished by a universal property it satisfies, namely:

Corollary C.73. *For each unital C\*-algebra C containing A as an essential ideal, there are unique injective homomorphisms C* → *M*(*A*) *and A*˙ → *C whose restriction to A is the identity map. In other words, denoting the inclusion of A into C by* ι*, we have commutative diagrams*

The topological counterpart of this corollary is the construction of the one-point compactification *X*˙ and of the Čech–Stone compactification β*X*, respectively; cf. Lemma C.38, which we may now supplement by simply *defining* β*X* as the Gelfand spectrum of the commutative C\*-algebra *Cb*(*X*) ≅ *C*(β*X*). In this analogy, the condition on an ideal *B* ⊂ *C* to be essential simply corresponds to a non-compact space *X* being a dense subspace of some compactification of it.

Corollary C.74. *Let E be some Hilbert A-module and let* α : *B* → *C*∗(*E*,*A*) *be an injective nondegenerate homomorphism. The unique extension* α˜ : *M*(*B*) → *C*∗(*E*,*A*) *of* α *that exists according to Theorem C.72 maps M*(*B*) *isomorphically onto*

$$Z\_{\alpha}(E) = \{ a \in C^\*(E, A) \mid a\alpha(b) \in \alpha(B), \alpha(b)a \in \alpha(B) \,\forall b \in B \}.\tag{C.160}$$

*Proof.* Note that *Z*α(*E*) is essential in *C*∗(*E*,*A*), as easily follows from the nondegeneracy of α. Therefore, by the argument just given (plus the abstract nonsense that shows that universal objects are unique up to isomorphism), we only need to prove that *Z*α(*E*) is a maximal unitization of *B*. Let *B* be an essential ideal in *C* and consider the injective extension α˜ : *C* → *C*∗(*E*,*A*) of α given by Theorem C.72. Then α˜ maps *C* into *Z*α(*E*) by construction, as α˜(*c*)α(*b*) = α(*cb*) ∈ α(*B*), etc. □

Corollary C.75. *A nondegenerate homomorphism* α : *B* → *M*(*A*) *has a unique extension to a homomorphism* α˜ : *M*(*B*) → *M*(*A*)*.*

*Proof.* Take *C* = *M*(*B*) and *E* = *A* in Theorem C.72.1. □

Note that two nondegenerate homomorphisms α : *A* → *M*(*B*) and β : *B* → *M*(*C*) can be composed into a nondegenerate homomorphism β ◦ α : *A* → *M*(*C*), which by definition equals β˜ ◦ α. Thus one obtains a category CAm whose objects are C\*-algebras and whose arrows are nondegenerate homomorphisms α : *A* → *M*(*B*), with a full subcategory CCAm whose objects are commutative C\*-algebras (with the same arrows). This leads to a neat extension of Gelfand duality (cf. Theorem C.45):

Theorem C.76. *The category* LCH *of locally compact Hausdorff spaces and continuous maps is dual to the category* CCAm *of commutative C\*-algebras just defined.*

This claim may be unfolded as in Theorem C.45, omitting 'proper' on the topological side and replacing α : *A* → *B* on the algebraic side by α : *A* → *M*(*B*).

*Proof.* First, a continuous map ϕ : *Y* → *X* trivially induces a nondegenerate homomorphism ϕ<sup>∗</sup> : *C*0(*X*) → *Cb*(*Y*). Second, since ω ∈ Σ(*B*) defines a nondegenerate homomorphism *B* → C, by Theorem C.72 it extends to a homomorphism ω˜ : *M*(*B*) → C. Thus the pullback α<sup>∗</sup> : Σ(*B*) → Σ(*A*) of a nondegenerate homomorphism α : *A* → *M*(*B*) is well defined (and still continuous). Part 3 of Theorem C.45 stays the same, and the pertinent naturality properties are easily verified. □

Corollary C.77. *A nondegenerate homomorphism* α : *C*0(*X*) → *B*(*H*) *has a unique extension to a homomorphism* α˜ : *Cb*(*X*) → *B*(*H*)*.*

*Proof.* Taking *A* = C, *E* = *H*, and *B* = *B*0(*H*), Theorem C.72.2 gives

$$M(B\_0(H)) \cong B(H). \tag{C.161}$$

Combine this with the previous corollary (with *B* ⇝ *C*0(*X*) and *A* ⇝ *B*0(*H*)). □

Finally, we show how to reconstruct *A* as a C\*-algebra from *A* as a Hilbert *A*-module. The key to this is a more general construction:

Definition C.78. *The collection C*<sup>∗</sup><sub>0</sub>(*E*,*A*) *of "compact" operators on a Hilbert A-module E is the C\*-algebra generated (within C*∗(*E*,*A*)*) by all operators of the type* |ϕ⟩⟨ψ|*, where* ϕ,ψ ∈ *E, and*

$$|\varphi\rangle\langle\psi|(\zeta) = \varphi \langle \psi, \zeta \rangle\_{A}. \tag{C.162}$$

Such operators are easily seen to be adjointable, with adjoint

$$|\varphi\rangle\langle\psi|^\* = |\psi\rangle\langle\varphi|,\tag{C.163}$$

and hence bounded, with norm majorized by ‖ψ‖‖ϕ‖. If *E* = *H* is a Hilbert space, then *C*<sup>∗</sup><sub>0</sub>(*H*,C) = *B*0(*H*), since the maps |ϕ⟩⟨ψ| obviously generate the finite-rank operators on *H*, whose norm-closure is *B*0(*H*), cf. Proposition B.131. Hence the name "compact" operators, but in general elements of *C*<sup>∗</sup><sub>0</sub>(*E*,*A*) need not be compact (as operators on a Banach space) at all. The next and final example is a case in point:

Proposition C.79. *If E* = *A as a Hilbert A-module in the usual way, then*

$$C\_0^\*(A, A) \cong A.\tag{C.164}$$

*Proof.* We have |*a*⟩⟨*b*| = *L*<sub>*ab*∗</sub>, where *a* ↦ *L*<sub>*a*</sub> is the canonical map from *A* to *C*∗(*A*,*A*) ⊂ *B*(*A*) given by *L*<sub>*a*</sub>(*b*) = *ab*, see Proposition C.30. This map is isometric, cf. (C.63), and hence injective. The map |*a*⟩⟨*b*| ↦ *ab*<sup>∗</sup> from the linear span of all operators (C.162) within *C*<sup>∗</sup><sub>0</sub>(*E*,*A*) to *A* is therefore bounded, and has dense image by Lemma C.47. Its unique continuous extension maps *C*<sup>∗</sup><sub>0</sub>(*E*,*A*) onto *A*, see Theorem C.62.5 (or use the Cohen–Hewitt factorization theorem to conclude). □
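The identity between rank-one operators and left multiplications used in this proof can be checked numerically in a small case. The following is a minimal sketch, assuming *A* = *M*<sub>2</sub>(C) with inner product ⟨*a*, *b*⟩<sub>*A*</sub> = *a*∗*b*; the helper names (`mul`, `adj`, `inner`, `ketbra`, `left_mult`) and the sample matrices are illustrative choices, not notation from the text.

```python
# Sketch: A = M_2(C) as a Hilbert A-module over itself (assumed example).
# Matrices are nested tuples; all helper names are hypothetical.

def mul(a, b):
    """Matrix product of two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def adj(a):
    """Conjugate transpose, i.e. the involution a -> a*."""
    return tuple(
        tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2)
    )

def inner(a, b):
    """A-valued inner product <a, b>_A = a* b."""
    return mul(adj(a), b)

def ketbra(a, b):
    """The operator |a><b| of (C.162): c -> a <b, c>_A."""
    return lambda c: mul(a, inner(b, c))

def left_mult(a):
    """The operator L_a: c -> a c."""
    return lambda c: mul(a, c)

# Sample elements with Gaussian-integer entries, so the arithmetic is exact.
a = ((1, 2j), (0, 1))
b = ((0, 1), (1j, 3))
c = ((2, -1), (1j, 1))
d = ((1, 1), (0, 2j))

# |a><b| agrees with left multiplication by a b*.
lhs = ketbra(a, b)(c)
rhs = left_mult(mul(a, adj(b)))(c)

# Adjoint relation (C.163): <|a><b| c, d>_A = <c, |b><a| d>_A.
adjoint_ok = inner(ketbra(a, b)(c), d) == inner(c, ketbra(b, a)(d))

print(lhs == rhs, adjoint_ok)
```

This only tests the algebraic identities on sample elements; the density and extension arguments in the proof are not touched by it.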

#### C.11 Gelfand topology as a frame

In the traditional approach to the Gelfand isomorphism, which we have followed so far, the Gelfand spectrum Σ(*A*) of a commutative unital C\*-algebra *A* is first constructed as a set, upon which it is equipped with a natural topology O(Σ(*A*)), i.e., the Gelfand topology. Alternatively, one may start with the latter and reconstruct Σ(*A*) as a set from it. This not only gives a better conceptual understanding of Gelfand's theory (relating it, for example, to a well-known construction in algebraic geometry); it also has the technical advantage of making good sense in constructive mathematics and hence in topos theory (which the classical theory does not).

In the language of lattice theory, the topology O(*X*) of any space *X* is an example of a so-called *frame* (cf. Appendix D, compared to which we change notation so as to avoid abuse of the ubiquitous symbol *X*) i.e., a complete lattice *L* in which

$$U \wedge \bigvee S = \bigvee \{ U \wedge V, V \in S \},\tag{C.165}$$

for arbitrary elements *U* ∈ *L* and subsets *S* ⊂ *L*. This is sometimes written in the form *U* ∧ (⋁<sub>λ</sub> *V*<sub>λ</sub>) = ⋁<sub>λ</sub> (*U* ∧ *V*<sub>λ</sub>), from which it is clear that the (binary) distributive law *U* ∧ (*V* ∨ *W*) = (*U* ∧ *V*) ∨ (*U* ∧ *W*), which of course is implied by (C.165), is now required for arbitrary families. Indeed, the definition of a frame is primarily motivated by the example *L* = O(*X*), in which it should be noted that the supremum

$$\bigvee S = \bigcup S \equiv \bigcup\_{\lambda} \{ U\_{\lambda} \in S \},\tag{C.166}$$

is simply given by the set-theoretic union of the elements of *S*, which are open sets whose union is open by definition of a topology, whereas the infimum of arbitrary families of open sets has to be doctored so as to make it open, and hence is given by

$$\bigwedge \mathcal{S} = \bigvee \{ U \in \mathcal{O}(X) \mid U \subseteq V \,\forall V \in \mathcal{S} \}. \tag{C.167}$$
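For a finite space the frame operations can be computed directly. The sketch below assumes a specific five-element topology on *X* = {0, 1, 2} (our choice, not an example from the text) and checks the distributive law (C.165) for every open *U* and every family *S* of opens. In a finite space the infimum (C.167) is just the intersection; the "doctoring" only matters for infinite families.

```python
from itertools import combinations

# Assumed example: a topology on X = {0, 1, 2}.
opens = [frozenset(s) for s in [(), (0,), (0, 1), (0, 2), (0, 1, 2)]]

def join(family):
    """Supremum as in (C.166): the union of a family of opens."""
    out = frozenset()
    for u in family:
        out |= u
    return out

def meet(family):
    """Infimum as in (C.167): the join of all opens contained in every member."""
    below = [u for u in opens if all(u <= v for v in family)]
    return join(below)

def subfamilies(items):
    """All subfamilies S of the given list of opens."""
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield list(combo)

# Check U ∧ ⋁S = ⋁{U ∧ V : V ∈ S} for all U and all S.
frame_law_holds = all(
    meet([u, join(s)]) == join([meet([u, v]) for v in s])
    for u in opens
    for s in subfamilies(opens)
)
print(frame_law_holds)
```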

*Frame maps*, then, are defined as order-preserving maps between the underlying posets that preserve *finite* infima and *arbitrary* joins. For example, if

$$
\varphi: Y \to X \tag{C.168}
$$

is a continuous map, then the inverse image map

$$\varphi^{-1}: \mathcal{O}(X) \to \mathcal{O}(Y) \tag{C.169}$$

is a frame map. This also defines the category Frm of frames, whose opposite category (that has the same objects but all arrows inverted) is called the category Loc of *locales*. Thus a locale is a frame, seen as an object in the opposite category. If no confusion arises (which, unfortunately, is rarely the case), elements of Frm are written as O(*X*), *even if they are not topologies* (and indeed there are such frames, see below), in which case the corresponding element of Loc is written as *X*.

In this spirit, frame maps are always written as (C.169), in which case the map in the opposite direction between the corresponding locales is (C.168). This notation suggests the right way of thinking, and we will use it whenever it is convenient.

Frames are very closely related to *Heyting algebras*, which were originally meant to formalize the intuitionistic (propositional) logic of Brouwer, and are defined as distributive lattices *L* (with top ⊤ and bottom ⊥) equipped with a binary map

$$
\to : L \times L \to L,\tag{C.170}
$$

playing the role of implication in logic, that satisfies the axiom

$$U \le (V \to W) \text{ iff } (U \wedge V) \le W. \tag{C.171}$$

Every Boolean algebra is a Heyting algebra, but not *vice versa*; in fact, a Heyting algebra is Boolean iff ¬¬*U* = *U* for all *U*, which is the case iff (¬*U*) ∨ *U* = ⊤ for all *U* (which states the law of the excluded middle denied by Brouwer). In a Heyting algebra (unlike a Boolean algebra), negation is a derived notion, defined by

$$
\neg U = U \to \bot. \tag{C.172}
$$

A Heyting algebra is *complete* when it is complete as a lattice, in that arbitrary suprema (and hence also infima) exist. The infinite distributivity law (C.165) is automatically satisfied in a complete Heyting algebra, which therefore is also a frame. Conversely, a frame may be turned into a complete Heyting algebra by defining

$$V \to W = \bigvee \{ U \mid U \wedge V \le W \}. \tag{C.173}$$
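To see (C.171) to (C.173) at work, one may compute the implication in the frame of opens of a small space. The sketch below assumes a five-element topology on {0, 1, 2} (an illustrative choice of ours); it checks the adjunction (C.171) for all triples of opens and exhibits an open set for which double negation and excluded middle fail, so that this Heyting algebra is not Boolean.

```python
# Assumed example: a topology on X = {0, 1, 2}, viewed as a frame.
opens = [frozenset(s) for s in [(), (0,), (0, 1), (0, 2), (0, 1, 2)]]
bottom = frozenset()
top = frozenset({0, 1, 2})

def join(family):
    """Supremum: the union of a family of opens."""
    out = frozenset()
    for u in family:
        out |= u
    return out

def implies(v, w):
    """Heyting implication (C.173): the join of all U with U ∧ V <= W."""
    return join(u for u in opens if (u & v) <= w)

def neg(u):
    """Negation (C.172): ¬U = U -> ⊥."""
    return implies(u, bottom)

# Adjunction (C.171): U <= (V -> W) iff U ∧ V <= W.
adjunction_holds = all(
    (u <= implies(v, w)) == ((u & v) <= w)
    for u in opens for v in opens for w in opens
)

u0 = frozenset({0})
double_negation = neg(neg(u0))   # differs from u0 in this topology
excluded_middle = neg(u0) | u0   # differs from the top element

print(adjunction_holds, double_negation != u0, excluded_middle != top)
```

Here every nonempty open contains the point 0, so ¬{0} is empty and ¬¬{0} is all of *X*.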

Frames and complete Heyting algebras drift apart as soon as morphisms are concerned, for although in both cases one requires maps to preserve the partial order, maps between Heyting algebras must preserve → rather than infinite suprema.

The map *X* → O(*X*) from topological spaces to frames (which extends to a contravariant functor in the obvious way, i.e., via (C.168) - (C.169)) is a competitor to the map *X* → *C*0(*X*) from topological spaces to commutative C\*-algebras, and one goal of this section is to find out how these two constructions are related.

First, there is a frame-theoretic analogue of the categorical duality between locally compact Hausdorff spaces and commutative C\*-algebras (cf. Theorem C.45), in which locally compact Hausdorff spaces are replaced by so-called *sober* spaces (and no restrictions on continuous maps are made), whilst the category of frames must be restricted to so-called *spatial* frames (which move is somewhat analogous to restricting C\*-algebras to commutative ones). We now explain these notions.

A particularly simple frame is 2 = {0,1} ≡ {⊥,⊤}, with order 0 ≤ 1; this is just the topology O(∗) of a singleton ∗. In agreement with the above convention, a frame map *p*<sup>−1</sup> : O(*X*) → 2 will be written as a locale map *p* : ∗ → *X*. Such a map defines a *point* of the locale *X* (i.e., of the frame O(*X*)), and we denote the set of points of *X* by Pt(*X*). To appreciate this definition, let us suppose that O(*X*) is the topology of some space *X*. Each point *x* ∈ *X* then corresponds to a *genuine* map

$$p\_{x}: \* \to X, \ p\_{x}(\*) = x,\tag{C.174}$$

whose inverse image map *p*<sub>*x*</sub><sup>−1</sup> : O(*X*) → 2 is a frame map and hence defines a point in the above sense. Conversely, if *X* is sober (see below), each point of O(*X*) arises in that way. The set Pt(*X*) has a natural topology, with opens

$$\text{Pt}(U) = \{ p \in \text{Pt}(X) \mid p(\*) \in U \},\tag{C.175}$$

where *U* ∈ O(*X*); here *p*(∗) ∈ *U* really means *p*<sup>−1</sup>(*U*) = 1. This gives a frame map

$$U \mapsto \text{Pt}(U) \tag{C.176}$$

from O(*X*) to O(Pt(*X*)). We say O(*X*) (or the locale *X*) is *spatial* if this map is an isomorphism of frames. Roughly speaking, therefore, spatial frames are just topologies (an example of a non-spatial frame is the lattice O<sub>reg</sub>(R) of regular open subsets of R, i.e., of open subsets *U* with the property ¬¬*U* = *U*, where ¬*U* is the interior of the complement of *U*). This does not mean, however, that any topology O(*X*) (seen as a frame) is isomorphic to O(Pt(*X*)), since Pt(*X*) may not be homeomorphic to *X*.

Spaces *X* for which this *is* the case are called *sober*; more precisely, this means that the map *x* → *px* from *X* to Pt(*X*) considered above is a homeomorphism; less precisely, we may say that sober spaces *X* may be reconstructed from their topology O(*X*), up to homeomorphism. To give a more direct topological characterization of sobriety, call *W* ∈ O(*X*) *meet-irreducible* if *U* ∩*V* ⊆ *W* (where *U*,*V* ∈ O(*X*)) implies either *U* ⊆*W* or *V* ⊆*W*. In any space *X*, all open sets of the form *Wx* = *X*\*x*<sup>−</sup> are meet-irreducible, where *x* ∈ *X* (and *x*<sup>−</sup> is the closure of {*x*}). A space *X* is sober, then, iff these are the only such opens. For example, any Hausdorff space is sober (an example of a non-sober space is *X* = N with the unusual topology in which all complements of finite subsets are open, along with the empty set, of course).
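For finite spaces, the points of the frame O(*X*) can be enumerated by brute force and compared with the points of *X*. The sketch below is an illustration under our own assumptions: it uses the Sierpiński space and the two-point indiscrete space as test cases (neither appears in the text), and it only tests whether *x* ↦ *p*<sub>*x*</sub> is a bijection, which suffices here because under that bijection Pt(*U*) corresponds to *U*.

```python
from itertools import product

def frame_points(opens):
    """All frame maps O(X) -> 2, each returned as a dict open -> 0 or 1."""
    top = frozenset().union(*opens)
    bottom = frozenset()
    points = []
    for values in product((0, 1), repeat=len(opens)):
        p = dict(zip(opens, values))
        # A frame map preserves top, bottom, binary meets and binary joins
        # (enough for a finite frame).
        if p[top] != 1 or p[bottom] != 0:
            continue
        if all(
            p[u & v] == (p[u] & p[v]) and p[u | v] == (p[u] | p[v])
            for u in opens for v in opens
        ):
            points.append(p)
    return points

def space_points(opens):
    """The maps p_x^{-1}: U -> 1 if x in U else 0, one for each x in X."""
    top = frozenset().union(*opens)
    return [{u: int(x in u) for u in opens} for x in sorted(top)]

def is_sober(opens):
    """Check that x -> p_x is a bijection from X onto the frame points."""
    fp, sp = frame_points(opens), space_points(opens)
    injective = all(
        sp[i] != sp[j] for i in range(len(sp)) for j in range(i + 1, len(sp))
    )
    surjective = all(p in sp for p in fp)
    return injective and surjective

# Assumed test spaces on X = {0, 1}.
sierpinski = [frozenset(), frozenset({1}), frozenset({0, 1})]
indiscrete = [frozenset(), frozenset({0, 1})]

print(is_sober(sierpinski), is_sober(indiscrete))
```

The indiscrete space fails because its two points induce the same frame map, so *X* cannot be recovered from O(*X*).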

The category Frm, then, has a full subcategory Spat of spatial frames, whilst likewise the category Top of topological spaces has a full subcategory Sob of sober spaces. We now have the following counterpart of Theorem C.45:

Theorem C.80. *The categories* Spat *and* Sob *are dual, in that:*


$$p\_{X} : X \stackrel{\cong}{\leftrightarrow} \text{Pt}(\mathcal{O}(X)), \ x \mapsto p\_{x}; \tag{C.177}$$

$$\text{Pt}\_X: \mathcal{O}(X) \overset{\cong}{\leftrightarrow} \mathcal{O}(\text{Pt}(\mathcal{O}(X))), \ U \mapsto \text{Pt}(U), \tag{C.178}$$

*cf.* (C.174) *-* (C.176)*, with the correct naturality properties (cf. Theorem C.23).*

*Proof.* We will not give a complete proof of this, but the main points are that:


Our aim is to apply these ideas to Gelfand duality, specifically to an independent description of the topology O(Σ(*A*)) of the Gelfand spectrum Σ(*A*) of a commutative C\*-algebra *A*. To put this in perspective, let *A* for the moment be a general C\*-algebra, and recall Definition C.55 of left, right and two-sided ideals (all taken to be closed by definition). Further to these, there is another interesting notion.

Definition C.81. *A* hereditary subalgebra *of a C\*-algebra A is a C\*-subalgebra B of A with the property that a* ≤ *b for b* ∈ *B*<sup>+</sup> *and a* ∈ *A*<sup>+</sup> *implies a* ∈ *B*<sup>+</sup>*. The set of all hereditary subalgebras of A is denoted by H*(*A*)*.*

It is a simple exercise to show that there are bijective correspondences between hereditary subalgebras *B* of *A*, left ideals *L* of *A*, and right ideals *R* of *A*, given by:

$$L = \{ a \in A \mid a^\*a \in B^+ \};\tag{C.179}$$

$$R = \{a \in A \mid aa^\* \in B^+\};\tag{C.180}$$

$$B = L \cap L^\* = R \cap R^\*. \tag{C.181}$$

Furthermore, one has *I*(*A*) ⊆ *H*(*A*), where *I*(*A*) is the set of closed two-sided ideals in *A*, and likewise we write *L*(*A*) and *R*(*A*). If *A* is commutative, these ideals are two-sided, so that *L*∗ = *L* etc., and *L* = *R* = *B*, so that *H*(*A*) = *I*(*A*) = *L*(*A*) = *R*(*A*).

Proposition C.82. *The set H*(*A*) *is a complete lattice under inclusion as the partial order, with inf and sup of any subset S* ⊂ *H*(*A*) *given by*

$$\bigwedge S = \bigcap S;\tag{C.182}$$

$$\bigvee S = \bigcap \{ U \in H(A) \mid V \subseteq U \,\forall V \in S \}. \tag{C.183}$$

*Moreover, if A is commutative, then H*(*A*) = *I*(*A*) = *L*(*A*) = *R*(*A*) *is a frame.*

*Proof.* The defining conditions on hereditary subalgebras of *A* are preserved by arbitrary intersections, which means that *H*(*A*) has infima of arbitrary subsets, given by (C.182). This implies that *H*(*A*) also has arbitrary suprema, given by (C.183), which is a standard formula in lattice theory. Hence *H*(*A*) is a complete lattice.

The last claim follows from Corollary C.84 below (and the ensuing fact for topology). It may also be proved directly, using the fact that *H*(*A*) = *I*(*A*). □

Proposition C.83. *Let X be a locally compact Hausdorff space. Then the map*

$$\mathcal{O}(X) \stackrel{\cong}{\to} H(\mathcal{C}\_0(X));\tag{\text{C.184}}$$

$$U \mapsto \mathcal{C}\_0(U),\tag{C.185}$$

*where C*0(*U*) *is seen as a subspace of C*0(*X*)*, is a frame isomorphism, with inverse*

$$H(\mathsf{C}\_{0}(X)) \xrightarrow{\cong} \mathcal{O}(X);\tag{\mathsf{C}.186}$$

$$B \mapsto X \backslash F\_B,\tag{C.187}$$

*where, for any subset B* ⊂ *C*0(*X*) *one defines the (necessarily closed) set FB* ⊂ *X by*

$$F\_B = \{ \mathbf{x} \in X \mid f(\mathbf{x}) = \mathbf{0} \,\forall f \in B \}. \tag{\text{C.188}}$$

*Proof.* For any open *U* ∈ O(*X*), we may regard *f* ∈ *C*0(*U*) as an element of *C*0(*X*) by extending *f* to all of *X* through *f*|<sub>*X*∖*U*</sub> = 0. Continuity of *f* is only an issue at boundary points of *U*<sup>*c*</sup> ≡ *X*∖*U*, so take *x*<sub>0</sub> ∈ ∂*U*<sup>*c*</sup> (i.e., any neighbourhood of *x*<sub>0</sub> has nonempty intersection with both *U*<sup>*c*</sup> and *U*). Since *f*(*x*<sub>0</sub>) = 0, to prove continuity of *f* at *x*<sub>0</sub> we need to show that for any ε > 0, there is a neighbourhood *N* of *x*<sub>0</sub> such that |*f*(*x*)| < ε for each *x* ∈ *N*. Indeed, since *f* ∈ *C*0(*U*), there is a compact set *K* ⊂ *U* such that |*f*(*x*)| < ε for each *x* ∈ *U*∖*K* (and hence also for each *x* ∈ *X*∖*K*). Then *x*<sub>0</sub> ∉ *K* (since *x*<sub>0</sub> ∈ *U*<sup>*c*</sup>), so we may take the open neighbourhood *N* = *X*∖*K*.

Since the ordering in *C*0(*X*) is pointwise, it is trivial that *C*0(*U*) ∈ *H*(*C*0(*X*)). The map (C.185) also clearly preserves the order, i.e., if *U* ⊆ *V*, then *C*0(*U*) ⊆ *C*0(*V*).

Half of the proof that (C.185) and (C.187) are mutually inverse is the equality

$$\mathcal{C}\_0(U) = \mathcal{C}\_0(X; X \backslash U),\tag{C.189}$$

where for any *F* ⊂ *X* (usually taken to be closed), we define *C*0(*X*;*F*) ⊂ *C*0(*X*) by

$$\mathcal{C}\_0(X; F) = \{ f \in \mathcal{C}\_0(X) \mid f\_{|F} = 0 \}. \tag{C.190}$$

To prove (C.189), we just need to prove that *C*0(*X*;*X*∖*U*) ⊆ *C*0(*U*), since the opposite inclusion has been proved before Proposition C.83. Since *f* ∈ *C*0(*X*), for each ε > 0 and each boundary point *x* ∈ ∂*U*<sup>*c*</sup>, there is an open neighbourhood *N*<sub>*x*</sub> of *x* where |*f*| < ε, as well as a compact set *K* ⊂ *X* outside which the same is true. Then *V* = (∪<sub>*x*∈∂*U*<sup>*c*</sup></sub> *N*<sub>*x*</sub>) ∩ *U* is open in *U*, so that its complement *U*∖*V* is closed in *U*, and *K*′ = (*U*∖*V*) ∩ *K* is compact in *U*. Clearly, |*f*| < ε outside *K*′, whence *f* ∈ *C*0(*U*).

Having proved (C.189), the other half of the proof of bijectivity of (C.184) is

$$B = C\_0(X; F\_B),\tag{C.191}$$

for any *B* ∈ *H*(*C*0(*X*)). The inclusion *B* ⊆ *C*0(*X*;*F*<sub>*B*</sub>) is trivial. For the converse, we exploit the fact that *B* is an ideal in *C*0(*X*), so that *C*0(*X*)/*B* is a C\*-algebra by Theorem C.60. Let τ : *C*0(*X*) → *C*0(*X*)/*B* be the canonical projection. If *f* ∉ *B*, then τ(*f*) ≠ 0. Hence there is a character ω′ ∈ Σ(*C*0(*X*)/*B*) such that ω′(τ(*f*)) ≠ 0.

Lift ω′ to ω = ω′ ◦ τ ∈ Σ(*C*0(*X*)) ≅ *X*, so that there is *x* ∈ *X* such that ω(*g*) = *g*(*x*) for all *g* ∈ *C*0(*X*). Since τ(*g*) = 0 for each *g* ∈ *B*, we have ω(*g*) = 0, and hence *g*(*x*) = 0 for each *g* ∈ *B*, so that *x* ∈ *F*<sub>*B*</sub>. But *f*(*x*) ≠ 0, so *f* ∉ *C*0(*X*;*F*<sub>*B*</sub>), and hence we have proved the inclusion *C*0(*X*;*F*<sub>*B*</sub>) ⊆ *B*. □

Thus Proposition C.83 could just as well have been formulated in terms of closed sets, albeit at the cost of inverting the partial order. Also, note the isomorphism

$$\mathcal{C}\_0(X)/\mathcal{C}\_0(U) \stackrel{\cong}{\longrightarrow} \mathcal{C}\_0(X \backslash U), \ [f] \mapsto f\_{|X \backslash U}. \tag{C.192}$$
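In the simplest possible case, a finite discrete space, the correspondence of Proposition C.83 can be traced by hand or by machine. The following sketch assumes *X* = {0, 1, 2} with the discrete topology, represents functions on *X* as tuples of values, and represents the ideal *C*0(*U*) by its spanning indicator functions; the names `generators` and `zero_set` are ours.

```python
from itertools import combinations

# Assumed example: X = {0, 1, 2} with the discrete topology, so every subset
# is open and C_0(X) = C(X) consists of all tuples (f(0), f(1), f(2)).
X = (0, 1, 2)

def generators(U):
    """Indicator functions of the points of U; they span C_0(U) inside C(X)."""
    return [tuple(1 if y == x else 0 for y in X) for x in sorted(U)]

def zero_set(B):
    """F_B as in (C.188): the common zeros of a family B of functions on X."""
    return frozenset(x for x in X if all(f[x] == 0 for f in B))

subsets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(X, r)]

# Round trip U -> C_0(U) -> X \ F_{C_0(U)} returns U, cf. (C.185) and (C.187).
round_trip_ok = all(
    frozenset(X) - zero_set(generators(U)) == U for U in subsets
)

# Passing to closed sets F_B inverts the partial order.
order_reversed = all(
    zero_set(generators(V)) <= zero_set(generators(U))
    for U in subsets for V in subsets if U <= V
)

print(round_trip_ok, order_reversed)
```

Since *F*<sub>*B*</sub> only depends on a spanning set of *B*, working with generators is enough here; none of the analytic subtleties of the proof (continuity at the boundary, compactness) arise in the discrete case.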

Corollary C.84. *For any commutative C\*-algebra A, there is a frame isomorphism*

$$
\mathcal{O}(\Sigma(\mathbf{A})) \cong H(\mathbf{A}).\tag{\text{C.193}}
$$

This sheds new light on maximal ideals in *A* as points of the Gelfand spectrum Σ(*A*), cf. Proposition C.13. We need a lemma that applies to any frame O(*X*). A *prime element P* ∈ O(*X*) is an element *P* ≠ ⊤ such that *U* ∧ *V* ≤ *P* iff *U* ≤ *P* or *V* ≤ *P*. For a point *p*<sup>−1</sup> : O(*X*) → 2, we write ker(*p*<sup>−1</sup>) for {*U* ∈ O(*X*) | *p*<sup>−1</sup>(*U*) = 0}.

Lemma C.85. *For any frame* O(*X*) *(i.e. locale X), there is a bijective correspondence between points p*<sup>−1</sup> : O(*X*) → 2 *of X and prime elements P* ∈ O(*X*)*, viz.*

$$P = \bigvee \ker(p^{-1});\tag{C.194}$$

$$p^{-1}(U) = 0 \text{ iff } U \le P. \tag{C.195}$$

*Under this correspondence, the topology on* Pt(*X*) *is given by the* Zariski topology*, whose* closed *sets FP consist of all Q* ⊇ *P, where P is some prime element of* O(*X*)*.*

*Proof.* The requirement that *p*<sup>−1</sup> be a frame map implies the following properties of its kernel *K* = ker(*p*<sup>−1</sup>): ⊤ ∉ *K*, *U* ∧ *V* ∈ *K* iff *U* ∈ *K* or *V* ∈ *K*, and ⋁*S* ∈ *K* iff each *V* ∈ *S* is in *K*. Any subset *K* ⊂ O(*X*) satisfying these properties in turn defines a point *p* of *X* whose kernel is *K*. Then *P* = ⋁*K* is a prime element of O(*X*), and conversely, *K* (and hence *p*) may be recovered from *P* as its downset *K* = ↓*P*.

The given topology on the set of prime elements is a rewriting of (C.175). □
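The correspondence of Lemma C.85 can be made explicit for a finite frame. The sketch below assumes the five-element topology on {0, 1, 2} listed in the code (our example); it finds the prime elements, builds the map *p*<sup>−1</sup> from each prime *P* via (C.195), checks that it is a frame map, and recovers *P* as the join of the kernel as in (C.194).

```python
# Assumed example: a topology on X = {0, 1, 2}, viewed as a frame.
opens = [frozenset(s) for s in [(), (0,), (0, 1), (0, 2), (0, 1, 2)]]
top = frozenset({0, 1, 2})
bottom = frozenset()

def join(family):
    """Supremum: the union of a family of opens."""
    out = frozenset()
    for u in family:
        out |= u
    return out

def is_prime(p):
    """P != top, and U ∧ V <= P implies U <= P or V <= P."""
    return p != top and all(
        u <= p or v <= p
        for u in opens for v in opens if (u & v) <= p
    )

def point_of(p):
    """The map p^{-1} of (C.195): it takes the value 0 exactly on the downset of P."""
    return {u: 0 if u <= p else 1 for u in opens}

def is_frame_map(f):
    """Preservation of top, bottom, binary meets and joins (finite frame)."""
    return (
        f[top] == 1 and f[bottom] == 0
        and all(
            f[u & v] == (f[u] & f[v]) and f[u | v] == (f[u] | f[v])
            for u in opens for v in opens
        )
    )

def kernel(f):
    """ker(p^{-1}): the opens sent to 0."""
    return [u for u in opens if f[u] == 0]

primes = [p for p in opens if is_prime(p)]
all_frame_maps = all(is_frame_map(point_of(p)) for p in primes)
recovered = [join(kernel(point_of(p))) for p in primes]  # (C.194)

print(len(primes), all_frame_maps, recovered == primes)
```

In this example the primes are the complements of the closures of the three points, in line with the description of meet-irreducible opens given earlier in this section.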

The prime elements of *H*(*A*), where *A* is a commutative C\*-algebra, are the *prime ideals* in *A*, i.e., the proper ideals *J* ⊂ *A* such that *J*<sub>1</sub>*J*<sub>2</sub> ⊂ *J* iff *J*<sub>1</sub> ⊆ *J* or *J*<sub>2</sub> ⊆ *J*, for any ideals *J*<sub>1</sub>, *J*<sub>2</sub> of *A* (closed by definition, like *J*); note that *J*<sub>1</sub>*J*<sub>2</sub> = *J*<sub>1</sub> ∩ *J*<sub>2</sub>.

Theorem C.86. *1. The frame H*(*A*) *of hereditary subalgebras of a commutative C\*-algebra A is spatial, with* Pt(*H*(*A*)) ≅ Σ(*A*) *as topological spaces.*


#### C.12 The structure of C\*-algebras

Having understood the structure of commutative C\*-algebras, we now turn to the general case. We already know that the algebra *B*(*H*) of all bounded operators on some Hilbert space *H* is a C\*-algebra in the obvious way (i.e., the algebraic operations are the natural ones, the involution is the operator adjoint *a* → *a*∗, and the norm is the operator norm of Banach space theory). Moreover, each (operator) norm-closed ∗-algebra in *B*(*H*) is a C\*-algebra. Our goal is to prove the converse:

Theorem C.87. *Each C\*-algebra A is isomorphic to a norm-closed* ∗*-algebra in B*(*H*)*, for some Hilbert space H. Equivalently, for any C\*-algebra A there exist a Hilbert space H and an injective homomorphism* π : *A* → *B*(*H*)*.*

A homomorphism π : *A* → *B*(*H*) is called a *representation* of *A* on *H*. The equivalence between the two statements in the theorem follows from Theorem C.62.

Let us note that Theorems C.8 and C.87 harmonize as follows: any measure μ on *X* satisfying μ(*U*) > 0 for each open *U* ⊂ *X* leads to an injective representation of *C*0(*X*) on *L*<sup>2</sup>(*X*,μ) by multiplication operators, that is, π(*f*) = *m*<sub>*f*</sub>, cf. (B.238).

The proof of Theorem C.87 uses the elegant GNS*-construction*, named after Gelfand, Naimark, and Segal, which is important in its own right. We initially assume that *A* is unital. First, we call a representation π *cyclic* if its carrier space *H* contains a *cyclic vector* Ω for π, i.e., the closure of π(*A*)Ω coincides with *H*.

Theorem C.88. *Let* ω *be a state on a C\*-algebra A. There exists a cyclic representation* πω *of A on a Hilbert space H*<sup>ω</sup> *with cyclic unit vector* Ωω *such that*

$$
\omega(a) = \langle \Omega\_{\omega}, \pi\_{\omega}(a) \Omega\_{\omega} \rangle, \ a \in A. \tag{C.196}
$$

*Proof.* We first give the proof in the special case that *A* has a unit 1<sub>*A*</sub>, and ω(*a*∗*a*) > 0 for all *a* ≠ 0. Define a sesquilinear form (−,−) on *A* by

$$(a,b) = \omega(a^\*b). \tag{C.197}$$

This form is positive definite by definition of a state, so that we may complete *A* in the ensuing norm

$$\|a\|\_{\omega} = \sqrt{\omega(a^\*a)},\tag{C.198}$$

to a Hilbert space called *H*ω. For each *a* ∈ *A*, we then define a map

$$
\pi\_{\omega}(a) : A \to A;\tag{C.199}
$$

$$
\pi\_{\omega}(a)b = ab. \tag{C.200}
$$

Regarding *A* as a dense subspace of *H*ω, this defines an operator πω(*a*) on a dense domain in *H*ω. This operator is bounded, since (C.94) implies

$$\|\pi\_{\omega}(a)\| \le \|a\|. \tag{C.201}$$

Hence πω(*a*) may be extended from *A* to *H*<sup>ω</sup> by continuity, and we obtain a map πω : *A* → *B*(*H*ω). Simple computations show that πω is a representation. The special vector Ωω is the unit 1*<sup>A</sup>* ∈ *A*, seen as an element of *H*ω: its cyclicity is obvious, and:

$$\|\Omega\_{\omega}\|^2 = \langle\Omega\_{\omega}, \Omega\_{\omega}\rangle = \omega(1\_A^\*1\_A) = \omega(1\_A) = 1;\tag{C.202}$$

$$
\langle \Omega\_{\omega}, \pi\_{\omega}(a)\Omega\_{\omega} \rangle = \omega(1\_A^\* a 1\_A) = \omega(a). \tag{C.203}
$$

Under our standing assumption ω(*a*∗*a*) > 0 if *a* ≠ 0, this not only proves Theorem C.88, but also Theorem C.87: for πω(*a*) = 0 implies ‖πω(*a*)Ωω‖<sup>2</sup> = 0, whose left-hand side is precisely (Ωω,πω(*a*∗*a*)Ωω) = ω(*a*∗*a*). Thus πω is faithful.
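The faithful case of the construction can be followed step by step in a finite-dimensional example. The sketch below assumes *A* = *M*<sub>2</sub>(C) with the state ω(*a*) = Tr(ρ*a*) for ρ = diag(3/4, 1/4); this state, the sample matrices and all helper names are our own choices. It checks (C.202), (C.203) and one instance of the bound behind (C.201).

```python
import math

# Sketch: GNS data for A = M_2(C) with an assumed faithful state.

def mul(a, b):
    """Matrix product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def adj(a):
    """Conjugate transpose a -> a*."""
    return tuple(
        tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2)
    )

def omega(a):
    """Assumed state: omega(a) = Tr(rho a) with rho = diag(3/4, 1/4)."""
    return 0.75 * a[0][0] + 0.25 * a[1][1]

def gns_inner(a, b):
    """The form (C.197): (a, b) = omega(a* b)."""
    return omega(mul(adj(a), b))

def pi(a):
    """pi_omega(a) acts on A by left multiplication, cf. (C.200)."""
    return lambda b: mul(a, b)

def op_norm(a):
    """Operator norm of a 2x2 matrix: sqrt of the largest eigenvalue of a* a."""
    h = mul(adj(a), a)
    t = (h[0][0] + h[1][1]).real
    d = (h[0][0] * h[1][1] - h[0][1] * h[1][0]).real
    return math.sqrt((t + math.sqrt(max(t * t - 4 * d, 0.0))) / 2)

one = ((1, 0), (0, 1))          # the unit of A, playing the role of Omega
a = ((1, 2j), (3, -1))
b = ((0, 1), (1, 1))

state_value = omega(a)                          # omega(a)
vector_value = gns_inner(one, pi(a)(one))       # <Omega, pi(a) Omega>
omega_norm_sq = gns_inner(one, one)             # ||Omega||^2

# One instance of ||pi(a) b||_omega <= ||a|| ||b||_omega.
lhs = math.sqrt(gns_inner(pi(a)(b), pi(a)(b)).real)
rhs = op_norm(a) * math.sqrt(gns_inner(b, b).real)

print(abs(state_value - vector_value) < 1e-12, lhs <= rhs)
```

Because *A* is finite-dimensional, no completion is needed: *H*ω is *A* itself with the inner product (C.197).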

In general, a C\*-algebra may lack such states, and we must adapt the proof of both theorems. The GNS-construction is easy: for an arbitrary state ω, we introduce

$$N\_{\omega} = \{ a \in A \mid \omega(a^\*a) = 0 \}. \tag{C.204}$$

If *a*<sub>ω</sub> is the image of *a* ∈ *A* in *A*/*N*<sub>ω</sub>, we may define an inner product on the latter by

$$
\langle a\_{\omega}, b\_{\omega} \rangle = \omega(a^\*b);\tag{C.205}
$$

this is well defined and positive definite, and we define the Hilbert space *H*<sup>ω</sup> as the completion of *A*/*N*<sup>ω</sup> in this inner product. Furthermore, we define

$$
\pi\_{\omega}(a) : A/N\_{\omega} \to H\_{\omega};\tag{C.206}
$$

$$
\pi\_{\omega}(a)b\_{\omega} = (ab)\_{\omega};\tag{C.207}
$$

this is well defined, because *N*<sub>ω</sub> is a left ideal in *A* by (C.94). Finally, we define

$$
\Omega\_{\omega} = (1\_A)\_{\omega}.\tag{C.208}
$$

The proof that everything works is then a simple exercise. Another way to look at the cyclic vector Ωω is to let ω define a linear functional ω˜ : *A*/*N*<sup>ω</sup> → C by

$$\tilde{\omega}(a\_{\omega}) = \omega(a);\tag{C.209}$$

this functional is continuous on *A*/*N*<sub>ω</sub> ⊂ *H*<sub>ω</sub>, because |ω(*a*)|<sup>2</sup> ≤ ω(*a*∗*a*) = ‖*a*<sub>ω</sub>‖<sup>2</sup><sub>*H*ω</sub>, as follows from the Cauchy–Schwarz inequality for the positive semidefinite form (C.197). Hence by Riesz–Fréchet there is an implementing vector Ωω such that

$$
\omega(a) = \langle \Omega\_{\omega}, a\_{\omega} \rangle. \tag{C.210}
$$

Finally, when *A* has no unit, in defining Ωω we either use the GNS-construction for the unitization *A*˙ and restrict πω˙(*A*˙) to *A* to define πω(*A*), or use (C.210). □
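The role of the null space (C.204) becomes visible for a state that is not faithful. The sketch below assumes *A* = *M*<sub>2</sub>(C) with the vector state ω(*a*) = *a*<sub>11</sub> (our choice of example); then *N*<sub>ω</sub> consists of the matrices with vanishing first column, and the class *a*<sub>ω</sub> may be identified with the first column of *a*, an identification that is our assumption for this illustration.

```python
# Sketch: the null space N_omega for an assumed non-faithful state on M_2(C).

def mul(a, b):
    """Matrix product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def adj(a):
    """Conjugate transpose a -> a*."""
    return tuple(
        tuple(complex(a[j][i]).conjugate() for j in range(2)) for i in range(2)
    )

def omega(a):
    """Assumed state: omega(a) = a_11."""
    return a[0][0]

def in_null_space(a):
    """Membership in N_omega, cf. (C.204)."""
    return omega(mul(adj(a), a)) == 0

def gns_class(a):
    """a_omega, identified with the first column of a (assumption of this sketch)."""
    return (a[0][0], a[1][0])

def column_inner(x, y):
    """Standard inner product on C^2, conjugate-linear in the first entry."""
    return sum(complex(s).conjugate() * t for s, t in zip(x, y))

n = ((0, 1), (0, 2j))        # first column zero, so n lies in N_omega
a = ((1, 2), (3, 4))
b = ((1j, 0), (2, 5))
one = ((1, 0), (0, 1))

left_ideal = in_null_space(mul(a, n))         # a n stays in N_omega
right_ideal = in_null_space(mul(n, a))        # n a need not stay in N_omega

# (C.205) agrees with the inner product of first columns.
inner_matches = omega(mul(adj(a), b)) == column_inner(gns_class(a), gns_class(b))

# (C.210) with Omega = (1_A)_omega.
implements_state = column_inner(gns_class(one), gns_class(a)) == omega(a)

print(left_ideal, right_ideal, inner_matches, implements_state)
```

In this example *H*ω is two-dimensional and πω is the defining representation of *M*<sub>2</sub>(C) on C<sup>2</sup>, with Ωω the first basis vector.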

One of the nicest features of the GNS-construction is the link between purity of the state ω and irreducibility of the corresponding representation πω.

Definition C.89. *We call a representation* π *of a C\*-algebra A on a Hilbert space H* irreducible *if the only closed subspaces K of H that are stable under* π(*A*) *(in the sense that if* ψ ∈ *K, then* π(*a*)ψ ∈ *K for all a* ∈ *A) are either K* = *H or K* = {0}*.*

Theorem C.90. *Each of the following conditions is equivalent to irreducibility:*


*Furthermore, if* ω *is a state on A, then* ω *is pure iff the corresponding* GNS*-representation* πω *is irreducible.*

*Proof.* If π(*A*)′ ≠ C · 1, then π(*A*)′ must contain a nontrivial self-adjoint element *a* (as it is a ∗-algebra), and hence also a nontrivial projection *e* (as the spectral projections *e*<sub>Δ</sub> = 1<sub>Δ</sub>(*a*) of *a*, defined as in Theorem B.102, lie in π(*A*)′, too). But if *e* ∈ π(*A*)′, then *eH* is stable under π(*A*), and hence π cannot be irreducible. Thus irreducibility implies 1. Conversely, if π(*A*)′ = C · 1, then π must be irreducible by the same argument, since if not, the projection onto some proper stable subspace *K* for π would be a nontrivial element of π(*A*)′. The equivalence 1 ↔ 2 is clear, since (C · 1)′ = *B*(*H*). Similarly, if ϕ ∈ *H* would fail to be cyclic for π, then π(*A*)ϕ<sup>−</sup> would be a proper, π(*A*)-stable subspace of *H*, so that irreducibility implies 3. The converse is trivial, since if a proper subspace *K* ⊂ *H* were stable for π(*A*), then 3 cannot hold. □

Another useful result relates general representations to GNS-representations. We call two representations π*<sup>i</sup>* : *A* → *B*(*Hi*), *i* = 1,2, *unitarily equivalent* if there is a unitary *u* : *H*<sup>1</sup> → *H*<sup>2</sup> such that *u*π1(*a*)*u*<sup>∗</sup> = π2(*a*) (or *u*π1(*a*) = π2(*a*)*u*) for each *a* ∈ *A*.

Proposition C.91. *Let* π : *A* → *B*(*H*) *be a cyclic representation of A. If* ψ ∈ *H is a cyclic unit vector for* π*, then*

$$\omega(a) = \langle \psi, \pi(a)\psi \rangle \tag{C.211}$$

*is a state on A, whose* GNS*-representation* πω *is unitarily equivalent to* π*.*

*Proof.* Define *u* : *H*<sub>ω</sub> → *H* first on π<sub>ω</sub>(*A*)Ω<sub>ω</sub> (which is a dense subspace of *H*<sub>ω</sub>) by

$$
u \pi\_{\omega}(a) \Omega\_{\omega} = \pi(a) \psi. \tag{C.212}$$

Using (C.211) and (C.196), we then obtain

$$\|\pi\_{\omega}(a)\Omega\_{\omega}\|^2 = \omega(a^\*a) = \langle \psi, \pi(a^\*a)\psi \rangle = \|\pi(a)\psi\|^2. \tag{C.213}$$

This shows that *u* is well defined as well as isometric, so that it extends to *H*<sup>ω</sup> by continuity. Its image is then the closure of π(*A*)ψ, which is *H*, since ψ is cyclic by assumption. Thus *u* is surjective and hence unitary. Finally, we compute

$$
u \pi\_{\omega}(a) \pi\_{\omega}(b) \Omega\_{\omega} = \pi(a) \pi(b) \psi = \pi(a) u \pi\_{\omega}(b) \Omega\_{\omega},\tag{C.214}
$$

so that *u*π<sub>ω</sub>(*a*) = π(*a*)*u* on the dense space π<sub>ω</sub>(*A*)Ω<sub>ω</sub>, and thence everywhere. □


We now take up the proof of Theorem C.87, preceded by some general remarks on direct sums of Hilbert spaces and representations. First, if (*H*1,...,*Hn*) is a finite family of Hilbert spaces, one may form the *direct sum H* = *H*<sup>1</sup> ⊕···⊕*Hn*, initially merely as a vector space, and subsequently also as a space with inner product

$$
\langle (\varphi\_1, \dots, \varphi\_n), (\psi\_1, \dots, \psi\_n) \rangle = \sum\_{i=1}^n \langle \varphi\_i, \psi\_i \rangle. \tag{C.215}
$$

It is easy to see that *H* is complete in the ensuing norm

$$\|(\psi\_1, \dots, \psi\_n)\|^2 = \sum\_{i=1}^n \|\psi\_i\|^2. \tag{C.216}$$

Some authors write ψ<sub>1</sub> ⊕ ··· ⊕ ψ<sub>*n*</sub>, ψ<sub>1</sub> ∔ ··· ∔ ψ<sub>*n*</sub>, or ψ<sub>1</sub> + ··· + ψ<sub>*n*</sub> for (ψ<sub>1</sub>,...,ψ<sub>*n*</sub>).

Moreover, if (π<sub>*i*</sub>) is a family of representations π<sub>*i*</sub> : *A* → *B*(*H<sub>i</sub>*), then one obtains a new representation ⊕<sub>*i*</sub> π<sub>*i*</sub> of *A*, called the *direct sum* of the π<sub>*i*</sub>, by

$$\bigoplus\_{i} \pi\_{i}(a)(\psi\_{1}, \dots, \psi\_{n}) = (\pi\_{1}(a)\psi\_{1}, \dots, \pi\_{n}(a)\psi\_{n}).\tag{C.217}$$

This construction works for arbitrary families of Hilbert spaces (*H<sub>x</sub>*) and representations (π<sub>*x*</sub>), where *x* ∈ *X* for some index set *X*. First, the elements of *H* = ⊕<sub>*x*</sub> *H<sub>x</sub>* are families (ψ) ≡ (ψ<sub>*x*</sub>)<sub>*x*∈*X*</sub>, where ψ<sub>*x*</sub> ∈ *H<sub>x</sub>*, such that

$$\|(\psi)\|^2 = \sup\_{F \subset X} \sum\_{x \in F} \|\psi\_{x}\|^2\_{H\_x} < \infty,\tag{C.218}$$

where the supremum is over all finite subsets *F* of *X*, so that the sum is defined as in (B.11). In that case, the obvious linear operations (i.e., ((ψ)+(ϕ))<sub>*x*</sub> = ψ<sub>*x*</sub> + ϕ<sub>*x*</sub> and (λ(ψ))<sub>*x*</sub> = λ · ψ<sub>*x*</sub>) are defined within *H*, since for each pair (ϕ),(ψ) ∈ *H* we have, from the triangle inequality for the norm in each finite direct sum *H<sub>F</sub>* = ⊕<sub>*x*∈*F*</sub> *H<sub>x</sub>*,

$$\left(\sum\_{x\in F} \|\psi\_{x} + \varphi\_{x}\|\_{H\_{x}}^2\right)^{1/2} \le \left(\sum\_{x\in F} \|\psi\_{x}\|^2\right)^{1/2} + \left(\sum\_{x\in F} \|\varphi\_{x}\|^2\right)^{1/2} \le \|(\psi)\| + \|(\varphi)\|.$$

The supremum over *F* of the left-hand side gives ‖(ψ)+(ϕ)‖, which is therefore finite and satisfies the triangle inequality for the norm. Similarly, the natural inner product in *H* is well defined, this time by the full Definition B.6, with *V* = C and *f*(*x*) = ⟨ϕ<sub>*x*</sub>,ψ<sub>*x*</sub>⟩<sub>*H<sub>x</sub>*</sub>, i.e.,

$$
\langle (\varphi), (\psi) \rangle = \sum\_{x \in X} \langle \varphi\_{x}, \psi\_{x} \rangle\_{H\_{x}}.\tag{C.219}
$$

To see this, we apply Cauchy–Schwarz first in each *H<sub>x</sub>* and then in ℓ<sup>2</sup>(*X*) to obtain

$$|\langle (\varphi), (\psi) \rangle| \le \sum\_{x \in X} |\langle \varphi\_{x}, \psi\_{x} \rangle\_{H\_{x}}| \le \sum\_{x \in X} \|\varphi\_{x}\| \|\psi\_{x}\| \le \|(\varphi)\| \|(\psi)\| < \infty. \tag{C.220}$$

Finally, the proof that the direct sum Hilbert space ⊕<sub>*x*</sub> *H<sub>x</sub>* is complete in the norm (C.218) is similar to the case where *H<sub>x</sub>* = C for each *x*, i.e., *H* = ℓ<sup>2</sup>(*X*), cf. Theorem B.9. Let (ψ)<sub>*n*</sub> be a Cauchy sequence in *H*, consisting of sequences (ψ<sub>*x*</sub>)<sub>*n*</sub> ≡ ψ<sub>*x*</sub><sup>(*n*)</sup> in each *H<sub>x</sub>*. For each finite *F* ⊂ *X* and ε > 0, we must have ∑<sub>*x*∈*F*</sub> ‖ψ<sub>*x*</sub><sup>(*n*)</sup> − ψ<sub>*x*</sub><sup>(*m*)</sup>‖ < ε for sufficiently large *n*,*m*, so that each (ψ<sub>*x*</sub>)<sub>*n*</sub> must be Cauchy in *H<sub>x</sub>*, with limit ψ<sub>*x*</sub>. The ensuing family (ψ) of vectors lies in *H* by the argument following (B.19), and the given Cauchy sequence (ψ)<sub>*n*</sub> converges to (ψ), again by the same proof as for ℓ<sup>2</sup>(*X*).

If one has a family (π<sub>*x*</sub>) of representations π<sub>*x*</sub> : *A* → *B*(*H<sub>x</sub>*), their direct sum π = ⊕<sub>*x*</sub> π<sub>*x*</sub>, defined by (π(*a*)(ψ))<sub>*x*</sub> = π<sub>*x*</sub>(*a*)ψ<sub>*x*</sub>, is a representation of *A* on *H*. Indeed, one has ‖π(*a*)‖ = sup<sub>*x*</sub>{‖π<sub>*x*</sub>(*a*)‖}, and since we have ‖π<sub>*x*</sub>(*a*)‖ ≤ ‖*a*‖ for each *x*, we also have ‖π(*a*)‖ ≤ ‖*a*‖, so that π(*a*) ∈ *B*(*H*), and hence π maps *A* into *B*(*H*).

Our first use of such direct sums shows that cyclic representations are the building blocks of any representation π, at least if we require π to be *nondegenerate* in the sense that for any ψ ∈ *H*, if π(*a*)ψ = 0 for all *a* ∈ *A*, then ψ = 0.

Proposition C.92. *Any nondegenerate representation* π : *A* → *B*(*H*) *of a C\*-algebra A on a Hilbert space H is a direct sum of cyclic representations of A.*

*Proof.* Consider families (ψ<sub>*x*</sub>)<sub>*x*∈*X*</sub> of nonzero vectors in *H* with the property that

$$
\langle \pi(a)\psi\_{x}, \pi(b)\psi\_{x'}\rangle = 0,\tag{C.221}
$$

for all *a*,*b* ∈ *A* and all *x* ≠ *x*′. Such families are partially ordered by inclusion, and an easy application of Zorn's Lemma shows that there is a maximal such family. For this family (ψ<sub>*x*</sub>)<sub>*x*∈*X*</sub>, we define *H<sub>x</sub>* as the closure of π(*A*)ψ<sub>*x*</sub> in *H*. Since π is a homomorphism, each *H<sub>x</sub>* is stable under π(*A*), and hence the restriction π<sub>*x*</sub>(*a*) of π(*a*) to *H<sub>x</sub>* defines a representation of *A*, which is cyclic by construction. By maximality of the family (and nondegeneracy of π), it follows that *H* = ⊕<sub>*x*</sub> *H<sub>x</sub>* and π = ⊕<sub>*x*</sub> π<sub>*x*</sub>, and so the claim has been proved. □

Our second use is the proof of Theorem C.87, where we have to solve the problem of the possible lack of injectivity of πω in our previous preliminary proof.

*Proof.* To do so, we replace *H*<sub>ω</sub> by the *c*razy Hilbert space *H<sub>c</sub>* = ⊕<sub>ω∈*P*(*A*)</sub> *H*<sub>ω</sub>, where *P*(*A*) is the pure state space of *A*. The Hilbert space *H<sub>c</sub>* carries a representation π = ⊕<sub>ω∈*P*(*A*)</sub> π<sub>ω</sub>. The point is that if π(*a*) = 0, then π<sub>ω</sub>(*a*<sup>∗</sup>*a*)Ω<sub>ω</sub> = 0 for each ω ∈ *P*(*A*), which by (C.196) implies ω(*a*<sup>∗</sup>*a*) = 0. Proposition C.15 then gives σ(*a*<sup>∗</sup>*a*) = {0}, from which the spectral radius formula (C.55) gives ‖*a*<sup>∗</sup>*a*‖ = ‖*a*‖<sup>2</sup> = 0, and hence *a* = 0. It follows that π is injective, and Theorem C.87 is proved. □

It should be noted that this proof relies on a *shock and awe* kind of overkill (though nothing compared to the even crazier space *H<sub>ec</sub>* = ⊕<sub>ω∈*S*(*A*)</sub> *H*<sub>ω</sub>, which is traditionally used in the above proof), in that *H<sub>c</sub>* is far larger than necessary (indeed, in all but the most trivial cases, *H<sub>c</sub>* is non-separable). For example, already for *A* = *M*<sub>2</sub>(C) we have *P*(*A*) ≅ *S*<sup>2</sup>, so that *H<sub>c</sub>* = ⊕<sub>ω∈*S*<sup>2</sup></sub> C<sup>2</sup>; this Hilbert space is non-separable, whereas *A* has an injective representation on C<sup>2</sup>. More generally, *B*<sub>0</sub>(*H*) or *B*(*H*) has an injective representation on *H* by definition, whereas *H<sub>c</sub>* is non-separable. In the commutative case, *A* = *C*<sub>0</sub>(*X*) yields the non-separable *H<sub>c</sub>* = ⊕<sub>*x*∈*X*</sub> C, although *A* has an injective representation on the (typically) separable space *L*<sup>2</sup>(*X*,μ).

As a nice illustration of the GNS-construction, let us treat this example in more detail (cf. §1.5 for the simple case where *X* is finite). If ω is some state on *C*<sub>0</sub>(*X*), then by Theorem B.24, there is a unique probability measure μ on *X* such that

$$\omega(f) = \int\_X d\mu \, f, \quad f \in C\_0(X),\tag{C.222}$$

cf. (B.39). It follows from (C.204) and (C.222) that

$$N\_{\omega} = \left\{ f \in C\_0(X) : \int\_X d\mu \, |f|^2 = 0 \right\}. \tag{C.223}$$

In particular, the support of μ is *X* iff *N*<sub>ω</sub> = {0}, in which case *A*/*N*<sub>ω</sub> = *C*<sub>0</sub>(*X*). In the opposite case where ω is a pure state, i.e., ω = ω<sub>*x*</sub> for some *x* ∈ *X*, with ω<sub>*x*</sub>(*f*) = *f*(*x*), one has *N*<sub>ω</sub> = { *f* ∈ *C*<sub>0</sub>(*X*) | *f*(*x*) = 0}, so that *A*/*N*<sub>ω</sub> ≅ C, under the map [*f*] ↦ *f*(*x*). In general, from (C.206) - (C.207) we obtain

$$H\_{\omega} = L^2(X, \mu);\tag{C.224}$$

$$
\pi\_{\omega}(f) = m\_f;\tag{C.225}
$$

$$
\Omega\_{\omega} = 1\_X,\tag{C.226}
$$

where *mf*ψ = *f*ψ, cf. (B.238). Analogously to (B.331), we then obtain

$$
\pi\_{\omega}(C\_0(X))'' = L^{\infty}(X, \mu).\tag{C.227}
$$

The state ω, initially defined on the commutative C\*-algebra *C*0(*X*), then has a normal extension to the commutative von Neumann algebra *L*∞(*X*,μ), cf. (C.222).
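For a finite set *X* the formulas (C.222) - (C.226) can be checked by hand or by machine. The following is a minimal sketch under stated assumptions: the sample space, the measure `mu` and the functions `f`, `g` are hypothetical illustrative data, and functions are taken real-valued so that no complex conjugation is needed in the inner product.

```python
from fractions import Fraction

# Hypothetical finite sample space and probability measure (illustrative data only).
X = [0, 1, 2]
mu = {0: Fraction(1, 2), 1: Fraction(1, 2), 2: Fraction(0)}

def omega(f):
    """State on C(X): integrate f against mu, as in (C.222)."""
    return sum(mu[x] * f[x] for x in X)

def inner(f, g):
    """Inner product of L^2(X, mu) for real-valued functions."""
    return sum(mu[x] * f[x] * g[x] for x in X)

def m(f, psi):
    """Multiplication operator m_f psi = f psi, as in (C.225)."""
    return {x: f[x] * psi[x] for x in X}

def in_null_space(f):
    """f lies in N_omega iff the integral of |f|^2 vanishes, as in (C.223)."""
    return inner(f, f) == 0

one = {x: Fraction(1) for x in X}                        # cyclic vector 1_X, as in (C.226)
f = {0: Fraction(3), 1: Fraction(-1), 2: Fraction(7)}
g = {0: Fraction(0), 1: Fraction(0), 2: Fraction(5)}     # vanishes on the support of mu

# The state is recovered as a vector state of the cyclic vector.
gns_value = inner(one, m(f, one))
```

Here `omega(f)` and `gns_value` agree, and `g` lies in *N*<sub>ω</sub> because it is supported where μ vanishes; such functions are exactly what the quotient *A*/*N*<sub>ω</sub> discards.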

More generally, if *A* is an arbitrary commutative C\*-algebra and ω is a state on *A*, then, writing Σ(*A*) for the Gelfand spectrum of *A* as usual, we have

$$H\_{\omega} \cong L^2(\Sigma(A), \mu);\tag{C.228}$$

$$
\pi\_{\omega}(f) \cong m\_{\hat{f}};\tag{C.229}
$$

$$\Omega\_{\omega} \cong 1\_{\Sigma(A)},\tag{C.230}$$

where *f̂* ∈ *C*<sub>0</sub>(Σ(*A*)) is the Gelfand transform of *f* ∈ *A*, and μ is the probability measure on Σ(*A*) defined by

$$\omega(f) = \int\_{\Sigma(A)} d\mu \,\hat{f}.\tag{C.231}$$

With this commutative case in mind, some authors would call a pair (*A*,ω), where *A* is a general C\*-algebra and ω is a state on *A*, or, alternatively, *A* is a general von Neumann algebra and ω is a normal state on *A*, a *non-commutative probability space*. As such, "ordinary" probability theory (at least, on locally compact Hausdorff sample spaces) is merely the commutative case of a much more general "non-commutative probability theory".

#### C.13 Tensor products of Hilbert spaces and C\*-algebras

If *H<sub>A</sub>* and *H<sub>B</sub>* are Hilbert spaces, their algebraic tensor product *H<sub>A</sub>* ⊗ *H<sub>B</sub>* typically fails to be a Hilbert space in the obvious way, since it is not complete (unless one of the factors is finite-dimensional). Similarly, the algebraic tensor product *A*⊗*B* of two C\*-algebras *A* and *B* usually fails to be a C\*-algebra. However, the second case is far more complicated than the first: for Hilbert spaces there is a canonical norm on the algebraic tensor product and hence a canonical completion of *H<sub>A</sub>* ⊗ *H<sub>B</sub>* into a Hilbert space *H<sub>A</sub>* ⊗̂ *H<sub>B</sub>*. For C\*-algebras, on the other hand, there is an *embarrassment of riches*, in that there are many norms turning the completion *A* ⊗̂ *B* of *A*⊗*B* in some such norm into a C\*-algebra. However, if *A* or *B* is *nuclear*, there is just one possibility; see below. For example, this applies if *A* or *B* is finite-dimensional.

Let us first review the (algebraic) tensor product of two vector spaces *A* and *B*.

Proposition C.93. *Let A and B be (complex) vector spaces. There is a vector space called A* ⊗ *B, in words the* algebraic tensor product *of A and B (over* C*), and a map p* : *A* × *B* → *A* ⊗ *B, such that for any vector space C and any bilinear map* β : *A*×*B* → *C, there is a unique linear map* β′ : *A*⊗*B* → *C such that* β = β′ ◦ *p.*

*In other words, the following diagram commutes:*

$$\begin{array}{ccc} A \times B & \xrightarrow{\;p\;} & A \otimes B \\ & {\scriptstyle \beta}\searrow & \Big\downarrow {\scriptstyle \beta'} \\ & & C \end{array} \tag{C.232}$$

This universal property also shows that *A*⊗*B* is unique up to isomorphism.

*Proof.* In preparation for an explicit construction of *A*⊗*B*, define the (complex) *free vector space* on any non-empty set *X* as *C<sub>c</sub>*(*X*), where *X* has the discrete topology (i.e., *C<sub>c</sub>*(*X*) consists of all functions *f* : *X* → C with finite support), and pointwise operations. For each *y* ∈ *X*, the delta-function δ<sub>*y*</sub> ∈ *C<sub>c</sub>*(*X*) is defined by δ<sub>*y*</sub>(*x*) = δ<sub>*xy*</sub>, so that each element *f* of *C<sub>c</sub>*(*X*) is a finite sum *f* = ∑<sub>*i*</sub> λ<sub>*i*</sub>δ<sub>*x<sub>i</sub>*</sub>, where λ<sub>*i*</sub> ∈ C and *x<sub>i</sub>* ∈ *X*.

If *A* and *B* are (complex) vector spaces, *A* ⊗ *B* is the quotient of the free vector space *C<sub>c</sub>*(*A*×*B*) on *X* = *A*×*B* by the equivalence relation generated by the relations:

$$
\delta\_{(a\_1+a\_2,b)} \sim \delta\_{(a\_1,b)} + \delta\_{(a\_2,b)}; \tag{C.233}
$$

$$
\delta\_{(a,b\_1+b\_2)} \sim \delta\_{(a,b\_1)} + \delta\_{(a,b\_2)};\tag{C.234}
$$

$$
\lambda \delta\_{(a,b)} \sim \delta\_{(\lambda a,b)}; \tag{C.235}
$$

$$
\lambda \delta\_{(a,b)} \sim \delta\_{(a,\lambda b)}.\tag{C.236}
$$

For *a* ∈ *A*,*b* ∈ *B*, the image of δ(*a*,*b*) in *A*⊗*B* is called *a*⊗*b*, so that by construction,

$$(a\_1 + a\_2) \otimes b = a\_1 \otimes b + a\_2 \otimes b;\tag{C.237}$$

$$a \otimes (b\_1 + b\_2) = a \otimes b\_1 + a \otimes b\_2;\tag{C.238}$$

$$
\lambda(a \otimes b) = (\lambda a) \otimes b = a \otimes (\lambda b). \tag{C.239}
$$

Elements of the algebraic tensor product *A* ⊗ *B* may therefore be written as finite sums *c* = ∑*<sup>i</sup> ai* ⊗*bi*, with *ai* ∈ *A*, *bi* ∈ *B*, subject to the above relations.
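In finite dimensions the relations (C.237) - (C.239) can be verified concretely by realizing *a* ⊗ *b* as the Kronecker product of coordinate vectors. The sketch below assumes this standard identification of C<sup>*m*</sup> ⊗ C<sup>*n*</sup> with C<sup>*mn*</sup>; the helper names `kron`, `add`, `scale` and the sample vectors are hypothetical and chosen only for illustration.

```python
def kron(a, b):
    """Coordinates of a (x) b in C^m (x) C^n = C^(mn): all products a_i * b_j."""
    return [x * y for x in a for y in b]

def add(u, v):
    """Componentwise sum of two coordinate vectors of equal length."""
    return [x + y for x, y in zip(u, v)]

def scale(lam, u):
    """Scalar multiple of a coordinate vector."""
    return [lam * x for x in u]

# Sample vectors with Gaussian-integer entries, so all arithmetic is exact.
a1, a2 = [1, 2j], [3, -1]
b = [2, 0, 1j]
lam = 2 - 1j

# (C.237): additivity in the first argument.
lhs_additive = kron(add(a1, a2), b)
rhs_additive = add(kron(a1, b), kron(a2, b))

# (C.239): a scalar may be moved freely between the two factors.
scalar_forms = (scale(lam, kron(a1, b)),
                kron(scale(lam, a1), b),
                kron(a1, scale(lam, b)))
```

Both comparisons come out equal, which is the coordinate version of the statement that the map (*a*,*b*) ↦ *a* ⊗ *b* is bilinear.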

Now consider some bilinear map β : *A*×*B* → *C*. We extend β to a map

$$
\tilde{\beta} : C\_c(A \times B) \to C; \tag{C.240}
$$

$$
\tilde{\beta}\left(\sum\_{i} \lambda\_i \delta\_{(a\_i, b\_i)}\right) = \sum\_{i} \lambda\_i \beta\left(a\_i, b\_i\right). \tag{C.241}
$$

Since β is bilinear, β̃ respects the above equivalence relation, so that it duly quotients to β′ : *A*⊗*B* → *C*, upon which the property β = β′ ◦ *p* holds by construction. Finally, since the image of *p* spans *A*⊗*B*, the latter property uniquely determines β′. □

Equivalently, *A* ⊗ *B* is the quotient of formal sums ∑<sub>*i*</sub>(*a<sub>i</sub>*,*b<sub>i</sub>*) by the subspace consisting of those sums for which ∑<sub>*i*</sub> ω<sub>*A*</sub>(*a<sub>i</sub>*)ω<sub>*B*</sub>(*b<sub>i</sub>*) = 0 for all ω<sub>*A*</sub> ∈ *A*<sup>∗</sup> and ω<sub>*B*</sub> ∈ *B*<sup>∗</sup>. Similarly, it is useful to regard *A* ⊗ *B* as a subspace of the vector space *L*(*A*<sup>∗</sup>,*B*) of linear maps from the dual *A*<sup>∗</sup> to *B* through the map

$$\sum\_{i} a\_{i} \otimes b\_{i} : \omega\_{A} \mapsto \sum\_{i} \omega\_{A}(a\_{i}) b\_{i} \quad (\omega\_{A} \in A^{\*});\tag{C.242}$$

this map is injective by Corollary B.45.2, since we may assume the *b<sub>i</sub>* to be linearly independent. Using the canonical embedding *B* → *B*<sup>∗∗</sup> of Proposition B.44, this in turn yields an injection *A*⊗*B* → *L*(*A*<sup>∗</sup> × *B*<sup>∗</sup>,C), i.e., the space of bilinear maps from *A*<sup>∗</sup> × *B*<sup>∗</sup> to C, given on arguments (ω<sub>*A*</sub>,ω<sub>*B*</sub>) by

$$\sum\_{i} a\_{i} \otimes b\_{i} : (\omega\_{A}, \omega\_{B}) \mapsto \sum\_{i} \omega\_{A}(a\_{i}) \omega\_{B}(b\_{i}).\tag{C.243}$$

Proposition C.93 turns this into an injection *A*⊗*B* → *L*(*A*<sup>∗</sup> ⊗ *B*<sup>∗</sup>,C), given by

$$\sum\_{i} a\_{i} \otimes b\_{i} : \sum\_{j} (\omega\_{A})\_{j} \otimes (\omega\_{B})\_{j} \mapsto \sum\_{i,j} (\omega\_{A})\_{j} (a\_{i}) (\omega\_{B})\_{j} (b\_{i}).\tag{C.244}$$

If *A* and *B* are Hilbert spaces, we call them *HA* and *HB*, denote their elements by α and β, respectively, and attempt to define a sesquilinear form on *HA* ⊗*HB* by

$$
\langle \sum\_{j} \alpha\_j' \otimes \beta\_j', \sum\_{i} \alpha\_i \otimes \beta\_i \rangle = \sum\_{i,j} \langle \alpha\_j', \alpha\_i \rangle\_A \langle \beta\_j', \beta\_i \rangle\_B. \tag{C.245}
$$

It is a non-trivial fact that this form is well defined, because representations ∑<sub>*i*</sub> α<sub>*i*</sub> ⊗ β<sub>*i*</sub> of vectors in *H<sub>A</sub>* ⊗ *H<sub>B</sub>* may not be unique. For example, if *H<sub>A</sub>* = *H<sub>B</sub>* = *H* = C<sup>*n*</sup>, and (α<sub>*i*</sub>) and (α′<sub>*i*</sub>) are two bases of *H* related by a real orthogonal matrix, then ∑<sub>*i*</sub> α<sub>*i*</sub> ⊗ α<sub>*i*</sub> = ∑<sub>*i*</sub> α′<sub>*i*</sub> ⊗ α′<sub>*i*</sub> (to see this, take inner products with an arbitrary elementary tensor ψ ⊗ ϕ, yielding the same result).

To resolve this, we note that the injection *H<sub>A</sub>* ⊗ *H<sub>B</sub>* → *L*(*H*<sup>∗</sup><sub>*A*</sub> × *H*<sup>∗</sup><sub>*B*</sub>,C) just discussed combines with the isomorphism *H*<sup>∗</sup> ≅ *H̄* of Theorem B.66 to an injection *H<sub>A</sub>* ⊗ *H<sub>B</sub>* → *L*(*H̄<sub>A</sub>* × *H̄<sub>B</sub>*,C), i.e., the space of *bi-anti-linear* maps from *H<sub>A</sub>* × *H<sub>B</sub>* to C. Proposition C.93 turns this into an injection *H<sub>A</sub>* ⊗ *H<sub>B</sub>* → *L*(*H̄<sub>A</sub>* ⊗ *H̄<sub>B</sub>*,C), viz.


$$
\sum\_{i} \alpha\_{i} \otimes \beta\_{i} : \sum\_{j} \alpha'\_{j} \otimes \beta'\_{j} \mapsto \sum\_{i,j} \langle \alpha'\_{j}, \alpha\_{i} \rangle\_{H\_{A}} \langle \beta'\_{j}, \beta\_{i} \rangle\_{H\_{B}}.\tag{C.246}
$$

Consequently, if ∑<sub>*i*</sub> α<sub>*i*</sub> ⊗ β<sub>*i*</sub> = 0, then the right-hand side of (C.245) is zero, too, since it is the image of ∑<sub>*j*</sub> α′<sub>*j*</sub> ⊗ β′<sub>*j*</sub> under the zero map. Hence (C.245) is independent of the choice of representatives in the sum ∑<sub>*i*</sub> α<sub>*i*</sub> ⊗ β<sub>*i*</sub>, and by hermiticity of the form, this equally well applies to the other entry ∑<sub>*j*</sub> α′<sub>*j*</sub> ⊗ β′<sub>*j*</sub>.

It remains to show that (C.245) is an inner product, i.e., that it is positive definite. To see this, for some given vector ∑<sub>*i*</sub> α<sub>*i*</sub> ⊗ β<sub>*i*</sub> in *H<sub>A</sub>* ⊗ *H<sub>B</sub>* one may take the linear span *H*′<sub>*A*</sub> of all α<sub>*i*</sub> in *H<sub>A</sub>*, which is a Hilbert space (being finite-dimensional), and pick an orthonormal basis (υ<sub>*k*</sub>) in *H*′<sub>*A*</sub>. Expanding each α<sub>*i*</sub> in this basis and absorbing the scalars in the β<sub>*i*</sub>, we may therefore write ∑<sub>*i*</sub> α<sub>*i*</sub> ⊗ β<sub>*i*</sub> = ∑<sub>*k*</sub> υ<sub>*k*</sub> ⊗ β″<sub>*k*</sub>, so that

$$\langle \sum\_{i} \alpha\_{i} \otimes \beta\_{i}, \sum\_{i} \alpha\_{i} \otimes \beta\_{i} \rangle = \sum\_{k,l} \langle \upsilon\_{k} \otimes \beta\_{k}'', \upsilon\_{l} \otimes \beta\_{l}'' \rangle = \sum\_{k} \|\beta\_{k}''\|\_{B}^{2} \ge 0,\tag{C.247}$$

with equality at the end iff each β″<sub>*k*</sub> = 0, and hence ∑<sub>*i*</sub> α<sub>*i*</sub> ⊗ β<sub>*i*</sub> = 0.
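The independence of (C.245) from the chosen representation can be tested on a small case. The sketch below is an illustration under stated assumptions: it works in R<sup>2</sup> ⊂ C<sup>2</sup> with exact rational arithmetic (so no conjugation is needed), it uses two orthonormal bases related by a rotation with entries 3/5 and 4/5, and the names `form`, `as_vector` and `probe` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction as F

def inner(u, v):
    """Standard inner product; all entries are real here, so no conjugation."""
    return sum(x * y for x, y in zip(u, v))

def kron(a, b):
    """Coordinates of a (x) b in the Kronecker basis."""
    return [x * y for x in a for y in b]

def form(left, right):
    """Right-hand side of (C.245); arguments are lists of pairs (alpha, beta)."""
    return sum(inner(al, a) * inner(bl, b) for (al, bl) in left for (a, b) in right)

def as_vector(rep):
    """Coordinates of sum_i alpha_i (x) beta_i, to compare two representations."""
    total = [0] * (len(rep[0][0]) * len(rep[0][1]))
    for a, b in rep:
        total = [t + k for t, k in zip(total, kron(a, b))]
    return total

e1, e2 = [F(1), F(0)], [F(0), F(1)]
f1, f2 = [F(3, 5), F(4, 5)], [F(-4, 5), F(3, 5)]   # rotated orthonormal basis

rep_e = [(e1, e1), (e2, e2)]                        # e1 (x) e1 + e2 (x) e2
rep_f = [(f1, f1), (f2, f2)]                        # f1 (x) f1 + f2 (x) f2
probe = [([F(2), F(-1)], [F(1), F(3)])]             # an arbitrary elementary tensor

same_vector = as_vector(rep_e) == as_vector(rep_f)
values = (form(probe, rep_e), form(probe, rep_f))
norm_sq = form(rep_e, rep_e)
```

The two representations give the same coordinate vector, and the form takes the same value on both, as the general argument predicts. The squared norm `norm_sq` equals the number of basis vectors, in line with (C.247).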

Finally, we complete *H<sub>A</sub>* ⊗ *H<sub>B</sub>* in the norm defined by the inner product (C.245); with abuse of notation the ensuing Hilbert space is often just called *H<sub>A</sub>* ⊗ *H<sub>B</sub>*, but it would be more precise to denote it by *H<sub>A</sub>* ⊗̂ *H<sub>B</sub>*, as we will usually do.

It is easy to show that if (υ<sup>(*A*)</sup><sub>*i*</sub>) and (υ<sup>(*B*)</sup><sub>*j*</sub>) are bases for *H<sub>A</sub>* and *H<sub>B</sub>*, respectively, then (υ<sup>(*A*)</sup><sub>*i*</sub> ⊗ υ<sup>(*B*)</sup><sub>*j*</sub>) is a basis of *H<sub>A</sub>* ⊗̂ *H<sub>B</sub>*. Also, if (*X*,Σ,μ) and (*X*′,Σ′,μ′) are σ-finite measure spaces with *X* and *X*′ well behaved (e.g., Polish), so that the *L*<sup>2</sup>-spaces are separable, one has a natural isomorphism

$$L^2(X, \Sigma, \mu) \hat{\otimes} L^2(X', \Sigma', \mu') \cong L^2(X \times X', \Sigma \times \Sigma', \mu \times \mu'),\tag{C.248}$$

obtained as the closure of the isometric (and hence bounded) map that sends the vector ∑<sub>*i*</sub> ψ<sub>*i*</sub> ⊗ ψ′<sub>*i*</sub> into the function (*x*, *x*′) ↦ ∑<sub>*i*</sub> ψ<sub>*i*</sub>(*x*)ψ′<sub>*i*</sub>(*x*′) on *X* × *X*′. Here Σ × Σ′ is the smallest σ-algebra on *X* × *X*′ that contains all sets *A* × *A*′, *A* ∈ Σ, *A*′ ∈ Σ′, and μ × μ′ is the familiar product measure defined on elementary measurable sets by

$$
\mu \times \mu'(A \times A') = \mu(A)\mu'(A'). \tag{C.249}
$$

We now turn to tensor products of C\*-algebras. If *A* and *B* are C\*-algebras, then the algebraic tensor product *A*⊗*B* of *A* and *B* (just seen as vector spaces) is endowed with a natural multiplication and involution, given by linear extension of

$$(a\_1 \otimes b\_1) \cdot (a\_2 \otimes b\_2) = (a\_1 a\_2) \otimes (b\_1 b\_2);\tag{C.250}$$

$$(a \otimes b)^{\*} = a^{\*} \otimes b^{\*},\tag{C.251}$$

respectively. Thus *A*⊗*B* is a <sup>∗</sup>-algebra, and Proposition C.93 specializes to:

Proposition C.94. *If C is a* <sup>∗</sup>*-algebra and if a bilinear map* β : *A*×*B* → *C satisfies*

$$
\beta(a\_1a\_2, b\_1b\_2) = \beta(a\_1, b\_1)\beta(a\_2, b\_2); \quad \beta(a^\*, b^\*) = \beta(a, b)^\*,\tag{C.252}
$$

*then* β *factors through A*⊗*B (now seen as a* <sup>∗</sup>*-algebra), as in* (C.232)*.*

The proof is similar. In order to turn *A*⊗*B* (seen as a <sup>∗</sup>-algebra) into a C\*-algebra, we need a *C\*-norm*, i.e., a norm on *A* ⊗ *B* satisfying the C\*-axioms (C.1) - (C.2). If such a norm ‖·‖ exists, we denote the completion of *A*⊗*B* in that particular norm by *A* ⊗̂ *B*, where typically ‖·‖ and hence ⊗̂ carry some label. This completion *A* ⊗̂ *B* is a C\*-algebra in the obvious way. There will be no shortage of such norms!

For example, suppose *A* ⊂ *B*(*HA*) and *B* ⊂ *B*(*HB*). For each *a* ∈ *A*, we form the operator *a*⊗1*<sup>B</sup>* on *HA* ⊗*HB* (where 1*<sup>B</sup>* is the unit of *B*(*HB*), which is also the unit of *B* if it has one). As in (C.247), we may assume that generic elements of *HA*⊗*HB* take the form ∑*<sup>k</sup>* υ*<sup>k</sup>* ⊗β*k*, with the υ*<sup>k</sup>* orthonormal in *HA* and β*<sup>k</sup>* ∈ *HB*. We then estimate

$$\begin{split} \left\|(a \otimes 1\_{B}) \left(\sum\_{k} \upsilon\_{k} \otimes \beta\_{k}\right)\right\|^{2} &= \left\|\sum\_{k} (a\upsilon\_{k}) \otimes \beta\_{k}\right\|^{2} \leq \sum\_{k} \|(a\upsilon\_{k}) \otimes \beta\_{k}\|^{2} \\ &\leq \|a\|^{2} \left\|\sum\_{k} \upsilon\_{k} \otimes \beta\_{k}\right\|^{2} .\end{split} \tag{C.253}$$

Hence *a*⊗1<sub>*B*</sub> is bounded on the pre-Hilbert space *H<sub>A</sub>* ⊗ *H<sub>B</sub>*, and extends to a bounded operator on *H<sub>A</sub>* ⊗̂ *H<sub>B</sub>* by continuity; this extension is usually called *a*⊗1<sub>*B*</sub>, too. Similarly, any *b* ∈ *B* defines a bounded operator 1<sub>*A*</sub>⊗*b* on *H<sub>A</sub>* ⊗̂ *H<sub>B</sub>*, and since

$$a \otimes b = (a \otimes 1\_B) \cdot (1\_A \otimes b), \tag{C.254}$$

all elements ∑<sub>*i*</sub> *a<sub>i</sub>* ⊗ *b<sub>i</sub>* of *A*⊗*B* extend to elements of *B*(*H<sub>A</sub>* ⊗̂ *H<sub>B</sub>*). Now define

$$\|\sum\_i a\_i \otimes b\_i\|\_{\min} = \|\sum\_i a\_i \otimes b\_i\|\_{B(H\_A \hat{\otimes} H\_B)}.\tag{C.255}$$

This is clearly a C\*-norm on *A*⊗*B*. Moreover, it is a *cross-norm*, in that

$$\|a \otimes b\|\_{\text{min}} = \|a\| \|b\|. \tag{C.256}$$
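For matrix algebras the cross-norm property (C.256) and the product rule (C.250) can be checked numerically. The following is a rough sketch, assuming real matrices acting on finite-dimensional spaces, with *a* ⊗ *b* realized as the Kronecker product; the operator norm is approximated by power iteration, which is a numerical device chosen here for self-containedness and is not part of the text's argument. All function names and the sample matrices are hypothetical.

```python
import math

def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]

def transpose(p):
    """Transpose (equal to the adjoint for real matrices)."""
    return [list(row) for row in zip(*p)]

def kron(a, b):
    """Matrix of a (x) b in the Kronecker basis."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def op_norm(c, steps=200):
    """Approximate operator norm: square root of the top eigenvalue of c^T c."""
    g = matmul(transpose(c), c)
    n = len(g)
    v = [1.0 / math.sqrt(n)] * n          # unit starting vector
    lam = 0.0
    for _ in range(steps):
        w = [sum(g[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = math.sqrt(sum(x * x for x in w))   # ||g v|| for a unit vector v
        v = [x / lam for x in w]
    return math.sqrt(lam)

a = [[1, 2], [0, 1]]
b = [[0, 1], [1, 1]]

norm_of_tensor = op_norm(kron(a, b))
product_of_norms = op_norm(a) * op_norm(b)
```

Up to rounding, `norm_of_tensor` and `product_of_norms` coincide. Power iteration needs a starting vector that overlaps the top eigenvector, which holds for the entrywise non-negative matrices used here; for general input a proper singular value routine would be the safer choice.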

This construction generalizes to any two C\*-algebras, since by Theorem C.87 we have injective representations π<sub>*A*</sub> : *A* → *B*(*H<sub>A</sub>*) and π<sub>*B*</sub> : *B* → *B*(*H<sub>B</sub>*) of *A* and *B*, respectively, and it is easy to verify that the norm ‖·‖<sub>min</sub> on *A* ⊗ *B* and ensuing completion *A* ⊗̂<sub>min</sub> *B* are independent of the chosen representation. Furthermore,

$$\|c\|\_{\min} = \sup\{ \|(\pi\_{A} \otimes \pi\_{B})(c)\|\_{B(H\_{A} \hat{\otimes} H\_{B})} \},\tag{C.257}$$

where π<sub>*A*</sub> and π<sub>*B*</sub> run through all representations of *A* and *B*, respectively. The ensuing completion *A* ⊗̂<sub>min</sub> *B* is called the *injective* tensor product of *A* and *B*. Without proof (which requires more advanced methods than the elementary arguments we use in this section), we mention that, as its name suggests, ‖·‖<sub>min</sub> is the smallest C\*-norm on *A*⊗*B*. This has a very important consequence:

Proposition C.95. *Any C\*-norm* ‖·‖ *on A*⊗*B satisfies* ‖*a*⊗*b*‖ = ‖*a*‖‖*b*‖*.*

In other words, any C\*-norm ‖·‖ on *A*⊗*B* is a cross-norm. To prove this from the minimality of the spatial norm, we need a lemma of wider interest.

Lemma C.96. *If* ‖·‖ *is any C\*-norm on A*⊗*B, then for all a* ∈ *A and b* ∈ *B,*

$$\|a \otimes b\| \le \|a\| \|b\|. \tag{C.258}$$

*Consequently, for any C\*-norm on A*⊗*B and any c* ∈ *A*⊗*B, we have the bound*

$$||c|| \le \inf \left\{ \sum\_{l} ||a\_{l}|| ||b\_{l}||, c = \sum\_{l} a\_{l} \otimes b\_{l} \right\}.\tag{C.259}$$

*Proof.* In any C\*-algebra *A*, if *a* ≥ 0, we have ‖*a*‖ ≤ 1 iff *a*<sup>2</sup> ≤ *a*. This is trivial for *A* = *C*(*X*), and in general can be proved within *C*<sup>∗</sup>(*a*) ⊂ *A*, since *C*<sup>∗</sup>(*a*) ≅ *C*(σ(*a*)). Now take *a* ∈ *A* and *b* ∈ *B* such that *a* ≥ 0, *b* ≥ 0, ‖*a*‖ ≤ 1, and ‖*b*‖ ≤ 1, so that (*a*⊗*b*)<sup>2</sup> = *a*<sup>2</sup>⊗*b*<sup>2</sup> ≤ *a*⊗*b*<sup>2</sup> ≤ *a*⊗*b*, and hence ‖*a*⊗*b*‖ ≤ 1. For general *a* ≥ 0, *b* ≥ 0, rescaling to *a*/‖*a*‖ etc. gives (C.258). For general *a*,*b* altogether, we compute:

$$\|a \otimes b\|^2 = \|(a \otimes b)^\* (a \otimes b)\| = \|a^\* a \otimes b^\* b\| \le \|a^\* a\| \|b^\* b\| = \|a\|^2 \|b\|^2. \tag{C.260}$$

Eq. (C.259) then follows from the triangle inequality for the norm. □

If *A* and *B* each have a unit, there is a simpler proof: as in (C.254), we have

$$\|a \otimes b\| = \|(a \otimes 1\_B)(1\_A \otimes b)\| \le \|a \otimes 1\_B\| \|1\_A \otimes b\| = \|a\| \|b\|,\tag{C.261}$$

where we used ‖*a*⊗1<sub>*B*</sub>‖ = ‖*a*‖ etc., which is the case because the map *a* ↦ *a*⊗1<sub>*B*</sub> from *A* to *A* ⊗̂ *B* is injective and hence is an (isometric) isomorphism onto its image.

We now prove Proposition C.95.

*Proof.* For any C\*-norm ‖·‖, we have ‖*a*⊗*b*‖ ≥ ‖*a*⊗*b*‖<sub>min</sub> = ‖*a*‖‖*b*‖, since the spatial norm is itself a cross-norm, cf. (C.256). Then (C.258) gives equality. □

In view of (C.259) and the existence of at least one C\*-norm on *A*⊗*B* (namely the spatial one), it makes sense to define the *maximal C\*-norm* on *A*⊗*B* by

$$\left\| \sum\_{i} a\_{i} \otimes b\_{i} \right\|\_{\max} = \sup \left\{ \left\| \sum\_{i} a\_{i} \otimes b\_{i} \right\|, \left\| \cdot \right\| \text{ is a } \mathbb{C}^{\*}\text{-norm on } A \otimes B \right\}.\tag{C.262}$$

This is clearly a C\*-norm, and hence it is also a cross-norm, i.e.,

$$\|a \otimes b\|\_{\max} = \|a\| \|b\|. \tag{C.263}$$

This property may be proved without using the deep result that the spatial norm is the minimal one (which in turn led to Proposition C.95); all we need is the inequality

$$||c||\_{\min} \le ||c||\_{\max},\tag{C.264}$$

for any *c* ∈ *A*⊗*B*, which follows from the definition of ‖·‖<sub>max</sub>, upon which (C.263) may be proved in the same way as for general C\*-norms. The completion *A* ⊗̂<sub>max</sub> *B* of *A*⊗*B* in the norm ‖·‖<sub>max</sub> is called the *projective* tensor product of *A* and *B*.

If we define representations of the pre-C\*-algebra *A*⊗*B* on Hilbert spaces in the same way as for C\*-algebras, i.e., as linear maps π : *A* ⊗ *B* → *B*(*H*) that preserve the product (C.250) and the involution (C.251), we obtain

$$||c||\_{\max} = \sup\{||\pi(c)||\},\tag{C.265}$$

where *c* = ∑<sub>*i*</sub> *a<sub>i</sub>* ⊗ *b<sub>i</sub>* ∈ *A*⊗*B*, and π runs through all representations of *A*⊗*B*. Indeed, according to Theorem C.87 there exists an injective representation π of *A* ⊗̂<sub>max</sub> *B*, so that ‖*c*‖<sub>max</sub> = ‖π(*c*)‖ for each *c* ∈ *A* ⊗̂<sub>max</sub> *B*, and hence also for each *c* ∈ *A* ⊗ *B*. Furthermore, any representation of *A* ⊗ *B* yields a cross-norm, so that (C.265) follows. This also shows that the supremum in (C.265) is actually attained.

In what follows, we restrict ourselves to the case that *A* and *B* have a unit, which suffices for our applications, but the claim is true in general (with a slightly more complicated proof, involving either approximate units or unitizations). If *A* and *B* each have a unit, so does *A* ⊗ *B*, viz. 1<sub>*A*</sub> ⊗ 1<sub>*B*</sub>. *States* ω on *A* ⊗ *B* are then defined as for unital C\*-algebras, i.e., as positive linear functionals (in the usual sense that ω(*c*<sup>∗</sup>*c*) ≥ 0 for any *c* ∈ *A*⊗*B*) that map the unit 1<sub>*A*</sub> ⊗ 1<sub>*B*</sub> of *A*⊗*B* to 1.

Proposition C.97. *Let A and B be unital. Then each state on A* ⊗ *B is continuous with respect to the* ‖·‖<sub>max</sub>*-norm, and hence extends to a state on the maximal tensor product A* ⊗̂<sub>max</sub> *B. Thus identifying states on A*⊗*B and on A* ⊗̂<sub>max</sub> *B, we have*

$$S(A \otimes B) = S(A \hat{\otimes}\_{\text{max}} B). \tag{C.266}$$

*Proof.* Let ω : *A* ⊗ *B* → C be a state. Although *A* ⊗ *B* may not be a C\*-algebra, the GNS-construction Theorem C.88 goes through as if it were. The reason is that the only delicate point, namely boundedness of π<sub>ω</sub>(*a*⊗*b*), may be proved from (C.94), just as in the usual case. Indeed, for *a* ∈ *A*, *b* ∈ *B*, and *c* ∈ *A*⊗*B*, we estimate

$$\begin{aligned} \|\pi\_{\omega}(a \otimes b)c\_{\omega}\|^2 &= \omega(c^\*(a \otimes b)^\*(a \otimes b)c) = \omega(c^\*(a^\*a \otimes b^\*b)c) \\ &\le \|a\|^2 \|b\|^2 \omega(c^\*c) = \|a\|^2 \|b\|^2 \|c\_{\omega}\|^2 \\ &= \|a \otimes b\|\_{\max}^2 \|c\_{\omega}\|^2, \end{aligned}$$

so that ‖π<sub>ω</sub>(*a* ⊗ *b*)‖ ≤ ‖*a* ⊗ *b*‖<sub>max</sub>, and hence π<sub>ω</sub>(*a* ⊗ *b*) may be extended to the completion *H*<sub>ω</sub> of (*A*⊗*B*)/*N*<sub>ω</sub> by continuity. Here we used the facts that:


Writing Ω<sub>ω</sub> = (1<sub>*A*</sub> ⊗ 1<sub>*B*</sub>)<sub>ω</sub> for the cyclic vector of *H*<sub>ω</sub>, as in (C.208), for any element *c* ∈ *A*⊗*B* we obtain, using (C.265) in the final inequality, the decisive bound

$$|\omega(c)| = |\langle \Omega\_{\omega}, \pi\_{\omega}(c)\Omega\_{\omega} \rangle| \le \|\pi\_{\omega}(c)\| \le \|c\|\_{\max}.\tag{C.267}$$

In other words, ω is continuous with respect to the ‖·‖<sub>max</sub>-norm, and since *A*⊗*B* is dense in *A* ⊗̂<sub>max</sub> *B*, the state extends to the completed tensor product by continuity. It follows from (C.267) and ω(1<sub>*A*</sub> ⊗ 1<sub>*B*</sub>) = 1 that ‖ω‖ = 1 as a functional on *A*⊗*B* equipped with the ‖·‖<sub>max</sub>-norm, so that the extension in question has the same norm, and hence by Proposition C.5 is a state on *A* ⊗̂<sub>max</sub> *B*. Conversely, a state on *A* ⊗̂<sub>max</sub> *B* restricts to a state on *A* ⊗ *B*, since the two <sup>∗</sup>-algebras have the same unit and (trivially) if *c* is positive in the latter, then so it is in the former. □

The above proposition concerns extensions of *arbitrary* states on *A*⊗*B*. However, *product states* on *A*⊗*B* can be extended to any completed tensor product *A* ⊗̂ *B*.

Proposition C.98. *If* ω<sub>*A*</sub> *and* ω<sub>*B*</sub> *are states on A and B, respectively, then the corresponding product state* ω<sub>*A*</sub> ⊗ ω<sub>*B*</sub> *on A* ⊗ *B, defined as in* (C.243) *by*

$$\omega\_{A} \otimes \omega\_{B} \left(\sum\_{i} a\_{i} \otimes b\_{i}\right) = \sum\_{i} \omega\_{A}(a\_{i})\, \omega\_{B}(b\_{i}),\tag{C.268}$$

*is continuous with respect to any cross-norm* ‖·‖*, and hence extends to A* ⊗̂ *B.*

*Proof.* Since the spatial norm is minimal among all cross-norms, it is enough to prove continuity with respect to ‖·‖<sub>min</sub>. As in the proof of Proposition C.97, we form the GNS-representation π<sub>ω<sub>*A*</sub>⊗ω<sub>*B*</sub></sub> induced by ω<sub>*A*</sub> ⊗ ω<sub>*B*</sub>, so that for any *c* ∈ *A* ⊗ *B*,

$$(\omega\_{A} \otimes \omega\_{B})(c) = \langle \Omega\_{\omega\_{A} \otimes \omega\_{B}}, \pi\_{\omega\_{A} \otimes \omega\_{B}}(c)\, \Omega\_{\omega\_{A} \otimes \omega\_{B}} \rangle. \tag{C.269}$$

Now consider the representation π<sub>ω<sub>*A*</sub></sub>(*A*) ⊗ π<sub>ω<sub>*B*</sub></sub>(*B*) on *H*<sub>ω<sub>*A*</sub></sub> ⊗ *H*<sub>ω<sub>*B*</sub></sub>, with cyclic vector Ω<sub>ω<sub>*A*</sub></sub> ⊗ Ω<sub>ω<sub>*B*</sub></sub>. Writing *c* = ∑<sub>*i*</sub> *a<sub>i</sub>* ⊗ *b<sub>i</sub>* as usual, a simple computation gives

$$\begin{split} \langle \Omega\_{\omega\_{A}} \otimes \Omega\_{\omega\_{B}}, (\pi\_{\omega\_{A}} \otimes \pi\_{\omega\_{B}})(c)\, \Omega\_{\omega\_{A}} \otimes \Omega\_{\omega\_{B}} \rangle &= \sum\_{i} \langle \Omega\_{\omega\_{A}}, \pi\_{\omega\_{A}}(a\_{i}) \Omega\_{\omega\_{A}} \rangle \langle \Omega\_{\omega\_{B}}, \pi\_{\omega\_{B}}(b\_{i}) \Omega\_{\omega\_{B}} \rangle \\ &= \sum\_{i} \omega\_{A}(a\_{i})\, \omega\_{B}(b\_{i}) \\ &= (\omega\_{A} \otimes \omega\_{B})(c). \end{split} \tag{C.270}$$

Using the same reasoning as in (the proof of) Proposition C.91 (which does not apply literally, since it is about C\*-algebras), it follows from (C.270) that π<sub>ω<sub>*A*</sub>⊗ω<sub>*B*</sub></sub>(*A* ⊗ *B*) is unitarily equivalent to π<sub>ω<sub>*A*</sub></sub>(*A*) ⊗ π<sub>ω<sub>*B*</sub></sub>(*B*), so that, using (C.270), analogously to (C.267) but this time using (C.257) at the end, we have

$$|(\omega\_{A} \otimes \omega\_{B})(c)| \le \|\pi\_{\omega\_{A}} \otimes \pi\_{\omega\_{B}}(c)\| \le \|c\|\_{\min}. \qquad \Box$$

As an application, analogously to (C.248), we show that:

Proposition C.99. *For any locally compact Hausdorff spaces X, Y and any cross-norm on C<sub>0</sub>(X) ⊗ C<sub>0</sub>(Y), with completed tensor product C<sub>0</sub>(X) ⊗̂ C<sub>0</sub>(Y), we have*

$$\mathcal{C}\_0(X) \hat{\otimes} \mathcal{C}\_0(Y) \cong \mathcal{C}\_0(X \times Y),\tag{C.271}$$

*under the isomorphism given by continuous extension of the map f* ⊗ *g* → *f g* : (*x*, *y*) → *f*(*x*)*g*(*y*) *from the algebraic tensor product C*0(*X*)⊗*C*0(*Y*) *to C*0(*X* ×*Y*)*.*

*Proof.* We just prove the unital case, where *X* and *Y* are compact.

Let *x* ∈ *X* and *y* ∈ *Y*, and take the corresponding evaluation maps ev<sub>*x*</sub> and ev<sub>*y*</sub> on *C*(*X*) and *C*(*Y*), respectively. These are multiplicative states, cf. Proposition C.19. Then ev<sub>*x*</sub> ⊗ ev<sub>*y*</sub> is a nonzero multiplicative state on *C*(*X*) ⊗ *C*(*Y*), and hence also on *C*(*X*) ⊗̂ *C*(*Y*), cf. Proposition C.98. This gives an injection of *X* × *Y* into Σ(*C*(*X*) ⊗̂ *C*(*Y*)), i.e., the Gelfand spectrum of *C*(*X*) ⊗̂ *C*(*Y*), cf. §C.2.

Conversely, the restriction ω<sub>1</sub> of any ω ∈ Σ(*C*(*X*) ⊗̂ *C*(*Y*)) to *C*(*X*), given by ω<sub>1</sub>(*f*) = ω(*f* ⊗ 1<sub>*Y*</sub>), is multiplicative, as is the restriction ω<sub>2</sub> of ω to *C*(*Y*), defined by ω<sub>2</sub>(*g*) = ω(1<sub>*X*</sub> ⊗ *g*). Then ω = ω<sub>1</sub> ⊗ ω<sub>2</sub>, with ensuing injective map Σ(*C*(*X*) ⊗̂ *C*(*Y*)) → *X* × *Y*. Thus the above injection is also a surjection, and hence a bijection, which is easily seen to be a homeomorphism. □

This can also be proved without Proposition C.98, using only the second step: if Σ(*C*(*X*) ⊗̂ *C*(*Y*)) ≠ *X* × *Y*, then, since Σ(*C*(*X*) ⊗̂ *C*(*Y*)) is closed in *X* × *Y*, there are nonempty opens *U* ⊂ *X* and *V* ⊂ *Y* such that (*U* × *V*) ∩ Σ(*C*(*X*) ⊗̂ *C*(*Y*)) = ∅. Now take nonzero functions *f* ∈ *C<sub>c</sub>*(*U*) and *g* ∈ *C<sub>c</sub>*(*V*), so that ω(*f* ⊗ *g*) = 0 for all ω ∈ Σ(*C*(*X*) ⊗̂ *C*(*Y*)). This contradicts the isometry (C.18) of the Gelfand transform.

Proposition C.100. *For any locally compact Hausdorff space X and any C\*-algebra B, let C<sub>0</sub>(X, B) be the C\*-algebra of all continuous functions f̃ : X → B for which the function x ↦ ‖f̃(x)‖<sub>B</sub> is in C<sub>0</sub>(X), equipped with the supremum norm*

$$\|\tilde{f}\| = \sup\{\|\tilde{f}(x)\|\_{B},\ x \in X\}.\tag{C.272}$$

*For any C\*-norm with ensuing tensor product* ⊗̂*, one then has*

$$C\_0(X) \hat{\otimes} B \cong C\_0(X, B),\tag{C.273}$$

*under continuous extension of the map from C*0(*X*)⊗*B to C*0(*X*,*B*) *defined by*

$$f \otimes b \mapsto (fb : x \mapsto f(x)b). \tag{C.274}$$

We just prove this for the minimal (i.e. spatial) C\*-norm; the general case follows from nuclearity of *C*0(*X*), cf. Proposition C.101 below.

*Proof.* Take some injective representation π<sub>*B*</sub> : *B* → *B*(*H<sub>B</sub>*), and represent *C*<sub>0</sub>(*X*, *B*) on ℓ<sup>2</sup>(*X*) ⊗ *H<sub>B</sub>* by linear extension of π : *C*<sub>0</sub>(*X*, *B*) → *B*(ℓ<sup>2</sup>(*X*) ⊗ *H<sub>B</sub>*), as defined by

$$
\pi(\tilde{f})\,\delta\_{x} \otimes \varphi = \delta\_{x} \otimes \pi\_{B}(\tilde{f}(x))\varphi, \tag{C.275}
$$

where *f̃* ∈ *C*<sub>0</sub>(*X*, *B*), *x* ∈ *X*, and ϕ ∈ *H<sub>B</sub>*; this operator is easily seen to be bounded. In particular, an element *fb* ∈ *C*<sub>0</sub>(*X*, *B*), as in (C.274), is represented by

$$
\pi(fb)(\delta\_{x} \otimes \varphi) = f(x)\,\delta\_{x} \otimes \pi\_{B}(b)\varphi.\tag{C.276}
$$

Denoting the representation of *C*<sub>0</sub>(*X*) on ℓ<sup>2</sup>(*X*) through multiplication operators by π<sub>*m*</sub>, i.e., π<sub>*m*</sub>(*f*)ψ(*x*) = *f*(*x*)ψ(*x*), where *f* ∈ *C*<sub>0</sub>(*X*) and ψ ∈ ℓ<sup>2</sup>(*X*), we then have

$$
\pi\_m \otimes \pi\_B(f \otimes b) = \pi(fb). \tag{C.277}
$$

In this way, *C*0(*X*)⊗*B* is faithfully represented as a subalgebra of

$$
\pi(C\_0(X,B)) \cong C\_0(X,B),\tag{C.278}
$$

and so the final step is merely to show that *C*<sub>0</sub>(*X*) ⊗̂ *B* is dense in *C*<sub>0</sub>(*X*, *B*). Indeed, taking *X* compact for simplicity (otherwise one needs a further approximation argument), for given *f̃* ∈ *C*(*X*, *B*) and ε > 0, define a cover 𝒰 = (*U<sub>x</sub>*)<sub>*x*∈*X*</sub> of *X* by

$$U\_{x} = \{ y \in X \mid \|\tilde{f}(x) - \tilde{f}(y)\| < \varepsilon \}. \tag{C.279}$$

Since *X* is compact, 𝒰 has a finite subcover {*U*<sub>*x*<sub>1</sub></sub>, ..., *U*<sub>*x<sub>n</sub>*</sub>}, with associated *partition of unity* {*g*<sub>*x*<sub>1</sub></sub>, ..., *g*<sub>*x<sub>n</sub>*</sub>}, i.e., one has *g*<sub>*x<sub>i</sub>*</sub> ∈ *C<sub>c</sub>*(*U*<sub>*x<sub>i</sub>*</sub>), with 0 ≤ *g*<sub>*x<sub>i</sub>*</sub> ≤ 1, and

$$\sum\_{i=1}^{n} g\_{x\_i}(x) = 1 \quad (x \in X). \tag{C.280}$$

Define an approximant *g* ∈ *C*(*X*)⊗*B* by

$$g = \sum\_{i} g\_{x\_{i}} \otimes \tilde{f}(x\_{i}),\tag{C.281}$$

whose image *g̃* ∈ *C*(*X*, *B*) is given by *g̃*(*x*) = ∑<sub>*i*</sub> *g*<sub>*x<sub>i</sub>*</sub>(*x*) *f̃*(*x<sub>i</sub>*). Then for each *x* ∈ *X*,

$$\|\tilde{g}(x) - \tilde{f}(x)\|\_{B} = \left\|\sum\_{i} g\_{x\_{i}}(x)(\tilde{f}(x\_{i}) - \tilde{f}(x))\right\|\_{B} < \sum\_{i} g\_{x\_{i}}(x) \cdot \varepsilon = \varepsilon,\tag{C.282}$$

so that, taking sup<sub>*x*</sub>, we have ‖*g̃* − *f̃*‖ < ε. This proves the claim. □

Since *C*<sub>0</sub>(*X* × *Y*) ≅ *C*<sub>0</sub>(*X*, *C*<sub>0</sub>(*Y*)) under the map *f* ↦ *f̃* with *f*(*x*, *y*) = (*f̃*(*x*))(*y*), the isomorphism (C.271) is a special case of (C.273).

Another case where the choice of a cross-norm does not matter—this time because no completion is even needed—is the following. Recall Corollary C.28.

Proposition C.101. *Let A be a* finite-dimensional *C\*-algebra. Then for any C\*-algebra B, A* ⊗ *B is complete in any C\*-norm, and hence all C\*-norms coincide.*

Thus *A* ⊗̂ *B* = *A* ⊗ *B*, though one still needs a norm on *A* ⊗ *B* to make it a C\*-algebra!

*Proof.* In view of Theorem C.163, we only need to prove this for *A* = *M<sub>n</sub>*(ℂ), *n* ∈ ℕ. As in the previous proof, we use the spatial tensor product on *M<sub>n</sub>*(ℂ) ⊗ *B*, so let us faithfully represent *M<sub>n</sub>*(ℂ) and *B* on ℂ<sup>*n*</sup> and *H<sub>B</sub>*, respectively, and form the Hilbert space ℂ<sup>*n*</sup> ⊗̂ *H<sub>B</sub>* = ℂ<sup>*n*</sup> ⊗ *H<sub>B</sub>*, carrying the representation id ⊗ π<sub>*B*</sub> of *M<sub>n</sub>*(ℂ) ⊗ *B*, and hence of the (alleged) completion *M<sub>n</sub>*(ℂ) ⊗̂<sub>min</sub> *B*. Let

$$c = \sum\_{i,j=1}^{n} e\_{ij} b^{ij} \in M\_n(\mathbb{C}) \otimes B,\tag{C.283}$$

where (*e<sub>ij</sub>*) is the standard basis of *M<sub>n</sub>*(ℂ) and *b<sup>ij</sup>* ∈ *B*. For any such *c*, we have


$$\|c\|\_{\min}^2 \ge \left\|\sum\_{i,j=1}^n e\_{ij} b^{ij} (\upsilon\_k \otimes \varphi)\right\|\_{\mathbb{C}^n \otimes H\_B}^2 = \sum\_i \|b^{ik}\varphi\|\_{H\_B}^2,\tag{C.284}$$

where (υ<sub>1</sub>, ..., υ<sub>*n*</sub>) is the standard basis of ℂ<sup>*n*</sup>, *k* = 1, ..., *n* is fixed, and ϕ ∈ *H<sub>B</sub>* is a unit vector. Taking the supremum over ϕ gives

$$\left\| \sum\_{i,j} e\_{ij} b^{ij} \right\|\_{\min} \ge \|b^{ij}\|\_B,\tag{C.285}$$

for each fixed pair (*i*, *j*). Hence any Cauchy sequence (*c<sub>k</sub>*) in *M<sub>n</sub>*(ℂ) ⊗ *B* takes the form *c<sub>k</sub>* = ∑<sub>*i*,*j*=1</sub><sup>*n*</sup> *e<sub>ij</sub>* *b<sub>k</sub><sup>ij</sup>*, where each (*b<sub>k</sub><sup>ij</sup>*) is a Cauchy sequence in *B* for fixed (*i*, *j*). Then, using the fact that ‖*e<sub>ij</sub>*‖<sub>*M<sub>n</sub>*(ℂ)</sub> = 1, we have

$$\|c - c\_{k}\|\_{\min} = \left\| \sum\_{i,j} e\_{ij} (b^{ij} - b\_{k}^{ij}) \right\|\_{\min} \le \sum\_{i,j=1}^{n} \|b^{ij} - b\_{k}^{ij}\|\_{B},\tag{C.286}$$

for any *c* ∈ *M<sub>n</sub>*(ℂ) ⊗ *B*, as in (C.283). Taking *c* such that *b<sup>ij</sup>* = lim<sub>*k*</sub> *b<sub>k</sub><sup>ij</sup>*, it follows that *c<sub>k</sub>* → *c* in ‖·‖<sub>min</sub>, i.e., in *M<sub>n</sub>*(ℂ) ⊗̂<sub>min</sub> *B*. In particular, the limit *c* of any Cauchy sequence in *M<sub>n</sub>*(ℂ) ⊗ *B* with respect to the norm ‖·‖<sub>min</sub> lies in *M<sub>n</sub>*(ℂ) ⊗ *B*, which is therefore complete already and is a C\*-algebra in the spatial norm. Since the norm in a C\*-algebra is unique (cf. Corollary C.28), it follows that any C\*-norm on *M<sub>n</sub>*(ℂ) ⊗ *B* must coincide with the spatial one ‖·‖<sub>min</sub>. □

It is also easy to show that

$$M\_n(\mathbb{C}) \otimes B \cong M\_n(B),\tag{C.287}$$

i.e., the *n*×*n*-matrices with entries in *B*, with obvious operations and norm given by faithfully representing *B* on some Hilbert space *H<sub>B</sub>*, as above, and then letting *M<sub>n</sub>*(*B*) act on *H<sub>B</sub><sup>n</sup>* = *H<sub>B</sub>* ⊕ ··· ⊕ *H<sub>B</sub>* (i.e., *n* copies) in the natural way. A specific isomorphism *M<sub>n</sub>*(ℂ) ⊗ *B* → *M<sub>n</sub>*(*B*) is then given by sending ∑<sub>*i*,*j*=1</sub><sup>*n*</sup> *e<sub>ij</sub>* *b<sup>ij</sup>* to the matrix (*b<sup>ij</sup>*).
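The isomorphism (C.287) can be checked by hand in the simplest case. The following is only a sketch under an extra assumption not made in the text, namely that *B* = *M<sub>m</sub>*(ℂ) is itself a matrix algebra, so that the tensor product can be realized by Kronecker products; the helper names and the sample coefficients are hypothetical.

```python
def kron(a, b):
    """Kronecker product of matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matrix_unit(i, j, n):
    """The matrix unit e_ij in M_n(C): 1 at position (i, j), 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]


def add(a, b):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def block(c, i, j, m):
    """The (i, j)-th m-by-m block of the matrix c."""
    return [row[j * m:(j + 1) * m] for row in c[i * m:(i + 1) * m]]


n, m = 2, 2
# Hypothetical coefficients b^{ij} in B = M_2(C), chosen only for illustration.
b = {(i, j): [[10 * i + j, 1], [0, i - j]] for i in range(n) for j in range(n)}

# c = sum_{i,j} e_ij (x) b^{ij}, realized as an (n*m)-by-(n*m) matrix.
c = [[0] * (n * m) for _ in range(n * m)]
for (i, j), b_ij in b.items():
    c = add(c, kron(matrix_unit(i, j, n), b_ij))

# The isomorphism sends c to the block matrix whose (i, j) block is b^{ij}.
blocks_match = all(block(c, i, j, m) == b[(i, j)] for i in range(n) for j in range(n))
print(blocks_match)
```

In this finite-dimensional model the map *c* ↦ (*b<sup>ij</sup>*) is just reading off the blocks of the Kronecker-product matrix.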

Finally, one of the highlights of the theory of tensor products on *A*⊗*B* is a concept that apparently makes the entire theory superfluous:

Definition C.102. *A C\*-algebra A is called* nuclear *if for any C\*-algebra B, the norms* ‖·‖<sub>min</sub> *and* ‖·‖<sub>max</sub> *(and consequently all C\*-norms) on A* ⊗ *B coincide.*

The class of nuclear C\*-algebras is large but not exhaustive: if *H* is infinite-dimensional, then *B*<sub>0</sub>(*H*) is nuclear but *B*(*H*) is not, even if *H* is separable. However:


#### C.14 Inductive limits and infinite tensor products of C\*-algebras

In the main text we deal with infinite quantum systems, albeit as idealizations rather than physical systems that exist in reality. Mathematically, such systems arise as infinite tensor products of C\*-algebras, which in turn are special cases of *inductive limits*, also called *direct limits* (categorically, these are *colimits*, see §E.1 below, and as such they are unique up to isomorphism, in this case of C\*-algebras).

Let *I* be a directed set (cf. Definition D.1), typically *I* = ℕ with the usual order. Let (*A<sub>i</sub>*) be a family of C\*-algebras indexed by *I*; in case that *I* = ℕ, these will often be

$$A\_n = B^n \equiv \hat{\otimes}\_{\max}^n B,\tag{C.288}$$

where *B* is some C\*-algebra and ⊗̂<sub>max</sub> is the projective tensor product, extended from two C\*-algebras (as discussed in the previous section) to any finite number of C\*-algebras in the obvious way: for any completed C\*-tensor product ⊗̂, *n* ∈ ℕ, and C\*-algebras (*C*<sub>1</sub>, ..., *C<sub>n</sub>*), we inductively define the tensor product of the latter as

$$C\_1 \hat{\otimes} \cdots \hat{\otimes} C\_n = (C\_1 \hat{\otimes} \cdots \hat{\otimes} C\_{n-1}) \hat{\otimes} C\_n. \tag{C.289}$$

In general, the cartesian product ∏<sub>*i*∈*I*</sub> *A<sub>i</sub>* consists of all functions *a* : *I* → ∪<sub>*i*</sub> *A<sub>i</sub>* such that *a*(*i*) ≡ *a<sub>i</sub>* ∈ *A<sub>i</sub>*; we often write such functions as (*a<sub>i</sub>*)<sub>*i*</sub>, where *a<sub>i</sub>* ∈ *A<sub>i</sub>*. The Axiom of Choice then guarantees (or, following Russell, even states) that, provided none of the *A<sub>i</sub>* is empty, the set ∏<sub>*i*∈*I*</sub> *A<sub>i</sub>* is non-empty. Since each *A<sub>i</sub>* is a ∗-algebra, we can turn ∏<sub>*i*∈*I*</sub> *A<sub>i</sub>* into a ∗-algebra in the obvious way, i.e., by defining scalar multiplication as (λ · *a*)(*i*) = λ*a*(*i*), with pointwise addition, multiplication, and involution. This ∗-algebra, denoted by ⊕<sub>*i*</sub> *A<sub>i</sub>*, is the *algebraic direct sum* of the *A<sub>i</sub>*.

What about the norm? There are various options here, each relying on the choice of some subspace of ⊕<sub>*i*</sub> *A<sub>i</sub>*. For example, if *A*<sub>0</sub> consists of all *a* ∈ ∏<sub>*i*∈*I*</sub> *A<sub>i</sub>* for which lim<sub>*i*</sub> ‖*a<sub>i</sub>*‖ = 0, then the *algebraic direct sum* ⊕̂<sub>*i*</sub> *A<sub>i</sub>* of the *A<sub>i</sub>* is *A*<sub>0</sub>, with norm

$$\|a\| = \sup\_{i} \|a\_{i}\|. \tag{C.290}$$

For the inductive limit we need additional structure, namely a family of homomorphisms ϕ<sub>*ij*</sub> : *A<sub>i</sub>* → *A<sub>j</sub>*, defined for each *i* ≤ *j* in *I*, such that for each *i* ≤ *j* ≤ *k*,

$$
\varphi\_{ii} = \mathrm{id}\_{A\_i}; \tag{C.291}
$$

$$
\varphi\_{jk} \circ \varphi\_{ij} = \varphi\_{ik}.\tag{C.292}
$$

Such maps turn the family (*A<sub>i</sub>*) into a so-called *directed system* of C\*-algebras. For example, in case of (C.288), and assuming *B* has a unit 1<sub>*B*</sub> (otherwise there are analogous constructions based on projections), for *n* < *m*, define ϕ<sub>*nm*</sub> : *B<sup>n</sup>* → *B<sup>m</sup>* by

$$
\varphi\_{nm}(b) = b \otimes 1\_B \otimes \cdots \otimes 1\_B, \tag{C.293}
$$

with *m* − *n* units 1*B*. This can be done also in the more general situation (C.289), where we assume each *Ci* to be unital with unit 1*i*, and define


$$A\_n = \hat{\otimes}\_{i=1}^n C\_i;\tag{C.294}$$

$$\varphi\_{nm}(c) = c \otimes 1\_{C\_{n+1}} \otimes \cdots \otimes 1\_{C\_{m}}.\tag{C.295}$$
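As a sanity check on the connecting maps (C.293) and the compatibility conditions (C.291) and (C.292), here is a minimal sketch. It assumes, purely for illustration, that *B* = *M*<sub>2</sub>(ℂ), so that tensoring with units becomes a Kronecker product with identity matrices; the function names and sample matrices are hypothetical.

```python
def kron(a, b):
    """Kronecker product of matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]


def matmul(a, b):
    """Ordinary matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def identity(d):
    """The d-by-d identity matrix."""
    return [[1 if r == c else 0 for c in range(d)] for r in range(d)]


def phi(b, extra, d=2):
    """Connecting map of (C.293): append `extra` unit factors 1_B, with B = M_d(C)."""
    for _ in range(extra):
        b = kron(b, identity(d))
    return b


a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]

# phi is multiplicative and unital, i.e. a unital homomorphism (here from B^1 to B^3).
multiplicative = phi(matmul(a, b), 2) == matmul(phi(a, 2), phi(b, 2))
unital = phi(identity(2), 2) == identity(8)

# (C.291): adding no units is the identity map.
trivial = phi(a, 0) == a

# (C.292): adding one unit and then two more equals adding three at once.
compatible = phi(phi(a, 1), 2) == phi(a, 3)

print(multiplicative, unital, trivial, compatible)
```

The same pattern applies to (C.295), with identity matrices of possibly different sizes for the different factors.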

As a matter of central importance to the theory of quantum spin systems, one may generalize this construction in allowing more general directed sets, whilst specializing it in picking very specific C\*-algebras *C<sub>i</sub>*. Let ℤ<sup>*d*</sup> ⊂ ℝ<sup>*d*</sup> be the standard lattice in spatial dimension *d*, and let *I* be the set of all finite subsets Λ of ℤ<sup>*d*</sup> (so one typically writes Λ instead of *i*). Furthermore, take some fixed Hilbert space *H*, assumed finite-dimensional for simplicity (this also suffices for most applications to quantum statistical mechanics), and for each Λ ∈ *I*, define the *cartesian* product

$$H^{\Lambda} = \prod\_{x \in \Lambda} H\_{x},\tag{C.296}$$

where *H<sub>x</sub>* = *H* for each *x*. Thus elements ψ : Λ → *H* of *H*<sup>Λ</sup> are families (ψ<sub>*x*</sub>)<sub>*x*∈Λ</sub>, where ψ<sub>*x*</sub> ∈ *H*. To define the *tensor* product

$$H\_{\Lambda} = \otimes\_{x \in \Lambda} H\_{x},\tag{C.297}$$

we generalize the procedure explained between (C.245) and (C.246) in the previous section. If dim(*HA*) < ∞ and dim(*HB*) < ∞, the injection

$$H\_A \otimes H\_B \hookrightarrow \overline{L}(H\_A \times H\_B, \mathbb{C}),\tag{\text{C.298}}$$

is an isomorphism, and we use this fact (with *H<sub>A</sub>* = *H<sub>B</sub>* = *H*) to *define* *H*<sub>Λ</sub> as *L̄*(*H*<sup>Λ</sup>, ℂ), that is, the set of all anti-multi-linear maps ψ̂ : *H*<sup>Λ</sup> → ℂ, equipped with pointwise operations turning it into a complex vector space. Each element ψ : Λ → *H* of *H*<sup>Λ</sup> itself defines such a map ψ̂ ∈ *L̄*(*H*<sup>Λ</sup>, ℂ) via

$$
\hat{\psi}(\varphi) = \prod\_{x \in \Lambda} \langle \varphi\_{x}, \psi\_{x} \rangle\_{H}, \tag{C.299}
$$

through which the inner product on *H*<sub>Λ</sub> is defined by linear extension of

$$
\langle \hat{\psi}, \hat{\varphi} \rangle\_{H\_{\Lambda}} = \prod\_{x \in \Lambda} \langle \psi\_{x}, \varphi\_{x} \rangle\_{H}. \tag{C.300}
$$

In this realization of *H*<sub>Λ</sub>, the elementary tensors ⊗<sub>*x*∈Λ</sub> ψ<sub>*x*</sub> ∈ *H*<sub>Λ</sub> coincide with the above elements ψ̂ ∈ *L̄*(*H*<sup>Λ</sup>, ℂ) ≡ *H*<sub>Λ</sub>. Furthermore, if (υ<sub>1</sub>, ..., υ<sub>*n*</sub>) is a basis of *H* ≅ ℂ<sup>*n*</sup>, then (⊗<sub>*x*∈Λ</sub> υ<sub>*s*(*x*)</sub>) is a basis of *H*<sub>Λ</sub>, where *s* : Λ → {1, ..., *n*}. Hence

$$\dim(H\_{\Lambda}) = \dim(H)^{|\Lambda|}. \tag{C.301}$$

Furthermore, writing <u>*n*</u> = {1, 2, ..., *n*}, and letting <u>*n*</u><sup>Λ</sup> be the set of maps ("classical spin configurations") *s* : Λ → <u>*n*</u>, there is a natural unitary isomorphism

$$H\_{\Lambda} \cong \ell^2(\underline{n}^{\Lambda}).\tag{C.302}$$

Indeed, as the functions δ<sub>*s*</sub> : *t* ↦ δ<sub>*st*</sub> form a basis of ℓ<sup>2</sup>(<u>*n*</u><sup>Λ</sup>), the map δ<sub>*s*</sub> ↦ ⊗<sub>*x*∈Λ</sub> υ<sub>*s*(*x*)</sub> extends to a unitary from ℓ<sup>2</sup>(<u>*n*</u><sup>Λ</sup>) to *H*<sub>Λ</sub>. Under this equivalence, elements of *H*<sub>Λ</sub> may be interpreted as "wave-functions" whose argument is a spin configuration.

Returning to C\*-algebras, having defined *H*<sub>Λ</sub>, we now put
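The identification (C.302) and the dimension count (C.301) can be made concrete for a small lattice region. The sketch below is illustrative only: the region Λ, the value *n* = 2, and the sample vectors are hypothetical, and configurations take values in {0, ..., *n* − 1} rather than {1, ..., *n*} to match Python indexing. It also checks that the ℓ<sup>2</sup> inner product of two elementary tensors factorizes as in (C.300).

```python
from itertools import product
from math import prod

sites = [(0, 0), (0, 1), (1, 0)]  # hypothetical finite subset Lambda of Z^2
n = 2                             # dimension of the one-site Hilbert space H

# A spin configuration s: Lambda -> {0, ..., n-1}, stored as a tuple of values.
configs = list(product(range(n), repeat=len(sites)))
dim = len(configs)  # equals n ** |Lambda|, cf. (C.301)


def elementary_tensor(vectors):
    """Wave-function of the elementary tensor: amplitude at s is prod_x psi_x[s(x)]."""
    return {s: prod(v[s_x] for v, s_x in zip(vectors, s)) for s in configs}


def inner_l2(u, v):
    """Inner product on l^2 of the configuration space, antilinear in the first slot."""
    return sum(u[s].conjugate() * v[s] for s in configs)


def inner_h(u, v):
    """Inner product on the one-site space H = C^n."""
    return sum(x.conjugate() * y for x, y in zip(u, v))


# Hypothetical one-site vectors psi_x and phi_x, one per site of Lambda.
psi = [[1, 1j], [2, -1], [0, 1 + 1j]]
phi = [[1, 1], [1, 1], [3, 1 - 1j]]

lhs = inner_l2(elementary_tensor(psi), elementary_tensor(phi))
rhs = prod(inner_h(p, q) for p, q in zip(psi, phi))
print(dim, abs(lhs - rhs) < 1e-9)
```

Here a "wave-function" is literally a dictionary from spin configurations to amplitudes, which is the interpretation suggested by (C.302).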

$$A\_{\Lambda} = B(H\_{\Lambda}).\tag{C.303}$$

To fit this into the above framework, we note that the partial order ≤ on *I* is given by Λ ≤ Λ′ whenever Λ ⊆ Λ′, in which case there is a canonical embedding

$$\iota\_{\Lambda\Lambda'}: A\_{\Lambda} \hookrightarrow A\_{\Lambda'}.\tag{C.304}$$

This embedding is given as in (C.293), i.e., by adding unit operators. Let Λ ⊂ Λ′ and define Λ″ = Λ′\Λ. We may split ψ′ : Λ′ → *H* as ψ′ ↦ (ψ′<sub>|Λ</sub>, ψ′<sub>|Λ″</sub>), from which

$$H^{\Lambda'} \cong H^{\Lambda} \times H^{\Lambda''}.\tag{C.305}$$

As in (C.298), this gives isomorphisms

$$H\_{\Lambda'} = \overline{\mathcal{L}}(H^{\Lambda'}, \mathbb{C}) \cong \overline{\mathcal{L}}(H^{\Lambda} \times H^{\Lambda''}, \mathbb{C}) \cong \overline{\mathcal{L}}(H^{\Lambda} \otimes H^{\Lambda''}, \mathbb{C}) \cong H\_{\Lambda} \otimes H\_{\Lambda''}. \tag{C.306}$$

This, in turn, induces an isomorphism

$$A\_{\Lambda'} = B(H\_{\Lambda'}) \cong B(H\_{\Lambda} \otimes H\_{\Lambda''}) \cong B(H\_{\Lambda}) \otimes B(H\_{\Lambda''}) = A\_{\Lambda} \otimes A\_{\Lambda''},\tag{C.307}$$

which, through the embedding

$$B(H\_{\Lambda}) \hookrightarrow B(H\_{\Lambda}) \otimes B(H\_{\Lambda''});\tag{C.308}$$

$$a \mapsto a \otimes \mathbb{I}\_{B(H\_{\Lambda''})},\tag{C.309}$$

gives an embedding *B*(*H*<sub>Λ</sub>) ↪ *B*(*H*<sub>Λ′</sub>). This, then, is the injection (C.304).

Alternatively, *B*(*H*<sub>Λ</sub>) may be constructed just like *H*<sub>Λ</sub> itself, i.e., by starting with the set *B*(*H*)<sup>Λ</sup> of functions *a* : Λ → *B*(*H*). Any such *a* defines an operator *â* on *H*<sub>Λ</sub> by first defining its action on elementary tensors by *â*ψ̂ = ⊗<sub>*x*∈Λ</sub> *a<sub>x</sub>*ψ<sub>*x*</sub>, and extending the result linearly to arbitrary vectors in *H*<sub>Λ</sub>. We write *â* = ⊗<sub>*x*∈Λ</sub> *a<sub>x</sub>*, and reconstruct *B*(*H*<sub>Λ</sub>) as the complex vector space spanned by all such elementary operators. The injection (C.304) is given by linear extension of the map *â* ↦ *â*′, where *â*′<sub>*x*′</sub> = *a<sub>x</sub>* whenever *x*′ = *x* ∈ Λ ⊂ Λ′, and *â*′<sub>*x*′</sub> = 1<sub>*H*</sub> otherwise, i.e., if *x*′ ∈ Λ″.

Either way, we obtain a directed system of C\*-algebras (*A*<sub>Λ</sub>), where the finite subsets Λ ⊂ ℤ<sup>*d*</sup> are partially ordered by inclusion, and the maps ϕ<sub>ΛΛ′</sub> : *A*<sub>Λ</sub> → *A*<sub>Λ′</sub>, with properties like (C.291) - (C.292), are given by the inclusions (C.304).

There is a classical counterpart to this construction, in which the local C\*-algebras are given by "functions of functions", i.e.,

$$A\_{\Lambda}^{(c)} = C(\underline{n}^{\Lambda}) = C(C(\Lambda, \underline{n})).\tag{C.310}$$

Since <u>*n*</u><sup>Λ</sup> is a finite discrete set, any function on it is continuous (and lies in ℓ<sup>2</sup>, etc.). If Λ ⊆ Λ′, then, *s*′ ∈ <u>*n*</u><sup>Λ′</sup> being a map *s*′ : Λ′ → <u>*n*</u>, the connecting homomorphisms

$$\iota^{(c)}\_{\Lambda\Lambda'} : A^{(c)}\_{\Lambda} \hookrightarrow A^{(c)}\_{\Lambda'}, \tag{C.311}$$

are given quite canonically by

$$\iota^{(c)}\_{\Lambda\Lambda'}(f) : s' \mapsto f(s'\_{|\Lambda}). \tag{C.312}$$

Note that *C*(<u>*n*</u><sup>Λ</sup>) = ℓ<sup>2</sup>(<u>*n*</u><sup>Λ</sup>) as vector spaces, so that (C.311) also gives natural maps ℓ<sup>2</sup>(<u>*n*</u><sup>Λ</sup>) → ℓ<sup>2</sup>(<u>*n*</u><sup>Λ′</sup>), and hence, via (C.302), *H*<sub>Λ</sub> → *H*<sub>Λ′</sub>. These are given by linear extension of the map given on basis vectors by ⊗<sub>*x*∈Λ</sub> υ<sub>*s*(*x*)</sub> ↦ ∑<sub>*s*′ : *s*′|<sub>Λ</sub> = *s*</sub> ⊗<sub>*x*′∈Λ′</sub> υ<sub>*s*′(*x*′)</sub>.

Furthermore, analogously to (C.307), since Λ′ = Λ ∪ Λ″ is finite, we have

$$\begin{aligned} A\_{\Lambda'}^{(c)} = C(\underline{n}^{\Lambda'}) = C(C(\Lambda', \underline{n})) &= C(C(\Lambda \cup \Lambda'', \underline{n})) \cong C(C(\Lambda, \underline{n}) \times C(\Lambda'', \underline{n})) \\ &\cong C(C(\Lambda, \underline{n})) \otimes C(C(\Lambda'', \underline{n})) = C(\underline{n}^{\Lambda}) \otimes C(\underline{n}^{\Lambda''}) = A\_{\Lambda}^{(c)} \otimes A\_{\Lambda''}^{(c)}. \end{aligned}\tag{C.313}$$
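The classical connecting map (C.312) is simple enough to spell out in a few lines. The following sketch uses hypothetical regions Λ ⊂ Λ′ ⊂ ℤ and hypothetical functions *f*, *g*; it checks that the map is multiplicative and preserves the sup-norm, as one expects of an injective homomorphism of C\*-algebras.

```python
from itertools import product


def configurations(sites, n):
    """All maps s: sites -> {1, ..., n}, each stored as a dict."""
    return [dict(zip(sites, values))
            for values in product(range(1, n + 1), repeat=len(sites))]


def iota(f, small):
    """Connecting map (C.312): (iota f)(s') = f(s' restricted to `small`)."""
    return lambda s_big: f({x: s_big[x] for x in small})


small = [0, 1]     # hypothetical region Lambda
big = [0, 1, 2]    # hypothetical region Lambda' containing Lambda
n = 2

# Hypothetical elements of C(n^Lambda), i.e. functions of a configuration on Lambda.
f = lambda s: s[0] + 10 * s[1]
g = lambda s: s[0] * s[1] - 3
fg = lambda s: f(s) * g(s)

big_configs = configurations(big, n)

# iota(fg) = iota(f) iota(g) pointwise on configurations of the larger region.
multiplicative = all(
    iota(fg, small)(s) == iota(f, small)(s) * iota(g, small)(s) for s in big_configs
)

# The sup-norm is preserved, since every configuration on Lambda extends to Lambda'.
sup_small = max(abs(f(s)) for s in configurations(small, n))
sup_big = max(abs(iota(f, small)(s)) for s in big_configs)
isometric = sup_small == sup_big

print(multiplicative, isometric)
```

The embedded function simply ignores the spins outside Λ, which is the classical analogue of tensoring with unit operators in (C.309).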

Given a directed system of C\*-algebras (*A<sub>i</sub>*, ϕ<sub>*ij*</sub>), we define the *local part* *A*<sub>loc</sub> of ∏<sub>*i*</sub> *A<sub>i</sub>* as the set of all elements *a* = (*a<sub>i</sub>*) of ∏<sub>*i*</sub> *A<sub>i</sub>* for which there is *i*<sub>0</sub> ∈ *I* (depending on *a*) such that *a<sub>i</sub>* = ϕ<sub>*i*<sub>0</sub>*i*</sub>(*a*<sub>*i*<sub>0</sub></sub>) whenever *i*<sub>0</sub> ≤ *i*. This is equivalent to the seemingly stronger condition that *a<sub>j</sub>* = ϕ<sub>*ij*</sub>(*a<sub>i</sub>*) whenever *i*<sub>0</sub> ≤ *i* ≤ *j*, since

$$a\_j = \varphi\_{i\_0 j}(a\_{i\_0}) = \varphi\_{ij} \circ \varphi\_{i\_0 i}(a\_{i\_0}) = \varphi\_{ij}(a\_i). \tag{C.314}$$

In the example (C.288) with (C.293), this simply means that for each sequence (*a<sub>n</sub>*)<sub>*n*∈ℕ</sub> in *A*<sub>loc</sub>, there is *n*<sub>0</sub> ∈ ℕ such that *a<sub>n</sub>* = *a*<sub>*n*<sub>0</sub></sub> ⊗ 1<sub>*B*</sub> ⊗ ··· ⊗ 1<sub>*B*</sub> (with *n* − *n*<sub>0</sub> units 1<sub>*B*</sub>) for each *n* > *n*<sub>0</sub>. Similarly, in the example (C.303) with (C.304), for each *a* = (*a*<sub>Λ</sub>) in *A*<sub>loc</sub>, where Λ is a finite subset of ℤ<sup>*d*</sup> and *a*<sub>Λ</sub> ∈ *A*<sub>Λ</sub> for each Λ, there is a finite subset Λ<sub>0</sub> ⊂ ℤ<sup>*d*</sup> such that for any Λ ⊇ Λ<sub>0</sub> we have *a*<sub>Λ</sub> = ι<sub>Λ<sub>0</sub>Λ</sub>(*a*<sub>Λ<sub>0</sub></sub>). It is easy to see that *A*<sub>loc</sub> is a ∗-algebra under the (pointwise) operations inherited from ∏<sub>*i*</sub> *A<sub>i</sub>*. For each (*a<sub>i</sub>*) ∈ *A*<sub>loc</sub>, the norms ‖*a<sub>i</sub>*‖ form a net in ℝ<sup>+</sup>. Recall that some net (*t<sub>i</sub>*)<sub>*i*∈*I*</sub> in ℝ (which by definition is indexed by a directed set *I*) is said to *converge* to *t* ∈ ℝ if for each ε > 0, there is *i* ∈ *I* such that |*t* − *t<sub>j</sub>*| < ε for all *j* ≥ *i* (since ℝ is Hausdorff, any net in ℝ converges to at most one point). Because the connecting maps ϕ<sub>*ij*</sub> are homomorphisms of C\*-algebras, they are norm-decreasing (cf. Theorem C.62.1), i.e., ‖ϕ<sub>*ij*</sub>(*a<sub>i</sub>*)‖ ≤ ‖*a<sub>i</sub>*‖. Thus for any *a* ∈ *A*<sub>loc</sub> with associated *i*<sub>0</sub> ∈ *I*, the (sub)net (‖*a<sub>i</sub>*‖)<sub>*i*≥*i*<sub>0</sub></sub> lies in the interval [0, ‖*a*<sub>*i*<sub>0</sub></sub>‖], and is monotone decreasing in the sense that if *j* ≥ *i* ≥ *i*<sub>0</sub>, then ‖*a<sub>j</sub>*‖ ≤ ‖*a<sub>i</sub>*‖. As for sequences (which are just nets indexed by *I* = ℕ), *bounded* monotone decreasing (or increasing) nets in ℝ converge, so that the net (‖*a<sub>i</sub>*‖)<sub>*i*≥*i*<sub>0</sub></sub> has a limit, and this also means that (‖*a<sub>i</sub>*‖)<sub>*i*</sub> has the same limit. Call this limit ‖*a*‖<sub>0</sub>.
The map *a* ↦ ‖*a*‖<sub>0</sub> generally fails to define a norm on *A*<sub>loc</sub>, since it may lack the property of positive definiteness, and even if it had it, the space would not be complete (at least if *I* is infinite, as we tacitly assume). We do have the C\*-axioms ‖*ab*‖<sub>0</sub> ≤ ‖*a*‖<sub>0</sub>‖*b*‖<sub>0</sub> and ‖*a*<sup>∗</sup>*a*‖<sub>0</sub> = ‖*a*‖<sub>0</sub><sup>2</sup> though, since these hold for each norm *a<sub>i</sub>* ↦ ‖*a<sub>i</sub>*‖ and are preserved in the limit.

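So we say that ‖·‖<sub>0</sub> is a *C\*-seminorm* on *A*<sub>loc</sub>, and there is a canonical procedure to turn a ∗-algebra with C\*-seminorm into a C\*-algebra: one takes the quotient *A*<sub>loc</sub>/*N* by the null ideal *N* = {*a* ∈ *A*<sub>loc</sub> | ‖*a*‖<sub>0</sub> = 0}, on which ‖·‖<sub>0</sub> descends to a C\*-norm, and completes this quotient in that norm. The result is the C\*-algebra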


$$A = \varinjlim\_{i} A\_{i},\tag{C.315}$$

called the *inductive limit* of the directed system (*Ai*,ϕ*i j*).

For each *i* ∈ *I*, we now define a canonical homomorphism ϕ<sub>*i*</sub> : *A<sub>i</sub>* → *A*. If *a<sub>i</sub>* ∈ *A<sub>i</sub>*, put *a<sub>j</sub>* = ϕ<sub>*ij*</sub>(*a<sub>i</sub>*) ∈ *A<sub>j</sub>* if *j* ≥ *i*, and *a<sub>j</sub>* = 0 otherwise. This gives an element *a* ∈ *A*<sub>loc</sub> whose image in *A*<sub>loc</sub>/*N* ⊂ *A* is ϕ<sub>*i*</sub>(*a<sub>i</sub>*). A computation shows that if *i* ≤ *j*, then ϕ<sub>*j*</sub> ∘ ϕ<sub>*ij*</sub> = ϕ<sub>*i*</sub>. Using this fact, it follows that if we put *Ã<sub>i</sub>* = ϕ<sub>*i*</sub>(*A<sub>i</sub>*) ⊂ *A*, then *Ã<sub>i</sub>* ⊆ *Ã<sub>j</sub>* whenever *i* ≤ *j*, and hence *A* may be rewritten as the norm-closure of the union of the *Ã<sub>i</sub>*, i.e.,

$$A = \overline{\bigcup\_{i} \tilde{A}\_{i}}^{\|\cdot\|}. \tag{C.316}$$

In the simple situation where the maps ϕ<sub>*ij*</sub> are inclusions and hence isometries, as in our examples, we have *N* = {0}, so that *Ã<sub>i</sub>* = *A<sub>i</sub>*, and hence (C.316) simplifies to

$$A = \overline{\bigcup\_{i} A\_{i}}^{\|\cdot\|}. \tag{C.317}$$

As a case in point, define (*An*,ϕ*nm*) as in (C.294) - (C.295). The infinite tensor product of the *Ci* is then defined through (C.315) and (C.295), i.e., by definition,

$$\hat{\otimes}\_{i=1}^{\infty} C\_i = \varinjlim\_{n} \hat{\otimes}\_{i=1}^{n} C\_i = \overline{\bigcup\_{n} \hat{\otimes}\_{i=1}^{n} C\_i}^{\|\cdot\|}.\tag{C.318}$$

Here the first equation is general, and in the second it is understood that for any *m* > *n*, we have ⊗̂<sub>*i*=1</sub><sup>*n*</sup> *C<sub>i</sub>* ⊂ ⊗̂<sub>*i*=1</sub><sup>*m*</sup> *C<sub>i</sub>* through the embeddings (C.295).

More generally, let (*A<sub>x</sub>*)<sub>*x*∈*X*</sub> be a family of unital C\*-algebras indexed by an arbitrary set *X*, and let *I* = 𝒫<sub>*f*</sub>(*X*) be the set of finite subsets of *X*, partially ordered by inclusion. For any *F* ∈ *I*, we have a tensor product

$$A\_F = \hat{\otimes}\_{\mathbf{x} \in F} A\_{\mathbf{x}},\tag{\text{C.319}}$$

where once again ⊗̂ is an arbitrary completed C\*-tensor product. An explicit construction of this tensor product along the lines of (C.289) requires an ordering of *F*, but two such orderings give canonically isomorphic C\*-algebras; if *F* ⊂ *G*, one should order *G* compatibly with *F* for the connecting homomorphisms ϕ<sub>*FG*</sub> to be well defined by (C.295). This gives a directed system of C\*-algebras (*A<sub>F</sub>*, ϕ<sub>*FG*</sub>), whose inductive limit defines the tensor product over *X*, i.e.,

$$
\hat{\otimes}\_{x\in X} A\_{x} = \varinjlim\_{F} \hat{\otimes}\_{x\in F} A\_{x}.\tag{C.320}
$$

As a special case, we may rewrite our earlier algebras *A*<sub>Λ</sub> and *A*<sub>Λ</sub><sup>(*c*)</sup> as

$$A\_{\Lambda} = \otimes\_{x \in \Lambda} B(H); \tag{C.321}$$

$$A\_{\Lambda}^{(c)} = \otimes\_{x \in \Lambda} C(\underline{n}) \cong C\left(\prod\_{x \in \Lambda} \underline{n}\right),\tag{C.322}$$

cf. (C.313). Hence we have

$$
\varinjlim\_{\Lambda} A\_{\Lambda} = \bigotimes\_{x \in \mathbb{Z}^d} B(H);
\tag{C.323}
$$

$$\varinjlim\_{\Lambda} A\_{\Lambda}^{(c)} = \bigotimes\_{x \in \mathbb{Z}^d} C(\underline{n}) \cong C\left(\prod\_{x \in \mathbb{Z}^d} \underline{n}\right),\tag{C.324}$$

where in the last expression the infinite product ∏<sub>*x*∈ℤ<sup>*d*</sup></sub> <u>*n*</u> is endowed with the product topology, so that (by Tychonoff's Theorem) the space in question is compact. Thus the ensuing inductive limit may directly be expressed as the standard commutative C\*-algebra *C*(*X*), where *X* = ∏<sub>*x*∈ℤ<sup>*d*</sup></sub> <u>*n*</u> is compact, equipped with pointwise operations and the sup-norm. If *n* = 2 and *d* = 1, the space *X* is a model of the Cantor set.

The homomorphisms ϕ<sub>*i*</sub> enable us to state the universal character of *A*:

Theorem C.103. *Let* (*A*<sub>*i*</sub>, ϕ<sub>*ij*</sub>) *be a directed system of C\*-algebras with inductive limit A. For any C\*-algebra B endowed with a family of homomorphisms* β<sub>*i*</sub> : *A*<sub>*i*</sub> → *B* *such that* β<sub>*j*</sub> ∘ ϕ<sub>*ij*</sub> = β<sub>*i*</sub>, *there is a unique homomorphism* β : *A* → *B* *such that* β<sub>*j*</sub> = β ∘ ϕ<sub>*j*</sub>. *In other words, the following diagram commutes:*

$$\begin{array}{ccccc} A\_i & \xrightarrow{\;\varphi\_{ij}\;} & A\_j & \xrightarrow{\;\varphi\_{j}\;} & A \\ & {\scriptstyle \beta\_i}\searrow & \big\downarrow{\scriptstyle \beta\_j} & \swarrow{\scriptstyle \beta} & \\ & & B & & \end{array} \tag{C.325}$$

*Proof.* This is true almost by construction, or rather by (C.316): since β is supposed to be a homomorphism of C\*-algebras, it is continuous, so it is determined by its values on the dense subalgebra ∪<sub>*i*</sub> *Ã*<sub>*i*</sub>, and hence by its values on each *Ã*<sub>*i*</sub>. But these values are necessarily given by β(ϕ<sub>*i*</sub>(*a*<sub>*i*</sub>)) = β<sub>*i*</sub>(*a*<sub>*i*</sub>), where *a*<sub>*i*</sub> ∈ *A*<sub>*i*</sub>. □

Corollary C.104. *Let* (*A*<sub>*x*</sub>)<sub>*x*∈*X*</sub> *be a family of mutually commuting unital C\*-subalgebras of a unital C\*-algebra B (sharing the unit of B), such that the C\*-algebra generated by all subalgebras* *A*<sub>*x*</sub> *within B is equal to B. Also, let* ⊗̂ *be some completed C\*-tensor product such that for each finite subset* *F* = {*x*<sub>1</sub>,..., *x*<sub>*n*</sub>} ⊂ *X*, *there is an injective homomorphism* ϕ<sub>*F*</sub> : *A*<sub>*F*</sub> → *B* *(where* *A*<sub>*F*</sub> = *A*<sub>*x*<sub>1</sub></sub> ⊗̂ ··· ⊗̂ *A*<sub>*x*<sub>*n*</sub></sub>*) satisfying*

$$\varphi\_F(a\_1 \otimes \cdots \otimes a\_n) = a\_1 \cdots a\_n \quad (a\_1 \in A\_{x\_1}, \dots, a\_n \in A\_{x\_n}).\tag{C.326}$$

*Then* *B* ≅ ⊗̂<sub>*x*∈*X*</sub> *A*<sub>*x*</sub>.

*Proof.* In Theorem C.103, take *A*<sub>*j*</sub> ⇝ *A*<sub>*F*</sub> and β<sub>*j*</sub> ⇝ ϕ<sub>*F*</sub>. In view of (C.320), this gives a homomorphism β : ⊗̂<sub>*x*∈*X*</sub> *A*<sub>*x*</sub> → *B*. Here, this map is an isomorphism: it is isometric (and hence injective) because each injective homomorphism ϕ<sub>*F*</sub> is isometric, and it is surjective because its range is closed and contains the algebra generated by the *A*<sub>*x*</sub>, which is dense in *B*. □

Finally, we give a result on infinite tensor products of states, needed in §8.4.

Proposition C.105. *Let* (*C*<sub>*i*</sub>)<sub>*i*∈N</sub> *be unital C\*-algebras, and define their infinite (projective) tensor product* ⊗̂<sup>∞</sup><sub>*i*=1</sub> *C*<sub>*i*</sub> *as in* (C.318). *For each* *i* ∈ N, *let* ω<sub>*i*</sub> *be a state on* *C*<sub>*i*</sub>. *Then there is a unique state* ⊗̂<sup>∞</sup><sub>*i*=1</sub> ω<sub>*i*</sub> *on* ⊗̂<sup>∞</sup><sub>*i*=1</sub> *C*<sub>*i*</sub> *such that for each* *n* ∈ N *and* *c*<sub>*i*</sub> ∈ *C*<sub>*i*</sub>,

$$\left(\hat{\otimes}\_{i=1}^{\infty}\omega\_{i}\right)(\varphi\_{n}(c\_{1}\otimes\cdots\otimes c\_{n})) = \prod\_{i=1}^{n}\omega\_{i}(c\_{i}).\tag{C.327}$$

*Moreover,* ⊗̂<sup>∞</sup><sub>*i*=1</sub> ω<sub>*i*</sub> *is pure iff each* ω<sub>*i*</sub> *is pure.*

*Proof.* We write *C*<sup>*n*</sup> ≡ ⊗̂<sup>*n*</sup><sub>*i*=1</sub> *C*<sub>*i*</sub>, and similarly ω<sup>*n*</sup> ≡ ⊗̂<sup>*n*</sup><sub>*i*=1</sub> ω<sub>*i*</sub>, also for *n* = ∞.

Eq. (C.327) defines ω<sup>∞</sup> on a dense subset ∪<sub>*n*∈N</sub> ϕ<sub>*n*</sub>(*C*<sup>*n*</sup>) of *C*<sup>∞</sup>, which proves uniqueness. Existence comes from Proposition C.98, according to which the map *c*<sub>1</sub> ⊗ ··· ⊗ *c*<sub>*n*</sub> ↦ ∏<sup>*n*</sup><sub>*i*=1</sub> ω<sub>*i*</sub>(*c*<sub>*i*</sub>) extends to a state ⊗<sup>*n*</sup><sub>*i*=1</sub> ω<sub>*i*</sub> on *C*<sup>*n*</sup>, which in turn defines a state ω<sup>*n*</sup> on ϕ<sub>*n*</sub>(*C*<sup>*n*</sup>) ⊂ *C*<sup>∞</sup>. Since (⊗<sup>*n*</sup><sub>*i*=1</sub> ω<sub>*i*</sub>)|<sub>*C*<sup>*m*</sup></sub> = ⊗<sup>*m*</sup><sub>*i*=1</sub> ω<sub>*i*</sub> whenever *m* ≤ *n*, one also has ω<sup>*n*</sup>|<sub>ϕ<sub>*m*</sub>(*C*<sup>*m*</sup>)</sub> = ω<sup>*m*</sup>, so that we may define a functional ω<sup>∞</sup> on ∪<sub>*n*</sub> ϕ<sub>*n*</sub>(*C*<sup>*n*</sup>) by its restrictions ω<sup>∞</sup>|<sub>ϕ<sub>*n*</sub>(*C*<sup>*n*</sup>)</sub> = ω<sup>*n*</sup>. Since ω<sup>*n*</sup> is a state and hence satisfies ‖ω<sup>*n*</sup>‖ = ω<sup>*n*</sup>(1<sub>ϕ<sub>*n*</sub>(*C*<sup>*n*</sup>)</sub>) = 1, so does ω<sup>∞</sup> (on its dense domain). Since the continuous extension of ω<sup>∞</sup> to *C*<sup>∞</sup> has the same norm, this extension (still called ω<sup>∞</sup>) is a state by Proposition C.5.

One direction of the second claim is trivial: if at least one of the ω<sub>*i*</sub> fails to be pure, then ω<sup>*n*</sup> inherits its convex decomposition so to speak, so contrapositively we obtain that purity of ω<sup>*n*</sup> implies purity of each ω<sub>*i*</sub>. We first prove the opposite direction for *n* < ∞. Using Proposition C.91 and the fact that *C*<sup>*n*</sup> is a completion of the algebraic tensor product ⊗<sup>*n*</sup><sub>*i*=1</sub> *C*<sub>*i*</sub>, the GNS-representation π<sub>ω<sup>*n*</sup></sub>(*C*<sup>*n*</sup>) is unitarily equivalent to the representation π<sub>ω<sub>1</sub></sub> ⊗ ··· ⊗ π<sub>ω<sub>*n*</sub></sub> on *H*<sub>ω<sub>1</sub></sub> ⊗ ··· ⊗ *H*<sub>ω<sub>*n*</sub></sub>, and

$$(\pi\_{\omega\_1} \otimes \cdots \otimes \pi\_{\omega\_n}(C^n))'' = \pi\_{\omega\_1}(C\_1)'' \overline{\otimes} \cdots \overline{\otimes} \pi\_{\omega\_n}(C\_n)''.\tag{C.328}$$

Here, for any two von Neumann algebras *A* and *B*, *A* ⊗̄ *B* is the smallest von Neumann algebra containing the algebraic tensor product *A* ⊗ *B*. The main lemma behind the second claim is the nontrivial *commutation theorem* for von Neumann algebras:

$$(A \overline{\otimes} B)' = A' \overline{\otimes} B',\tag{C.329}$$

which we state without proof. This iterates to *n* von Neumann algebras. Hence

$$(\pi\_{\omega\_1} \otimes \cdots \otimes \pi\_{\omega\_n}(C^n))' = \pi\_{\omega\_1}(C\_1)' \overline{\otimes} \cdots \overline{\otimes} \pi\_{\omega\_n}(C\_n)',\tag{C.330}$$

so that the claim for *n* < ∞ follows from Theorem C.90.

Now take *n* = ∞, and assume each ω*<sup>i</sup>* is pure. Suppose that for some *t* ∈ (0,1),

$$
\omega^{\infty} = t \omega' + (1 - t) \omega'',\tag{C.331}
$$

and restrict this equality to ϕ<sub>*n*</sub>(*C*<sup>*n*</sup>). By the previous argument, the restriction of ω<sup>∞</sup> to ϕ<sub>*n*</sub>(*C*<sup>*n*</sup>), which is just ω<sup>*n*</sup>, is pure for any *n* ∈ N. This gives ω′|<sub>ϕ<sub>*n*</sub>(*C*<sup>*n*</sup>)</sub> = ω″|<sub>ϕ<sub>*n*</sub>(*C*<sup>*n*</sup>)</sub>. This is true for each *n*, so that ω′ = ω″ on the dense subspace ∪<sub>*n*</sub> ϕ<sub>*n*</sub>(*C*<sup>*n*</sup>) and hence, by continuity, on all of *C*<sup>∞</sup>. Hence ω<sup>∞</sup> is pure. □
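The factorization property (C.327) may be checked numerically in the simplest finite case. The sketch below assumes *C*<sub>1</sub> = *C*<sub>2</sub> = *M*<sub>2</sub>(C), with states of the form *c* ↦ Tr(ρ*c*) for density matrices ρ, and realizes the tensor product by Kronecker products; the concrete matrices and the helper names are our own choices for illustration. It also uses the standard fact that, on a matrix algebra, such a state is pure iff Tr(ρ<sup>2</sup>) = 1.

```python
def matmul(a, b):
    """Product of two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Kronecker product, representing a (x) b on C^2 (x) C^2 = C^4."""
    return [[a[i][j] * b[k][l]
             for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def state(rho):
    """The state c -> Tr(rho c) defined by a density matrix rho."""
    return lambda c: trace(matmul(rho, c))


def purity(rho):
    """Tr(rho^2); equal to 1 exactly for pure states on a matrix algebra."""
    return trace(matmul(rho, rho))


rho1 = [[0.75, 0.0], [0.0, 0.25]]   # a mixed state on M_2(C)
rho2 = [[0.5, 0.5], [0.5, 0.5]]     # a pure state on M_2(C)
c1 = [[1.0, 2.0], [3.0, 4.0]]
c2 = [[0.0, 1j], [-1j, 2.0]]

omega1, omega2 = state(rho1), state(rho2)
omega12 = state(kron(rho1, rho2))   # candidate for the product state

lhs = omega12(kron(c1, c2))         # left-hand side of (C.327) for n = 2
rhs = omega1(c1) * omega2(c2)       # right-hand side of (C.327) for n = 2
unit = kron([[1, 0], [0, 1]], [[1, 0], [0, 1]])
```

In this example the product state is not pure, in line with the proposition, because `rho1` is mixed: `purity(kron(rho1, rho2))` equals the product of the two purities.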

#### C.15 Gelfand isomorphism and Fourier theory

One of the most beautiful applications of Theorem C.8 is to commutative harmonic analysis. Let *G* be an *abelian* locally compact Hausdorff group (e.g., *G* = R, *G* = Z, or *G* = T). Such groups have an invariant *Haar measure dx*, which satisfies

$$\int\_{G} dx\, L\_{y} f(x) = \int\_{G} dx\, f(x^{-1}) = \int\_{G} dx\, f(x),\tag{C.332}$$

for any *f* ∈ *C*<sub>*c*</sub>(*G*) and *y* ∈ *G*, where

$$L\_{y}f(x) = f(y^{-1}x).\tag{C.333}$$

This measure is unique up to rescaling; if *G* is compact, it is normalized such that ∫<sub>*G*</sub> *dx* = 1. For *G* = R, this recovers Lebesgue measure on R, whilst for Z and T,

$$\int\_{\mathbb{Z}} dx \, f(x) = \sum\_{n \in \mathbb{Z}} f(n);\tag{C.334}$$

$$\int\_{\mathbb{T}} dx \, f(x) = \int\_0^{2\pi} \frac{d\theta}{2\pi} \, f(e^{i\theta}). \tag{C.335}$$

For *f*, *g* ∈ *C*<sub>*c*</sub>(*G*), the *convolution product* *f* ∗ *g* is defined by

$$f \ast g(x) = \int\_{G} dy \, f(y) g(y^{-1}x). \tag{C.336}$$

Using (C.332), it is easy to verify that this product is commutative and associative. Also, one may define an involution on *C*<sub>*c*</sub>(*G*) by

$$f^\*(x) = \overline{f(x^{-1})}.\tag{C.337}$$
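For a finite group these algebraic properties can be verified by direct computation. The following sketch assumes *G* = Z<sub>*p*</sub> written additively, with counting measure as Haar measure (our choice of normalization), so that functions on *G* are lists of length *p*; the sample functions and the helper names are hypothetical.

```python
p = 5  # assumption: G = Z_p (additive), counting measure as Haar measure


def conv(f, g):
    """(f * g)(x) = sum_y f(y) g(x - y): the additive form of (C.336)."""
    return [sum(f[y] * g[(x - y) % p] for y in range(p)) for x in range(p)]


def star(f):
    """f^*(x) = complex conjugate of f(-x): the additive form of (C.337)."""
    return [complex(f[(-x) % p]).conjugate() for x in range(p)]


def close(u, v, tol=1e-12):
    """Entrywise comparison of two functions on Z_p."""
    return all(abs(a - b) < tol for a, b in zip(u, v))


f = [1, 2j, 0, -1, 3]
g = [0.5, 0, 1 - 1j, 2, 0]
h = [1j, 1, 0, 0, -2]

commutative = close(conv(f, g), conv(g, f))
associative = close(conv(conv(f, g), h), conv(f, conv(g, h)))
# The involution reverses products: (f * g)^* = g^* * f^*.
antimultiplicative = close(star(conv(f, g)), conv(star(g), star(f)))
```

All three flags evaluate to `True`, so that (at least in this example) convolution and the involution (C.337) behave as the product and adjoint of a commutative ∗-algebra.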

We would now like to turn *C*<sub>*c*</sub>(*G*) into a commutative C\*-algebra, but the obvious norms like the *L*<sup>*p*</sup>-ones do not accomplish this. Instead, for *f* ∈ *C*<sub>*c*</sub>(*G*) we define an operator π(*f*) on the Hilbert space *L*<sup>2</sup>(*G*) (defined with respect to Haar measure) by

$$
\pi(f)\psi = f \ast \psi,\tag{C.338}
$$

initially for ψ ∈ *C*<sub>*c*</sub>(*G*). Equivalently, we may write

$$
\pi(f) = \int\_G dy \, f(y) L\_{y},\tag{C.339}
$$

where we regard *L*<sub>*y*</sub> as an (obviously unitary) operator on *L*<sup>2</sup>(*G*), and the integral is most easily defined weakly, i.e., π(*f*) is the unique bounded operator for which

$$
\langle \varphi, \pi(f)\psi \rangle = \int\_{G} dy \, f(y) \langle \varphi, L\_{y}\psi \rangle. \tag{C.340}
$$

Since *L*<sub>*y*</sub> is unitary, this formula also shows that |⟨ϕ, π(*f*)ψ⟩| ≤ ‖*f*‖<sub>1</sub> ‖ϕ‖ ‖ψ‖, where ‖*f*‖<sub>1</sub> = ∫<sub>*G*</sub> *dx* |*f*(*x*)|. Taking ϕ = π(*f*)ψ gives ‖π(*f*)ψ‖ ≤ ‖*f*‖<sub>1</sub> ‖ψ‖, whence

$$\|\pi(f)\| \le \|f\|\_1. \tag{C.341}$$

Hence π(*f*) is bounded and extends from *C*<sub>*c*</sub>(*G*) to all of *L*<sup>2</sup>(*G*) by continuity.

Lemma C.106. *The map* *f* ↦ π(*f*) *from* *C*<sub>*c*</sub>(*G*) *to* *B*(*L*<sup>2</sup>(*G*)) *is injective and satisfies*

$$
\pi(f \ast g) = \pi(f)\pi(g);\tag{C.342}
$$

$$
\pi(f^\*) = \pi(f)^\*. \tag{C.343}
$$

*Proof.* Eq. (C.342) follows from associativity of convolution, and (C.343) follows from the last equality in (C.332). To prove injectivity, we fix *f* ∈ *C*<sub>*c*</sub>(*G*), pick ε > 0, and find a neighbourhood *U* of *e* ∈ *G* such that *y*<sup>−1</sup>*x* ∈ *U* implies |*f*(*y*) − *f*(*x*)| < ε. Then, using Urysohn's Lemma, one may find a positive function ψ<sub>*U*</sub> ∈ *C*<sub>*c*</sub>(*U*) such that ∫<sub>*U*</sub> ψ<sub>*U*</sub> = 1. Injectivity of π then immediately follows from the easy estimate below: if π(*f*) = 0, then *f* ∗ ψ<sub>*U*</sub> = 0, so that |*f*(*x*)| < ε for all *x* ∈ *G* and all ε > 0, i.e., *f* = 0.

$$|f \ast \psi\_U(x) - f(x)| \le \int\_G dy \, |f(y) - f(x)| \cdot |\psi\_U(y^{-1}x)| < \varepsilon. \qquad \Box$$

Definition C.107. *Let G be an abelian locally compact Hausdorff group. The* group C\*-algebra *C*\*(*G*) *is the norm closure of* π(*C*<sub>*c*</sub>(*G*)) *in* *B*(*L*<sup>2</sup>(*G*)), *with norm*

$$\|f\|\_{C^\*} = \|\pi(f)\|\_{B(L^2(G))}.\tag{C.344}$$

Since π(*C*<sub>*c*</sub>(*G*)) is a commutative ∗-algebra in *B*(*L*<sup>2</sup>(*G*)) by Lemma C.106, it is easy to see (from joint continuity of multiplication) that its norm closure *C*\*(*G*) is a commutative C\*-algebra, whose Gelfand spectrum we wish to compute.

To this effect, we first define the *dual group* or *character group* *Ĝ* of *G* as

$$
\hat{G} = \text{Hom}(G, \mathbb{T}),
\tag{C.345}
$$

i.e., the set of continuous group homomorphisms from *G* to T, equipped with the *compact-open topology*. This topology is defined as the restriction to Hom(*G*,T) of the topology on *C*(*G*,C) generated by the following neighbourhood basis of each γ ∈ *Ĝ*:

$$O(\gamma, K, \varepsilon) = \{ \varphi \in \hat{G} : |\gamma(x) - \varphi(x)| < \varepsilon \;\forall x \in K \}, \tag{C.346}$$

where *K* ∈ 𝒦(*G*), i.e., *K* ⊂ *G* is compact, and ε > 0. The corresponding notion of convergence is uniform convergence on each compact subset of *G*; in particular, if *G* is compact, this is just uniform convergence. Equipped with this topology, it can be shown that *Ĝ* is itself an abelian locally compact Hausdorff group under pointwise operations, i.e.,

$$(\gamma\_1\gamma\_2)(x) = \gamma\_1(x)\gamma\_2(x);\tag{C.347}$$

$$
\gamma^{-1}(x) = \overline{\gamma(x)};\tag{C.348}
$$

hence the ensuing unit *ê* in *Ĝ* is the constant function *ê* = 1<sub>*G*</sub> in Hom(*G*,T), i.e., 1<sub>*G*</sub>(*x*) = 1 for all *x* ∈ *G*.

Proposition C.108. *We have the following examples of dual groups:*

$$
\hat{\mathbb{Z}} \cong \mathbb{T}, \quad \gamma\_{z}(n) = z^n; \tag{C.349}
$$

$$\hat{\mathbb{R}} \cong \mathbb{R}, \quad \gamma\_p(x) = e^{i p x}; \tag{C.350}$$

$$
\hat{\mathbb{T}} \cong \mathbb{Z}, \quad \gamma\_{n}(z) = z^{n}; \tag{C.351}
$$

$$
\hat{\mathbb{Z}}\_p \cong \mathbb{Z}\_p, \quad \gamma\_{[m]}([n]) = e^{2\pi imn/p}. \tag{C.352}
$$

Here Z<sub>*p*</sub> = Z/(*p*·Z) is the (finite) group of integers mod *p*.
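The last example lends itself to a direct numerical check. The sketch below takes *p* = 6 (an arbitrary choice) and verifies that the functions in (C.352) are characters, that their pointwise product (C.347) corresponds to addition of labels mod *p*, and that distinct characters are orthogonal with respect to counting measure; the function names are ours.

```python
import cmath

p = 6  # assumption: any positive integer would do


def gamma(m):
    """The character gamma_[m]([n]) = exp(2 pi i m n / p) of Z_p, cf. (C.352)."""
    return lambda n: cmath.exp(2j * cmath.pi * m * n / p)


def is_character(chi, tol=1e-12):
    """Check chi(x + y) = chi(x) chi(y) and |chi(x)| = 1 on all of Z_p."""
    return all(abs(chi((x + y) % p) - chi(x) * chi(y)) < tol
               and abs(abs(chi(x)) - 1) < tol
               for x in range(p) for y in range(p))


all_characters = all(is_character(gamma(m)) for m in range(p))

# Pointwise multiplication of characters is addition of labels mod p.
product_rule = all(
    abs(gamma(m)(n) * gamma(k)(n) - gamma((m + k) % p)(n)) < 1e-12
    for m in range(p) for k in range(p) for n in range(p))


def inner(m, k):
    """Sum over n of gamma_m(n) times the conjugate of gamma_k(n)."""
    return sum(gamma(m)(n) * gamma(k)(n).conjugate() for n in range(p))


orthogonal = all(abs(inner(m, k) - (p if m == k else 0)) < 1e-9
                 for m in range(p) for k in range(p))
```

The product rule is the finite counterpart of the group isomorphism between the dual of Z<sub>*p*</sub> and Z<sub>*p*</sub> itself.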

*Proof.* For (C.349), any character γ : Z → T is determined by its value γ(1) = *z*, since for *n* > 0 we have γ(*n*) = γ(1 + ··· + 1) = γ(1)<sup>*n*</sup> = *z*<sup>*n*</sup>, where the sum has *n* terms; for *n* < 0, we obtain the same result from γ(*n*) = γ(−*n*)<sup>−1</sup> = (*z*<sup>−*n*</sup>)<sup>−1</sup> = *z*<sup>*n*</sup>.

To prove (C.350), we need to solve γ(*x* + *y*) = γ(*x*)γ(*y*) with γ(0) = 1, where γ : R → T is continuous. To see that (C.350) gives all solutions, find ε > 0 for which ∫<sub>0</sub><sup>ε</sup> *dy* γ(*y*) ≡ *a* ≠ 0; this is possible, since γ(0) = 1 and γ is continuous. Then

$$\int\_{0}^{\varepsilon} dy \,\gamma(y)\,\gamma(x) = \int\_{0}^{\varepsilon} dy \,\gamma(x + y) = \int\_{x}^{\varepsilon + x} dy \,\gamma(y),\tag{C.353}$$

so that γ is differentiable, with, writing γ̇ for *d*γ/*dx*,

$$a\dot{\gamma}(x) = \gamma(\varepsilon + x) - \gamma(x) = (\gamma(\varepsilon) - 1)\gamma(x).\tag{C.354}$$

Hence γ̇(*x*) = *c*γ(*x*) with *c* = (γ(ε) − 1)/*a*, so that γ(*x*) = exp(*cx*). Since |γ(*x*)| = 1, this forces *c* = *ip* for some *p* ∈ R. This also implies (C.351), since T = R/Z and hence the characters of T are those characters of R that map Z to 1. Similarly, (C.352) follows from (C.349): the characters on Z that are trivial on *p*·Z take the form γ(*n*) = *z*<sup>*n*</sup> for some *p*-th root of unity *z* = exp(2π*im*/*p*), *m* ∈ {1,..., *p*}. □

Theorem C.109. *Let G be an abelian locally compact Hausdorff group. Then the Gelfand spectrum* Σ(*C*\*(*G*)) *is homeomorphic to* *Ĝ*, *and the Gelfand isomorphism*

$$C^\*(G) \cong C\_0(\hat{G}) \tag{C.355}$$

*is given on the dense subspace* *C*<sub>*c*</sub>(*G*) ⊂ *C*\*(*G*) *by the generalized Fourier transform*

$$
\hat{f}(\gamma) = \int\_G dx \,\overline{\gamma(x)} f(x).\tag{C.356}
$$

Thus the Fourier transform is a special case of the Gelfand transform (which is noteworthy if only because Gelfand himself promulgated the unity of mathematics).
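In the finite case the theorem reduces to familiar properties of the discrete Fourier transform, which can be checked directly. The following sketch assumes *G* = Z<sub>*p*</sub> (additive) with counting measure, labels the characters by *m* as in (C.352), and uses sample functions of our own choosing. It verifies that the transform (C.356) turns convolution into pointwise multiplication, turns the involution (C.337) into complex conjugation, and is bounded by the *L*<sup>1</sup>-norm.

```python
import cmath

p = 7  # assumption: G = Z_p (additive), counting measure as Haar measure


def character(m, x):
    """gamma_[m]([x]) = exp(2 pi i m x / p), cf. (C.352)."""
    return cmath.exp(2j * cmath.pi * m * x / p)


def fourier(f):
    """hat f(gamma_m) = sum_x conj(gamma_m(x)) f(x): finite form of (C.356)."""
    return [sum(character(m, x).conjugate() * f[x] for x in range(p))
            for m in range(p)]


def conv(f, g):
    """Convolution on Z_p, the additive form of (C.336)."""
    return [sum(f[y] * g[(x - y) % p] for y in range(p)) for x in range(p)]


def star(f):
    """Involution on functions on Z_p, the additive form of (C.337)."""
    return [complex(f[(-x) % p]).conjugate() for x in range(p)]


def close(u, v, tol=1e-9):
    return all(abs(a - b) < tol for a, b in zip(u, v))


f = [1, 2j, 0, -1, 3, 0.5, -2j]
g = [0, 1, 1 - 1j, 0, 2, 0, 1j]

multiplicative = close(fourier(conv(f, g)),
                       [a * b for a, b in zip(fourier(f), fourier(g))])
star_to_conjugate = close(fourier(star(f)),
                          [a.conjugate() for a in fourier(f)])
l1_bound = max(abs(a) for a in fourier(f)) <= sum(abs(a) for a in f) + 1e-12
```

Each value of the transform is thus a multiplicative, ∗-preserving functional of *f*, which is what it means for evaluation at a character to define a point of the Gelfand spectrum.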

*Proof.* We will prove that each character γ ∈ *Ĝ* on *G* defines a character ω<sub>γ</sub> on *C*\*(*G*) by continuous extension (i.e., from its dense subspace *C*<sub>*c*</sub>(*G*) to *C*\*(*G*)) of

$$\omega\_{\gamma}(f) = \hat{f}(\gamma),\tag{C.357}$$

as in (C.356), and that the map γ ↦ ω<sub>γ</sub> gives a homeomorphism *Ĝ* → Σ(*C*\*(*G*)). It follows from simple computations that for *f*, *g* ∈ *C*<sub>*c*</sub>(*G*), one has

$$\omega\_{\gamma}(f \ast g) = \omega\_{\gamma}(f)\,\omega\_{\gamma}(g);\tag{C.358}$$

$$\omega\_{\gamma}(f^\*) = \overline{\omega\_{\gamma}(f)}. \tag{C.359}$$

To finish the proof, we need three further nontrivial facts about the map γ ↦ ω<sub>γ</sub>:

1. It is a bijection between *Ĝ* and Σ(*C*\*(*G*)): every character on *C*\*(*G*) is of the form ω<sub>γ</sub> for a unique γ ∈ *Ĝ*.

2. Each ω<sub>γ</sub> is bounded with respect to the C\*-norm, i.e., for each *f* ∈ *C*<sub>*c*</sub>(*G*),

$$|\omega\_{\gamma}(f)| \le \|f\|\_{C^\*}.\tag{C.360}$$

Thus ω<sub>γ</sub> : *C*<sub>*c*</sub>(*G*) → C may be extended to *C*\*(*G*) by continuity in the usual way.

3. The compact-open topology on *Ĝ* is mapped to the Gelfand topology on Σ(*C*\*(*G*)).

To prove the first point, we restrict a character ω : *C*\*(*G*) → C to *C*<sub>*c*</sub>(*G*) and note that because of the bound (C.341), this restriction in turn extends to an element of *L*<sup>1</sup>(*G*)\*, which we still call ω. Entry 10 in Table B.1 gives *L*<sup>1</sup>(*X*)\* ≅ *L*<sup>∞</sup>(*X*), in the sense that any ϕ ∈ *L*<sup>1</sup>(*X*)\* is given by ϕ<sub>*f*</sub>(*g*) = ∫<sub>*X*</sub> *fg* for some *f* ∈ *L*<sup>∞</sup>(*X*). Hence

$$\omega(f) = \int\_G dx \, \tilde{\omega}(x) f(x),\tag{C.361}$$

where ω̃ ∈ *L*<sup>∞</sup>(*G*). The multiplicative property ω(*f* ∗ *g*) = ω(*f*)ω(*g*) then gives

$$
\tilde{\omega}(xy) = \tilde{\omega}(x)\tilde{\omega}(y)\tag{C.362}
$$

almost everywhere (a.e.) with respect to Haar measure.

To prove continuity of ω̃, compare the following expressions with *f*, *g* ∈ *C*<sub>*c*</sub>(*G*):

$$
\omega(f)\omega(g) = \omega(f) \int\_G dx \,\tilde{\omega}(x)g(x);
$$

$$
\omega(f \ast g) = \int\_G dx \,\omega(L\_x f)g(x).
$$

These must coincide, so if we pick some *f* ∈ *C*<sub>*c*</sub>(*G*) for which ω(*f*) ≠ 0 (which is possible since *C*<sub>*c*</sub>(*G*) is dense in *C*\*(*G*) and ω is not identically zero), then we obtain

$$\tilde{\omega}(x) = \omega(L\_{x}f)/\omega(f),\tag{C.363}$$

almost everywhere. Hence we may redefine ω̃ by (C.363) for all *x* ∈ *G*. Since

$$|\omega(L\_{x}f) - \omega(L\_{y}f)| \le \|L\_{x}f - L\_{y}f\|\_{C^\*} \le \|L\_{x}f - L\_{y}f\|\_1 \le C\|L\_{x}f - L\_{y}f\|\_{\infty},\tag{C.364}$$

recalling that *f* has compact support, it follows that the function *x* ↦ ω(*L*<sub>*x*</sub>*f*) is continuous, whence also ω̃ as redefined by (C.363) is continuous.

We now show that ω̃(*x*) ∈ T. If |ω̃(*x*)| > 1, then ω̃ cannot be bounded (whereas we know it lies in *L*<sup>∞</sup>(*G*)), because ω̃(*x*<sup>*n*</sup>) = ω̃(*x*)<sup>*n*</sup> by (C.362). But the same is true if |ω̃(*x*)| < 1, because using ω̃(*x*<sup>−1</sup>) = ω̃(*x*)<sup>−1</sup> (which follows from (C.362) and (C.363), which gives ω̃(*e*) = 1), the same argument applies with *x*<sup>−1</sup> instead of *x*. Thus ω̃ : *G* → T is a character, which we write as γ̄ with γ ∈ *Ĝ* (where the bar is conventional), so that (C.361) turns into (C.356). As to injectivity, if *f̂*(γ) = *f̂*(γ′) for all *f* ∈ *C*<sub>*c*</sub>(*G*), then

$$\int\_{G} dx \left( \overline{\gamma(x)} - \overline{\gamma'(x)} \right) f(x) = 0,\tag{C.365}$$

for all such *f*, which by standard integration theory gives γ = γ′ a.e. and hence everywhere, since both functions are continuous. To prove (C.360), we use a trick: take some fixed ω<sub>0</sub> ∈ Σ(*C*\*(*G*)), so that ω<sub>0</sub>(*f*) = *f̂*(γ<sub>0</sub>) for some γ<sub>0</sub> ∈ *Ĝ* by the previous step of the proof, and |ω<sub>0</sub>(*f*)| ≤ ‖*f*‖<sub>*C*\*</sub> for all *f* ∈ *C*\*(*G*). For γ ∈ *Ĝ* and *f* ∈ *C*<sub>*c*</sub>(*G*), eqs. (C.356) and (C.347) give ω<sub>γ</sub>(*f*) = ω<sub>0</sub>(γ̄γ<sub>0</sub>*f*), where γ̄γ<sub>0</sub>*f* is the *pointwise* product of the three given functions from *G* to C. Hence

$$|\omega\_{\gamma}(f)| = |\omega\_0(\overline{\gamma}\gamma\_0 f)| \le \|\pi(\overline{\gamma}\gamma\_0 f)\| = \|\overline{\gamma}\gamma\_0 f\|\_{C^\*}.\tag{C.366}$$

We now denote γ̄γ<sub>0</sub> by γ′, which lies in *Ĝ*, and note that for any γ′ ∈ *Ĝ*, we have

$$
\langle \varphi, \pi(\gamma' f) \psi \rangle = \langle \overline{\gamma}' \varphi, \pi(f)(\overline{\gamma}' \psi) \rangle \quad (\varphi, \psi \in L^{2}(G),\ f \in C\_{c}(G)). \tag{C.367}
$$

Taking ϕ = π(γ′*f*)ψ and using Cauchy–Schwarz as well as ‖γ̄′ϕ‖ = ‖ϕ‖, gives

$$\|\pi(\gamma' f)\psi\| \le \|\pi(f)\overline{\gamma}'\psi\| \quad (\psi \in L^2(G),\ f \in C\_c(G),\ \gamma' \in \hat{G}). \tag{C.368}$$

Taking the sup over all ψ ∈ *L*<sup>2</sup>(*G*) with ‖ψ‖ = 1 (which also means ‖γ̄′ψ‖ = 1) gives ‖π(γ′*f*)‖ ≤ ‖π(*f*)‖. Combined with (C.366) and the definition (C.344) of the C\*-norm, this gives the bound (C.360), i.e.,

$$|\omega\_{\gamma}(f)| \le \|f\|\_{C^\*}.\tag{C.369}$$

We now prove continuity of the map ω<sub>γ</sub> ↦ γ from Σ(*C*\*(*G*)) to *Ĝ* (using sequences for simplicity, the argument for nets being similar). We claim that if ω<sub>γ<sub>*n*</sub></sub> → ω<sub>γ</sub>, i.e., *f̂*(γ<sub>*n*</sub>) → *f̂*(γ) for each *f* ∈ *C*\*(*G*), and hence for each *f* ∈ *C*<sub>*c*</sub>(*G*), then γ<sub>*n*</sub> → γ uniformly on any *K* ∈ 𝒦(*G*). Writing γ′<sub>*n*</sub> = γ<sub>*n*</sub>γ̄ and *g* = *f*γ̄, we first notice that

$$|\gamma\_n(x) - \gamma(x)| = |\gamma\_n'(x) - 1|;\tag{C.370}$$

$$
\hat{f}(\gamma\_{n}) - \hat{f}(\gamma) = \hat{g}(\gamma\_{n}') - \hat{g}(1\_G). \tag{C.371}
$$

This shows that we may reduce the proof to the case γ = 1<sub>*G*</sub>; otherwise, simply change γ<sub>*n*</sub> to γ′<sub>*n*</sub>. Thus we assume that *f̂*(γ<sub>*n*</sub>) → *f̂*(1<sub>*G*</sub>) for each *f* ∈ *C*<sub>*c*</sub>(*G*). We now pick some fixed *g* ∈ *C*<sub>*c*</sub>(*G*) such that *ĝ*(1<sub>*G*</sub>) = ∫<sub>*G*</sub> *dx* *g*(*x*) = 1. For ε > 0, by uniform continuity there is a neighbourhood *U* of the identity *e* ∈ *G* such that, cf. (C.364),

$$\|L\_{u}g - g\|\_{1} < \varepsilon/3 \quad (u \in U).\tag{C.372}$$

Then ∪<sub>*x*∈*G*</sub> *xU* covers *G*, and hence also covers each compact set *K* ⊂ *G*. Therefore, *K* has a finite subcover ∪<sub>*j*∈*J*</sub> *x*<sub>*j*</sub>*U*. Define *g*<sub>*j*</sub> = *L*<sub>*x*<sub>*j*</sub></sub>*g*. By invariance of the Haar measure, we have *ĝ*<sub>*j*</sub>(1<sub>*G*</sub>) = 1, so that by definition of ω<sub>γ<sub>*n*</sub></sub> → ω<sub>1<sub>*G*</sub></sub>, we may find *N* ∈ N such that for each *j* ∈ *J* and for all *n* > *N*, we have

$$|\hat{g}\_{j}(\gamma\_{n}) - 1| < \varepsilon/3.\tag{C.373}$$

Also, if *x* ∈ *K*, then *x* = *x*<sub>*j*</sub>*u* for some *j* ∈ *J* and *u* ∈ *U*. Eq. (C.372) then implies

$$|\hat{g}\_{j}(\gamma\_{n}) (\gamma\_{n}(x) - 1)| = \left| \int\_{G} dy \left( L\_{u} g(y) - g(y) \right) \overline{\gamma\_{n}(y)} \right| < \varepsilon/3. \tag{C.374}$$

Hence for any *K* ∈ 𝒦(*G*) and *x* ∈ *K* as above, we may estimate, for all *n* > *N*,

$$\begin{split} |\gamma\_{n}(x) - 1| &\leq |\gamma\_{n}(x)(1 - \hat{g}\_{j}(\gamma\_{n}))| + |\hat{g}\_{j}(\gamma\_{n})(\gamma\_{n}(x) - 1)| \\ &+ |\hat{g}\_{j}(\gamma\_{n}) - 1| < \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon. \end{split} \tag{C.375}$$

Consequently, *f̂*(γ<sub>*n*</sub>) → *f̂*(1<sub>*G*</sub>) for each *f* ∈ *C*<sub>*c*</sub>(*G*) implies γ<sub>*n*</sub> → 1<sub>*G*</sub> in *Ĝ*; as we have argued, this proves continuity of the bijection Σ(*C*\*(*G*)) → *Ĝ* given by ω<sub>γ</sub> ↦ γ.

If Σ(*C*\*(*G*)) and *Ĝ* are compact (which is the case iff *G* is discrete, in which case *C*\*(*G*) has a unit δ<sub>*e*</sub>) we are ready, since a continuous bijection from a compact space to a Hausdorff space has a continuous inverse, and hence is a homeomorphism (in our case, both spaces are compact as well as Hausdorff). In general, continuity of the map γ ↦ ω<sub>γ</sub> from *Ĝ* to Σ(*C*\*(*G*)) almost immediately follows from the definition of the compact-open topology on *Ĝ*: if γ<sub>*n*</sub> → γ in this topology (similarly for nets), and *f* ∈ *C*<sub>*c*</sub>(*G*), then *f̂*(γ<sub>*n*</sub>) → *f̂*(γ), and hence ω<sub>γ<sub>*n*</sub></sub>(*f*) → ω<sub>γ</sub>(*f*). A simple ε/3-argument then gives the same result for *f* ∈ *C*\*(*G*). □

Note that local compactness of *Ĝ* (though provable directly) also follows from this theorem, since we know this for the Gelfand spectrum Σ(*C*\*(*G*)), cf. Theorem C.45.

Beside the Gelfand isomorphism (C.355), in which the two function spaces *C*\*(*G*) and *C*<sub>0</sub>(*Ĝ*) are of a different type, there exist more symmetric versions of the generalized Fourier transform (C.356). In the setting of Banach spaces (as opposed to spaces of distributions, which would take us into the territory of locally convex topological vector spaces, and hence outside the scope of this appendix, though cf. §5.11), there are (at least) two natural possibilities. The traditional and most familiar one is provided by the Hilbert spaces *L*<sup>2</sup>(*G*) and *L*<sup>2</sup>(*Ĝ*), defined with respect to suitably normalized Haar measures *dx* (on *G*) and *d*γ (on *Ĝ*), respectively. A second, more recent possibility is to use the following two Banach spaces.

Definition C.110. *The Banach space* *C*\*<sub>0</sub>(*G*) *is the completion of* *C*<sub>*c*</sub>(*G*) *in the norm*

$$\|f\|\_0 = \max\{\|f\|\_\infty, \|\hat{f}\|\_\infty\}.\tag{C.376}$$

*Similarly, the Banach space* *C*\*<sub>0</sub>(*Ĝ*) *is the completion of* *C*<sub>*c*</sub>(*Ĝ*) *in the norm*

$$\|\zeta\|\_{0} = \max\{\|\zeta\|\_{\infty}, \|\check{\zeta}\|\_{\infty}\}.\tag{C.377}$$

It follows that *C*\*<sub>0</sub>(*G*) can be norm-decreasingly injected into both *C*\*(*G*) and *C*<sub>0</sub>(*G*), so that *C*\*<sub>0</sub>(*G*) is a subspace of *C*<sub>0</sub>(*G*) as well as of *C*\*(*G*). By (C.341) and (C.360),

$$L^1(G) \cap C\_0(G) \subset C\_0^\*(G),\tag{C.378}$$

and similarly for *C*\*<sub>0</sub>(*Ĝ*). Indeed, *C*\*<sub>0</sub>(*G*) (and likewise *C*\*<sub>0</sub>(*Ĝ*)) could equivalently have been defined as the completion of *L*<sup>1</sup>(*G*) ∩ *C*<sub>0</sub>(*G*) in the norm (C.376).

Theorem C.111. *The Fourier transform* (C.356) *induces isometric isomorphisms*

$$L^2(G) \cong L^2(\hat{G});\tag{C.379}$$

$$\mathcal{C}\_0^\*(G) \cong \mathcal{C}\_0^\*(\hat{G}),\tag{C.380}$$

*such that, on suitably normalizing dx and d*γ*, the* Fourier inversion formula

$$f(x) = \int\_{\hat{G}} d\gamma \,\gamma(x) \hat{f}(\gamma),\tag{C.381}$$

*cf.* (C.356), *in both cases holds* verbatim *whenever* *f* ∈ *L*<sup>1</sup>(*G*) *and* *f̂* ∈ *L*<sup>1</sup>(*Ĝ*), *in which case* *f* *and* *f̂* *are continuous, and* (C.356) *and* (C.381) *hold pointwise.*
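For a finite group both statements of the theorem can be tested numerically. The sketch below assumes *G* = Z<sub>*p*</sub> with counting measure *dx*, in which case the matching normalization on the dual group is *d*γ = (counting measure)/*p*; the sample function and the helper names are ours. It checks the inversion formula (C.381), its value at the unit, and the isometry (C.379).

```python
import cmath

p = 8  # assumption: G = Z_p, dx = counting measure, d gamma = counting / p


def character(m, x):
    """gamma_[m]([x]) = exp(2 pi i m x / p), cf. (C.352)."""
    return cmath.exp(2j * cmath.pi * m * x / p)


def fourier(f):
    """hat f(gamma_m) = sum_x conj(gamma_m(x)) f(x): finite form of (C.356)."""
    return [sum(character(m, x).conjugate() * f[x] for x in range(p))
            for m in range(p)]


def inverse(fhat):
    """f(x) = (1/p) sum_m gamma_m(x) hat f(gamma_m): finite form of (C.381)."""
    return [sum(character(m, x) * fhat[m] for m in range(p)) / p
            for x in range(p)]


f = [1, 2j, 0, -1, 3, 0.5, -2j, 4]
fhat = fourier(f)

inversion_ok = all(abs(a - b) < 1e-9 for a, b in zip(inverse(fhat), f))
# Value at the unit e = 0: f(e) is the integral of hat f over the dual group.
value_at_unit_ok = abs(sum(fhat) / p - f[0]) < 1e-9
# Isometry L^2(G) -> L^2(dual of G) with the normalizations chosen above.
plancherel_ok = abs(sum(abs(a) ** 2 for a in f)
                    - sum(abs(a) ** 2 for a in fhat) / p) < 1e-9
```

With a different normalization of *dx* the factor 1/*p* would have to be adjusted accordingly, which is the finite shadow of the phrase "on suitably normalizing" in the theorem.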

The Fourier inversion formula (C.381) is actually equivalent to its special case

$$f(e) = \int\_{\hat{G}} d\gamma\, \hat{f}(\gamma),\tag{C.382}$$

where *e* ∈ *G* is the unit, since (C.381) follows by substituting *L*<sub>*x*<sup>−1</sup></sub>*f* for *f* and using

$$
\widehat{L\_{x^{-1}} f}(\gamma) = \gamma(x) \hat{f}(\gamma).\tag{C.383}
$$

It is also important to realize that conceptually, the inversion formula (C.381) reads

$$
\check{\hat{f}}(\hat{x}) = f(x^{-1}),
\tag{C.384}
$$

where the Fourier transform ζ̌ for suitable ζ : *Ĝ* → C is defined, as in (C.356), by

$$
\check{\zeta}(\chi) = \int\_{\hat{G}} d\gamma \,\overline{\chi(\gamma)} \,\zeta(\gamma). \tag{C.385}
$$

Here χ : *Ĝ* → T is some character on *Ĝ*, i.e., χ lies in the double dual of *G* (the dual group of *Ĝ*), and we have a natural map

$$G \to \hat{\hat{G}};\tag{C.386}$$

$$
x \mapsto \hat{x};\tag{C.387}
$$

$$
\hat{x}(\gamma) = \gamma(x).\tag{C.388}
$$

*Pontryagin duality* states that (C.386) - (C.388) define an isomorphism, i.e.,

$$
\hat{\hat{G}} \cong G.\tag{C.389}
$$

We omit the lengthy proof of this beautiful isomorphism of topological groups (cf. the examples in Proposition C.108), and turn to the proof of Theorem C.111.
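For the finite group Z<sub>*p*</sub>, both the map (C.386) - (C.388) and the conceptual form (C.384) of the inversion formula can be checked directly. The sketch below assumes *p* = 5, counting measure on *G*, and *d*γ = (counting measure)/*p* on the dual group; a function on the dual group is stored as a list indexed by the label *m* of (C.352), and all helper names are ours.

```python
import cmath

p = 5  # assumption: G = Z_p, dx = counting measure, d gamma = counting / p


def character(m, x):
    """gamma_[m]([x]) = exp(2 pi i m x / p), cf. (C.352)."""
    return cmath.exp(2j * cmath.pi * m * x / p)


def hat_x(x):
    """The character x^ on the dual group, x^(gamma_m) = gamma_m(x), cf. (C.388)."""
    return [character(m, x) for m in range(p)]


def fourier(f):
    """hat f(gamma_m) = sum_x conj(gamma_m(x)) f(x): finite form of (C.356)."""
    return [sum(character(m, x).conjugate() * f[x] for x in range(p))
            for m in range(p)]


def check_at(zeta, x):
    """The transform (C.385) of zeta, evaluated at the character chi = x^."""
    chi = hat_x(x)
    return sum(chi[m].conjugate() * zeta[m] for m in range(p)) / p


# x -> x^ is a homomorphism: (x + y)^ = x^ y^ pointwise on the dual group.
homomorphism = all(
    abs(hat_x((x + y) % p)[m] - hat_x(x)[m] * hat_x(y)[m]) < 1e-12
    for x in range(p) for y in range(p) for m in range(p))

# x -> x^ is injective: distinct points give distinct characters.
injective = all(
    max(abs(a - b) for a, b in zip(hat_x(x), hat_x(y))) > 1e-6
    for x in range(p) for y in range(p) if x != y)

f = [1, 2j, 0, -1, 3]
double_transform = [check_at(fourier(f), x) for x in range(p)]
reflected = [f[(-x) % p] for x in range(p)]   # x -> f(x^{-1}) in additive notation
double_transform_ok = all(abs(a - b) < 1e-9
                          for a, b in zip(double_transform, reflected))
```

Since the double dual of Z<sub>*p*</sub> has *p* elements, injectivity of the homomorphism *x* ↦ *x̂* already yields a bijection in this case; the general statement requires the proof omitted above.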

*Proof.* First, we (re)construct a correctly normalized Haar measure on *Ĝ* by defining

$$\int\_{\hat{G}} : C\_{c}(\hat{G}, \mathbb{R}) \to \mathbb{R}; \tag{C.390}$$

$$\zeta \mapsto \inf \{ f(e) \mid f \in C\_0^\*(G), \hat{f} \ge \zeta \text{ (pointwise)} \}. \tag{C.391}$$

This map takes values in R, since if *f̂* is real, as required by *f̂* ≥ ζ in (C.391), then so is *f*(*e*), cf. (C.337), noting that the Gelfand (= Fourier) transform on *C*\*<sub>0</sub>(*G*) maps the involution (C.337) into complex conjugation on *C*<sub>0</sub>(*Ĝ*). Furthermore, ∫<sub>*Ĝ*</sub> is linear, as well as positive: if ζ ≥ 0 (i.e., pointwise), then also *f̂* ≥ 0 in *C*<sub>0</sub>(*Ĝ*), so that *f* ≥ 0 in *C*\*(*G*), because by Theorem C.109 the map *f* ↦ *f̂* is an isomorphism, which by Theorem C.52 preserves positivity. This gives ⟨ψ, π(*f*)ψ⟩<sub>*L*<sup>2</sup>(*G*)</sub> ≥ 0 for all ψ ∈ *L*<sup>2</sup>(*G*), which by a simple continuity argument (in a proof by contradiction, using the inclusion *C*\*<sub>0</sub>(*G*) ⊂ *C*<sub>0</sub>(*G*)) enforces *f*(*e*) ≥ 0, and hence inf{*f*(*e*)} ≥ 0.

By Theorem B.19, there is a measure *d*γ on *Ĝ* defining the integral ∫<sub>*Ĝ*</sub>, i.e.,

$$\int\_{\hat{G}} d\gamma\, \zeta(\gamma) = \inf \{ f(e) \mid f \in C\_0^\*(G), \hat{f} \ge \zeta \}, \tag{C.392}$$

where initially ζ is real-valued, upon which the integral is extended to *C*<sub>*c*</sub>(*Ĝ*) by complex linearity, as usual in (Lebesgue) integration. The point is that the measure *d*γ is translation invariant and hence is a Haar measure on *Ĝ*: indeed, replacing ζ by *L*<sub>γ′</sub>ζ amounts to replacing *f* (as a function that satisfies *f̂* ≥ ζ) by γ′*f*. Invariance then follows from γ′(*e*) = 1 for any character γ′ ∈ *Ĝ*, which obviously implies (γ′*f*)(*e*) = γ′(*e*)*f*(*e*) = *f*(*e*). The Banach spaces *L*<sup>*p*</sup>(*G*) and *L*<sup>*p*</sup>(*Ĝ*) are then defined with respect to *dx* on *G* (assumed given) and *d*γ on *Ĝ* (as above), respectively.

Furthermore, the proof uses an approximate unit (δ<sub>*U*</sub>) of *C*\*(*G*) that lies in *C*<sub>*c*</sub>(*G*) and is indexed by shrinking neighbourhoods *U* of *e* ∈ *G*. More precisely, take the directed set of all symmetric neighbourhoods of *e* (i.e., *U*<sup>−1</sup> = *U*), ordered by reverse inclusion ⊇, take positive functions *h*<sub>*U*</sub> ∈ *C*<sub>*c*</sub>(*W*) for some neighbourhood *W* of *e* satisfying *W*<sup>2</sup> ⊂ *U*, normalize *h*<sub>*U*</sub> such that ∫<sub>*G*</sub> *h*<sub>*U*</sub> ∗ *h*<sub>*U*</sub><sup>∗</sup> = 1, and define

$$
\delta\_U = h\_U \* h\_U^\*; \tag{C.393}
$$

$$f\_U = f \* \delta\_U \ (f \in \mathcal{C}^\*(G)). \tag{C.394}$$

We will show that for each *f* ∈ *C*∗(*G*), we have

$$\lim\_{U} \|f\_U - f\|\_{\mathcal{C}^\*} = 0.\tag{C.395}$$

To this end, we first show that ‖δ<sub>*U*</sub>‖<sub>*C*<sup>∗</sup></sub> ≤ 1, which follows from the estimate

$$\left\| \pi(\delta\_U) \psi \right\| = \left\| \int\_G dy \,\delta\_U(y) L\_y \psi \right\| \le \int\_G dy \,\delta\_U(y) \left\| L\_y \psi \right\| = \left\| \psi \right\|. \tag{C.396}$$
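The estimate (C.396) can be tried out in the simplest setting. The following is a minimal sketch, assuming the finite cyclic group Z_N with counting measure as Haar measure (an illustrative choice that is not made in the text); the values of `delta` and `psi` are hypothetical test data.

```python
import math

N = 8  # illustrative: the cyclic group Z_N, Haar measure = counting measure

def convolve(f, g):
    """(pi(f) g)(x) = sum_y f(y) (L_y g)(x), with (L_y g)(x) = g(x - y)."""
    return [sum(f[y] * g[(x - y) % N] for y in range(N)) for x in range(N)]

def norm2(psi):
    """L^2 norm with respect to counting measure."""
    return math.sqrt(sum(abs(v) ** 2 for v in psi))

# Positive function of total mass 1, standing in for delta_U (hypothetical values).
delta = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]

# Arbitrary test vector in L^2(Z_N).
psi = [complex(math.cos(3 * x), math.sin(x * x)) for x in range(N)]

lhs = norm2(convolve(delta, psi))  # ||pi(delta_U) psi||
rhs = norm2(psi)                   # ||psi||
print(lhs <= rhs + 1e-12)
```

The inequality only uses positivity and normalization of `delta` together with unitarity of the translations, exactly as in (C.396).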

Similar estimates give *f* ∗ δ*<sup>U</sup>* → *f* for *f* ∈ *Cc*(*G*), so that finally

$$\begin{split} \|f \ast \delta\_U - f\|\_{\mathcal{C}^\ast} &= \|f \ast \delta\_U - g \ast \delta\_U + g \ast \delta\_U - g + g - f\|\_{\mathcal{C}^\ast} \\ &\leq 2\|f - g\|\_{\mathcal{C}^\ast} + \|g \ast \delta\_U - g\|\_{\mathcal{C}^\ast} . \end{split} \tag{C.397}$$

Taking *g* ∈ *Cc*(*G*), an ε/3 argument finishes the proof of (C.395). Moreover,

$$f\_U \in \mathcal{C}\_0^\*(G) \ (f \in \mathcal{C}^\*(G)). \tag{C.398}$$

To prove this, take *<sup>g</sup>*,*<sup>h</sup>* <sup>∈</sup> *Cc*(*G*). Regarding *<sup>g</sup>* and *<sup>h</sup>* as elements of *<sup>L</sup>*2(*G*), note that

$$\mathbf{g} \ast h(\mathbf{x}) = \langle \mathbf{g}^\*, L\_{\mathbf{x}^{-1}} h \rangle\_{L^2(G)},\tag{\text{C.399}}$$

so that Cauchy–Schwarz and unitarity of *L*<sub>*x*<sup>−1</sup></sub> give ‖*g* ∗ *h*‖<sub>∞</sub> ≤ ‖*g*‖<sub>2</sub>‖*h*‖<sub>2</sub>. Applying this with *g* ⇝ π(*f*)*g* and *h* ⇝ *h*, where *f* ∈ *C*<sup>∗</sup>(*G*), *g* ∈ *C<sub>c</sub>*(*G*), and *h* ∈ *C<sub>c</sub>*(*G*), yields

$$\|f \* g \* h\|\_{\infty} \le \|\pi(f)g\|\_{2} \|h\|\_{2} \le \|f\|\_{C^{\ast}} \|g\|\_{2} \|h\|\_{2};\tag{C.400}$$

$$\|f \* g \* h\|\_{2} = \|\pi(f)(g \* h)\|\_{2} \le \|f\|\_{C^{\ast}}\|g \* h\|\_{2}.\tag{C.401}$$

Eq. (C.401) will be applied later, in the proof of (C.379). Eq. (C.400) shows that if *fn* → *f* in *C*∗(*G*) for some net (*fn*) in *Cc*(*G*), then *fn* ∗*g*∗*h* → *f* ∗*g*∗*h* uniformly, so that *f* ∗ *g* ∗ *h* ∈ *C*0(*G*) and *fn* ∗ *g* ∗ *h* → *f* ∗ *g* ∗ *h* in *C*0(*G*). Also,
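The sup-norm bound on a convolution that underlies (C.400) can be checked numerically. The sketch below assumes the finite group Z_N with counting measure, and the functions `g` and `h` are hypothetical test data; it illustrates the Cauchy–Schwarz step only, not the C\*-norm estimate.

```python
import math

N = 12  # illustrative: the cyclic group Z_N with counting measure

def convolve(g, h):
    """(g * h)(x) = sum_y g(y) h(x - y) on Z_N."""
    return [sum(g[y] * h[(x - y) % N] for y in range(N)) for x in range(N)]

def norm2(f):
    return math.sqrt(sum(abs(v) ** 2 for v in f))

def norm_sup(f):
    return max(abs(v) for v in f)

# Hypothetical test functions on Z_N.
g = [complex(math.sin(1.3 * x), math.cos(0.7 * x)) for x in range(N)]
h = [complex(1.0 / (1 + x), -0.5 * x) for x in range(N)]

sup_gh = norm_sup(convolve(g, h))  # ||g * h||_infinity
bound = norm2(g) * norm2(h)        # ||g||_2 ||h||_2
print(sup_gh <= bound)
```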

$$\|\widehat{f \ast g \ast h}\|\_{\infty} = \|\hat{f}\hat{g}\hat{h}\|\_{\infty} \le \|\hat{f}\|\_{\infty} \|\hat{g}\hat{h}\|\_{\infty} = \|f\|\_{C^{\ast}} \|\hat{g}\hat{h}\|\_{\infty},\tag{C.402}$$

by isometry of the Gelfand transform, so that the Fourier transforms of *f<sub>n</sub>* ∗ *g* ∗ *h* also converge to the Fourier transform of *f* ∗ *g* ∗ *h* in *C*<sub>0</sub>(*Ĝ*). If *f<sub>n</sub>*, *g*, *h* ∈ *C<sub>c</sub>*(*G*) then *f<sub>n</sub>* ∗ *g* ∗ *h* ∈ *C<sub>c</sub>*(*G*) ⊂ *C*<sup>∗</sup><sub>0</sub>(*G*), and the above computations give *f<sub>n</sub>* ∗ *g* ∗ *h* → *f* ∗ *g* ∗ *h* in *C*<sup>∗</sup><sub>0</sub>(*G*). This shows that *f* ∗ *g* ∗ *h* ∈ *C*<sup>∗</sup><sub>0</sub>(*G*); taking *g* = *h<sub>U</sub>* and *h* = *h*<sup>∗</sup><sub>*U*</sub> yields (C.398).

We now turn to the Fourier inversion formula (C.381). Since the Gelfand transform *<sup>C</sup>*∗(*G*) <sup>→</sup>*C*0(*G*ˆ) is an isomorphism, for any <sup>ζ</sup> <sup>∈</sup>*C*0(*G*ˆ), we can find *<sup>f</sup>* <sup>∈</sup>*C*∗(*G*) such that <sup>ˆ</sup>*<sup>f</sup>* <sup>=</sup> <sup>ζ</sup> , and we can find a net *fU* <sup>=</sup> *<sup>f</sup>* <sup>∗</sup> <sup>δ</sup>*<sup>U</sup>* in *<sup>C</sup>*<sup>∗</sup> <sup>0</sup> (*G*) such that

$$\lim\_{U} \|f\_U - f\|\_{\mathcal{C}^\*} = \lim\_{U} \|\hat{f}\_U - \hat{f}\|\_{C\_0(\hat{G})} = \lim\_{U} \|\hat{f}\_U - \hat{f}\|\_{\infty} = 0. \tag{C.403}$$

If <sup>ζ</sup> <sup>∈</sup> *Cc*(*G*ˆ), we in addition have ˇˆ*fU* <sup>→</sup> <sup>ˇ</sup> <sup>ζ</sup> in *<sup>C</sup>*0( <sup>ˆ</sup> *G*ˆ), or, equivalently,

$$\lim\_{U} \|\hat{f}\_U - \hat{f}\|\_{C^\*(\hat{G})} = 0.\tag{C.404}$$

Eq. (C.403) and the fact that *δ̂<sub>U</sub>* is continuous, and hence uniformly continuous on every compact *K* ⊂ *Ĝ* (which we take such that it contains the support of *f̂* = ζ), give lim<sub>*U*</sub> ‖*δ̂<sub>U</sub>* − 1‖<sup>(*K*)</sup><sub>∞</sub> = 0, where ‖η‖<sup>(*K*)</sup><sub>∞</sub> is the supremum of |η(γ)| over all γ ∈ *K*. For *f̂* ∈ *C<sub>c</sub>*(*Ĝ*), with *f̂<sub>U</sub>* = *δ̂<sub>U</sub>* *f̂*, this gives *f̂<sub>U</sub>* → *f̂* in *L*<sup>1</sup>(*Ĝ*). As we trivially have ‖*f̂*‖<sub>*C*<sup>∗</sup>(*Ĝ*)</sub> ≤ ‖*f̂*‖<sub>*L*<sup>1</sup>(*Ĝ*)</sub> (and similarly, of course, on *G* itself), we obtain (C.404), which together with (C.403) also yields *f̂<sub>U</sub>* → ζ in *C*<sup>∗</sup><sub>0</sub>(*Ĝ*).

Since *f<sub>U</sub>* ∈ *C*<sup>∗</sup><sub>0</sub>(*G*), the infimum in (C.392) is saturated, and hence the Fourier inversion formula (C.381) holds for *f<sub>U</sub>*. Pontryagin duality then yields isometry, i.e., ‖*f̂<sub>U</sub>*‖<sub>0</sub> = ‖*f<sub>U</sub>*‖<sub>0</sub>. Convergence of *f̂<sub>U</sub>* in *C*<sup>∗</sup><sub>0</sub>(*Ĝ*) therefore yields convergence of *f<sub>U</sub>* in *C*<sup>∗</sup><sub>0</sub>(*G*), necessarily to *f*, since we already knew that *f<sub>U</sub>* → *f* in *C*<sup>∗</sup>(*G*), cf. (C.395). This shows that *f* ∈ *C*<sup>∗</sup><sub>0</sub>(*G*), so that (C.381) holds for *f*, implying

$$\|\hat{f}\|\_{0} = \|f\|\_{0}.\tag{C.405}$$

Thus the Fourier transform <sup>F</sup> : *<sup>C</sup>*∗(*G*) <sup>→</sup> *<sup>C</sup>*0(*G*ˆ) from Theorem C.109 is given by continuous extension of <sup>F</sup>(*f*) = <sup>ˆ</sup>*<sup>f</sup>* as defined by (C.356), where *<sup>f</sup>* <sup>∈</sup> *Cc*(*G*).

To prove (C.380), let *B*(*G*) be the set of all *f* ∈ *C*<sup>∗</sup><sub>0</sub>(*G*) for which *f̂* ∈ *C<sub>c</sub>*(*Ĝ*), and let *B*(*G*)<sup>−</sup> be its closure in *C*<sup>∗</sup><sub>0</sub>(*G*). Then F restricts to an isometric isomorphism *B*(*G*) → *C<sub>c</sub>*(*Ĝ*), and hence also to an isometric isomorphism *B*(*G*)<sup>−</sup> → *C*<sup>∗</sup><sub>0</sub>(*Ĝ*); we recall that (by definition) *C*<sup>∗</sup><sub>0</sub>(*Ĝ*) is the completion of *C<sub>c</sub>*(*Ĝ*) in its norm ‖·‖<sub>0</sub>.

Repeating this construction for *G*ˆ instead of *G*, and using Pontryagin duality (C.389) with the ensuing isomorphisms *<sup>C</sup>*∗( <sup>ˆ</sup> *G*ˆ) ∼=*C*∗(*G*) etc., we also have a Fourier transform <sup>F</sup><sup>ˇ</sup> : *<sup>C</sup>*∗(*G*ˆ) <sup>→</sup> *<sup>C</sup>*0(*G*). Since the Fourier inversion formula (C.381) holds on *Cc*(*G*ˆ), we see that Fˇ maps *Cc*(*G*ˆ) isometrically to *B*(*G*) and hence by continuity maps *C*∗ <sup>0</sup> (*G*ˆ) to *<sup>B</sup>*(*G*)−. At the same time, <sup>F</sup><sup>ˇ</sup> maps *<sup>B</sup>*ˆ(*G*ˆ) (defined, *mutatis mutandis*, like *B*(*G*)) to *Cc*(*G*), and hence maps *B*ˆ(*G*ˆ)<sup>−</sup> to *C*<sup>∗</sup> <sup>0</sup> (*G*). Since *<sup>B</sup>*ˆ(*G*ˆ)<sup>−</sup> <sup>⊆</sup> *<sup>C</sup>*<sup>∗</sup> <sup>0</sup> (*G*ˆ), this implies *B*(*G*)− = *C*∗ <sup>0</sup> (*G*) and *<sup>B</sup>*ˆ(*G*ˆ)<sup>−</sup> <sup>=</sup> *<sup>C</sup>*<sup>∗</sup> <sup>0</sup> (*G*ˆ). This proves (C.380).

Returning to (C.381), we know from the above analysis that (C.356) and (C.381) hold if *f* ∈ *C*<sup>∗</sup><sub>0</sub>(*G*) and *f̂* ∈ *C<sub>c</sub>*(*Ĝ*). If *f* ∈ *L*<sup>1</sup>(*G*), then, by Lebesgue integration theory, F(*f*) remains given by (C.356). If also *f̂* ∈ *L*<sup>1</sup>(*Ĝ*), then *f̂* ∈ *L*<sup>1</sup>(*Ĝ*) ∩ *C*<sub>0</sub>(*Ĝ*) and hence *f̂* ∈ *C*<sup>∗</sup><sub>0</sub>(*Ĝ*), cf. (C.378). By (C.380), there exists *f̃* ∈ *C*<sup>∗</sup><sub>0</sub>(*G*) such that *f* = *f̃* in *C*<sup>∗</sup>(*G*), and hence for a.e. *x* ∈ *G* (with respect to Haar measure), we have

$$f(x) = \lim\_{U} f \ast \delta\_U(x) = \lim\_{U} \tilde{f} \ast \delta\_U(x) = \tilde{f}(x).\tag{C.406}$$

It follows that *f* = *f̃* a.e., and so the inversion formula (C.382), and hence (C.381), holds, provided (if necessary) *f* is replaced by its representative *f̃*.

Finally, to prove (C.379), take *f* = ψ in (C.356), with ψ ∈ *C<sub>c</sub>*(*G*), so that we may compute

$$\|\psi\|\_{2}^{2} = \int\_{G} dx \, |\psi(x)|^{2} = \psi \* \psi^{\*}(e) = \int\_{\hat{G}} d\gamma \, \widehat{\psi \ast \psi^{\*}}(\gamma) = \int\_{\hat{G}} d\gamma \, |\hat{\psi}(\gamma)|^{2} = \|\hat{\psi}\|\_{2}^{2}. \tag{C.407}$$

We may therefore extend F, initially given by F(*f*) = *f̂*, from *C<sub>c</sub>*(*G*) to its completion *L*<sup>2</sup>(*G*) in ‖·‖<sub>2</sub>. Second, we prove surjectivity similarly to the previous part:

Pick ζ ∈ *C<sub>c</sub>*(*Ĝ*), and hence *f* ∈ *C*<sup>∗</sup>(*G*) with *f̂* = ζ. Then *f<sub>U</sub>* = *f* ∗ δ<sub>*U*</sub> ∈ *L*<sup>2</sup>(*G*), as follows from (C.401). Then *f̂<sub>U</sub>* → *f̂* in *L*<sup>2</sup>(*Ĝ*), since analogously to the previous proof, we find that (*f̂<sub>U</sub>*) is a Cauchy net in *L*<sup>2</sup>(*Ĝ*). By isometry of F (as just proved), this implies that (*f<sub>U</sub>*) is a Cauchy net in *L*<sup>2</sup>(*G*). Let *f<sub>U</sub>* → *g* in *L*<sup>2</sup>(*G*); continuity of F gives F(*g*) = ζ, making F surjective at least onto *C<sub>c</sub>*(*Ĝ*). Since *L*<sup>2</sup>(*Ĝ*) is the completion of *C<sub>c</sub>*(*Ĝ*) in the *L*<sup>2</sup>-norm ‖·‖<sub>2</sub>, the Fourier transform F : *L*<sup>2</sup>(*G*) → *L*<sup>2</sup>(*Ĝ*) is an isometric surjection, and hence is unitary. □
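As a sanity check of the Plancherel identity (C.407) and of Fourier inversion at the identity, here is a small sketch for the finite group Z_N. It rests on assumptions not fixed by the text: counting measure on *G*, characters γ_k(x) = exp(2πikx/N), the sign convention f̂(γ) = Σ_x f(x) conj(γ(x)), and the dual Haar measure taken as 1/N times counting measure. The vector `psi` is hypothetical test data.

```python
import cmath

N = 6  # illustrative: G = Z_N, whose dual group is again (isomorphic to) Z_N

def character(k, x):
    """Character gamma_k(x) = exp(2 pi i k x / N) of Z_N."""
    return cmath.exp(2j * cmath.pi * k * x / N)

def fourier(f):
    """hat f(gamma_k) = sum_x f(x) conj(gamma_k(x)); sign convention is an assumption."""
    return [sum(f[x] * character(k, x).conjugate() for x in range(N))
            for k in range(N)]

def dual_integral(zeta):
    """Integral over the dual group with d(gamma) = (1/N) * counting measure."""
    return sum(zeta) / N

psi = [complex(x - 2.0, 0.5 * x * x) for x in range(N)]  # hypothetical test vector
psi_hat = fourier(psi)

norm_sq = sum(abs(v) ** 2 for v in psi)                      # ||psi||_2^2
norm_sq_hat = dual_integral([abs(v) ** 2 for v in psi_hat])  # ||hat psi||_2^2
psi_at_e = dual_integral(psi_hat)                            # inversion at x = e = 0
print(abs(norm_sq - norm_sq_hat) < 1e-9, abs(psi_at_e - psi[0]) < 1e-9)
```

With this normalization of the dual measure the Fourier transform is isometric; any other normalization would introduce a constant factor.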

We close this section with the SNAG-Theorem (named after Stone, whose Theorem 5.73 it generalizes, Naimark, Ambrose, and Godement, each of whom published versions of it in 1944). This theorem uses projection-valued measures, which we have avoided so far, but which are appropriate here as well as in our application of the SNAG-Theorem to the Goldstone Theorem 10.28. Recall that the Riesz–Radon representation theorems B.19 and B.24 establish a bijective correspondence between *states* on *C*<sub>0</sub>(*X*) and *probability measures* on *X*. There is a similar correspondence between *representations* of *C*<sub>0</sub>(*X*) and *projection-valued measures* on *X*. Cf. §B.4.

Definition C.112. *Let X be a set with* σ*-algebra* Σ ⊆ P(*X*)*, and H a Hilbert space. A* projection-valued measure *for* (*X*,Σ,*H*) *is a map e* : Σ → P(*H*) *such that for each unit vector* ψ ∈ *H, the map e*<sup>(ψ)</sup> : Σ → [0,1] *defined by*

$$e^{(\psi)}(A) = \langle \psi, e(A)\psi \rangle,\tag{C.408}$$

*is a probability measure. Equivalently, e*(∅) = 0<sub>*H*</sub>*, e*(*X*) = 1<sub>*H*</sub>*, e*(*A* ∩ *B*) = *e*(*A*)*e*(*B*)*, and e*(∪<sub>*n*</sub>*A<sub>n</sub>*) = ∑<sub>*n*</sub> *e*(*A<sub>n</sub>*) *for pairwise disjoint A<sub>n</sub> in the strong topology on B*(*H*)*.*

The simplest example must be *<sup>H</sup>* <sup>=</sup> *<sup>L</sup>*2(*X*,Σ,μ) with *<sup>e</sup>*(*A*) = <sup>1</sup>*A*, cf. §B.6.

As in (B.328), one can integrate any bounded measurable function *f* : *X* → ℂ "against" *e*, i.e., there is a unique operator ∫<sub>*X*</sub> *de f* such that for any ε > 0 there is a finite partition *X* = ⊔<sup>*n*</sup><sub>*i*=1</sub> *A<sub>i</sub>* of *X* into *n* Borel sets *A<sub>i</sub>*, such that for any *x<sub>i</sub>* ∈ *A<sub>i</sub>*,

$$\left\| \int\_{X} de \, f - \sum\_{i=1}^{n} f(\mathbf{x}\_{i}) e(A\_{i}) \right\| < \varepsilon. \tag{C.409}$$

Analogously to the Riesz–Radon representation theorem, one may then prove:

Theorem C.113. *Let X be a locally compact Hausdorff space. There is a bijective correspondence between non-degenerate representations* π : *C*0(*X*) → *B*(*H*) *and projection-valued measures e for* (*X*,Σ,*H*) *(where* Σ *is the Borel* σ*-algebra), viz.*

$$
\pi(f) = \int\_X de \, f;\tag{C.410}
$$

$$e(A) = \pi(1\_A),\tag{C.411}$$

*where* π(1*A*) *is defined by extending* π *from C*0(*X*) *to the C\*-algebra* B(*X*) *of bounded Borel functions on X (cf. Theorem B.102 and Proposition B.98).*
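For a finite set the correspondence (C.410) - (C.411) becomes completely explicit. The sketch below assumes X = {0, 1, 2, 3} with all subsets measurable and H = L²(X) with counting measure, so that e(A) is multiplication by the indicator function 1_A; the helper names `e`, `integrate`, etc. are illustrative and not taken from the text.

```python
n = 4  # X = {0, 1, 2, 3}; H = L^2(X, counting measure), identified with C^4

def zeros():
    return [[0.0] * n for _ in range(n)]

def e(A):
    """Projection e(A) = multiplication by 1_A, written as a diagonal matrix."""
    return [[1.0 if (i == j and i in A) else 0.0 for j in range(n)]
            for i in range(n)]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(P, Q):
    return [[P[i][j] + Q[i][j] for j in range(n)] for i in range(n)]

def integrate(f):
    """int_X de f = sum_x f(x) e({x}): the sum in (C.409) for the finest partition."""
    total = zeros()
    for x in range(n):
        Ex = e({x})
        total = add(total, [[f(x) * Ex[i][j] for j in range(n)] for i in range(n)])
    return total

A, B = {0, 1, 2}, {1, 2, 3}
pi_f = integrate(lambda x: x * x + 1.0)  # pi(f) for f(x) = x^2 + 1
print(matmul(e(A), e(B)) == e(A & B))
```

Here π(f) is the diagonal matrix with entries f(x), so π(1_A) reproduces e(A), as in (C.411).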

We finally need the existence of a bijective correspondence between continuous unitary representations *u* of *G* and non-degenerate representations of *C*∗(*G*) given by (C.506) in §C.18 below; see the comment below Definition C.119. Combined with Theorems C.109 and C.113, we then obtain the SNAG*-Theorem*:

Theorem C.114. *There is a bijective correspondence between continuous unitary representations u of a locally compact abelian group G on some Hilbert space H and projection-valued measures e* : B(*Ĝ*) → P(*H*) *on the dual group Ĝ, such that*

$$
u(x) = \int\_{\hat{G}} de(\gamma)\,\gamma(x).\tag{C.412}
$$
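The content of (C.412) can be made concrete for a finite group. The following sketch assumes G = Z_N acting on C^N by shifts, with characters γ_k(x) = ω^(kx), ω = exp(2πi/N); the projection-valued measure is then concentrated on the N points of the dual group, with e({γ_k}) the projection onto the k-th Fourier mode. All names are illustrative.

```python
import cmath

N = 4
omega = cmath.exp(2j * cmath.pi / N)

def gamma(k, x):
    """Character gamma_k(x) = omega^(k x) of Z_N."""
    return omega ** (k * x)

def u(x):
    """Shift representation of Z_N on C^N: (u(x) psi)(a) = psi(a + x)."""
    return [[1.0 if b == (a + x) % N else 0.0 for b in range(N)] for a in range(N)]

def e(k):
    """Projection onto the line spanned by the vector a -> omega^(k a)."""
    return [[omega ** (k * (a - b)) / N for b in range(N)] for a in range(N)]

def matmul(P, Q):
    return [[sum(P[a][c] * Q[c][b] for c in range(N)) for b in range(N)]
            for a in range(N)]

def spectral_integral(x):
    """int de(gamma) gamma(x) = sum_k gamma_k(x) e({gamma_k})."""
    proj = [e(k) for k in range(N)]
    return [[sum(gamma(k, x) * proj[k][a][b] for k in range(N)) for b in range(N)]
            for a in range(N)]

def max_diff(P, Q):
    return max(abs(P[a][b] - Q[a][b]) for a in range(N) for b in range(N))

errors = [max_diff(u(x), spectral_integral(x)) for x in range(N)]
print(max(errors) < 1e-9)
```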

#### C.16 Intermezzo: Lie groupoids

Groupoids generalize groups, group actions, and equivalence relations. As such, they provide a more flexible language for dealing with symmetries than either of these. Like Lie groups, one also has Lie groupoids, which form an important tool in constructing continuous bundles of C\*-algebras (see §C.19 below). These, in turn, provide the mathematical foundation of (deformation) quantization, see Chapter 7.

Definition C.115. *A* groupoid *G* = (*G*1,*G*0,*s*,*t*,*i*,*I*) *is a small category (i.e. a category in which the underlying classes are sets, cf.* §*E.1) in which each arrow is invertible. Thus one has a set (of arrows) G*<sup>1</sup> *doubly fibered over some base space G*<sup>0</sup> *through source, and target maps s*,*t* : *G*<sup>1</sup> → *G*0*. These maps define the set*

$$G\_2 = \{ (\mathbf{x}, \mathbf{y}) \in G\_1 \times G\_1 \mid \mathbf{s}(\mathbf{x}) = \mathbf{t}(\mathbf{y}) \}\tag{\mathbf{C.413}}$$

*of composable pairs, on which a multiplication m* : *G*<sup>2</sup> → *G*<sup>1</sup> *is defined, which we simply denote by xy* = *m*(*x*, *y*)*, subject to the axioms*

$$s(xy) = s(y); \ t(xy) = t(x) \ ((x, y) \in G\_2); \tag{C.414}$$

$$(xy)z = x(yz) \ ((x, y) \in G\_2, (y, z) \in G\_2), \tag{C.415}$$

*the third being well defined by virtue of the first and the second.*

*Furthermore, there is an object inclusion map i* : *G*<sup>0</sup> → *G*1, *u* → id*u, satisfying*

$$s(\text{id}\_{u}) = t(\text{id}\_{u}) = u \ (u \in G\_0);\tag{C.416}$$

$$x\,\text{id}\_{s(x)} = \text{id}\_{t(x)}\,x = x \ (x \in G\_{1}).\tag{C.417}$$

*Finally, what makes a (small) category a groupoid is the existence of an inverse*

$$I: G\_{1} \to G\_{1}, \ x \mapsto x^{-1},$$

*satisfying*

$$s(x^{-1}) = t(x);\ t(x^{-1}) = s(x)\ (x \in G\_1);\tag{C.418}$$

$$x^{-1}x = \mathrm{id}\_{s(x)};\ xx^{-1} = \mathrm{id}\_{t(x)}\ \ (x \in G\_{1}).\tag{C.419}$$

*A* Lie groupoid *is a groupoid for which G*<sup>1</sup> *and G*<sup>0</sup> *are manifolds, s and t are surjective submersions, and multiplication and inversion are smooth.*

We often identify *u* with id*u*, so that *x*−1*x* = *s*(*x*), etc. We allow manifolds with boundary, which provide key examples; cf. Proposition C.117 below.
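The axioms (C.414) - (C.419) can be verified mechanically in a small case. The sketch below uses the pair groupoid on a three-point set (dropping all smoothness), and it assumes the conventions s(x, y) = y, t(x, y) = x, (x, y)(y, z) = (x, z) and (x, y)⁻¹ = (y, x), which are a common choice but are not spelled out in the definition above.

```python
from itertools import product

M = [0, 1, 2]             # base space G_0
G1 = list(product(M, M))  # arrows (x, y) of the pair groupoid

def s(a): return a[1]            # source (assumed convention)
def t(a): return a[0]            # target (assumed convention)
def inv(a): return (a[1], a[0])  # inverse
def ident(u): return (u, u)      # object inclusion u -> id_u

def mult(a, b):
    """Product of a composable pair: requires s(a) == t(b)."""
    assert s(a) == t(b)
    return (a[0], b[1])

# Composable pairs, cf. (C.413).
G2 = [(a, b) for a in G1 for b in G1 if s(a) == t(b)]

def check_axioms():
    for a, b in G2:
        assert s(mult(a, b)) == s(b) and t(mult(a, b)) == t(a)   # (C.414)
        for c in G1:
            if s(b) == t(c):
                assert mult(mult(a, b), c) == mult(a, mult(b, c))  # (C.415)
    for w in M:
        assert s(ident(w)) == t(ident(w)) == w                    # (C.416)
    for a in G1:
        assert mult(a, ident(s(a))) == a == mult(ident(t(a)), a)  # (C.417)
        assert s(inv(a)) == t(a) and t(inv(a)) == s(a)            # (C.418)
        assert mult(inv(a), a) == ident(s(a))                     # (C.419)
        assert mult(a, inv(a)) == ident(t(a))
    return True

print(check_axioms())
```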

Proposition C.116. *In a Lie groupoid, object inclusion is an immersion, inversion is a diffeomorphism, G*<sup>2</sup> *is a closed submanifold of G*<sup>1</sup> ×*G*1*, and for each u* ∈ *G*0*, the fibers s*−1(*u*) *and t*−1(*u*) *are submanifolds of G*1*.*

Abusing notation, *G*<sup>1</sup> is often called *G*. Some basic examples of Lie groupoids are:


Any Lie groupoid *G* defines an associated *tangent groupoid* *G<sup>T</sup>*, which will play a crucial role in §C.19. We first explain the (surprising) underlying differential geometry in three steps of increasing complexity. We start with the manifold *M* = ℝ<sup>*n*</sup>, with tangent bundle *TM* = ℝ<sup>2*n*</sup>. Our goal is to describe a smooth structure on

$$F = TM \sqcup (0, 1] \times M \times M,\tag{C.420}$$

seen as a bundle over [0,1], where (as the notation already indicates) the fibers are

$$F\_0 = TM;\tag{C.421}$$

$$F\_{\hbar} = M \times M \ (\hbar > 0). \tag{C.422}$$

Although each fiber *F<sub>ħ</sub>* of this bundle is isomorphic to ℝ<sup>2*n*</sup>, its smooth structure is not equal or even diffeomorphic to the usual one on [0,1] × ℝ<sup>2*n*</sup>. Instead, we define

$$\phi: [0,1] \times TM \to TM \sqcup (0,1] \times M \times M;\tag{C.423}$$

$$
\phi(0,\xi) = \xi;\tag{C.424}
$$

$$\phi(\hbar,\xi) = (\hbar,\exp^W(\hbar\xi))\ (\hbar > 0),\tag{C.425}$$

where the symmetrized ("Weyl") exponential map exp<sup>*W*</sup> : *TM* → *M* × *M* is given by

$$\exp^W(x, v) = (x - \frac{1}{2}v, x + \frac{1}{2}v). \tag{C.426}$$

Here the coordinates (*x*, *v*) of ξ ∈ *T<sub>x</sub>M* denote ξ*f*(*x*) = ∑<sub>*i*</sub> *v<sup>i</sup>* ∂*f*/∂*x<sup>i</sup>*(*x*) ≡ ∑<sub>*i*</sub> *v<sup>i</sup>* ∂<sub>*i*</sub>*f*(*x*).

Like its more familiar counterpart (*x*, *v*) ↦ (*x*, *x* + *v*), exp<sup>*W*</sup> is a diffeomorphism.

For *M* = R*n*, our map φ is a bijection, with inverse given by

$$\phi^{-1}(x, v) = (0, x, v); \tag{C.427}$$

$$\phi^{-1}(\hbar, x, y) = \left(\hbar, \frac{x + y}{2}, \frac{y - x}{\hbar}\right) \ (\hbar > 0). \tag{C.428}$$
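The formulas (C.424) - (C.428) are easy to experiment with for *M* = ℝ. The sketch below represents points of *F* as tagged tuples (a purely illustrative encoding), checks that `phi_inv` undoes `phi`, and follows a hypothetical sequence of pairs (x_n, y_n) that collapse onto the diagonal at rate ħ_n: under φ⁻¹ it tends to a tangent vector (0, x, v).

```python
def exp_w(x, v):
    """Weyl exponential map (C.426) for M = R."""
    return (x - 0.5 * v, x + 0.5 * v)

def phi(h, x, v):
    """phi of (C.424)-(C.425); points of F are tagged tuples (illustrative encoding)."""
    if h == 0:
        return ("tangent", x, v)
    a, b = exp_w(x, h * v)
    return ("pair", h, a, b)

def phi_inv(point):
    """Inverse map, cf. (C.427)-(C.428)."""
    if point[0] == "tangent":
        _, x, v = point
        return (0.0, x, v)
    _, h, a, b = point
    return (h, 0.5 * (a + b), (b - a) / h)

x0, v0 = 1.5, -2.0
round_trip = phi_inv(phi(0.25, x0, v0))

# A sequence in F with hbar_n -> 0, x_n -> x0, y_n -> x0, (y_n - x_n)/hbar_n -> v0.
images = []
for n in range(1, 21):
    h = 2.0 ** (-n)
    point = ("pair", h, x0 - 0.5 * h * v0 + h * h, x0 + 0.5 * h * v0 + 2 * h * h)
    images.append(phi_inv(point))

print(round_trip, images[-1])
```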

We use this to transfer the product topology (and also the smooth structure as a manifold with boundary) from [0,1] × *TM* to *F*. Then a sequence (*ħ<sub>n</sub>*, *x<sub>n</sub>*, *y<sub>n</sub>*) in *F*, where *ħ<sub>n</sub>* → 0, converges iff *x<sub>n</sub>* → *x*, *y<sub>n</sub>* → *x* for some *x* ∈ *M*, and (*y<sub>n</sub>* − *x<sub>n</sub>*)/*ħ<sub>n</sub>* → *v*, in which case (*ħ<sub>n</sub>*, *x<sub>n</sub>*, *y<sub>n</sub>*) → (0, *x*, *v*). More abstractly, *F* has two key properties:

1. The map *F* → [0,1]×*M* × *M* given by

$$(x, v) \mapsto (0, x, x); \tag{C.429}$$

$$(\hbar, x, y) \mapsto (\hbar, x, y) \ (\hbar > 0), \tag{C.430}$$

is smooth. Indeed, as a map [0,1] × *TM* → [0,1] × *M* × *M*, this map is given by

$$(0, x, v) \mapsto (0, x, x);\tag{C.431}$$

$$(\hbar, x, v) \mapsto (\hbar, x - \frac{1}{2}\hbar v, x + \frac{1}{2}\hbar v). \tag{C.432}$$

2. For any *<sup>f</sup>* <sup>∈</sup> *<sup>C</sup>*∞(*<sup>M</sup>* <sup>×</sup> *<sup>M</sup>*) that vanishes on the diagonal

$$\Delta(M) = \{(\mathbf{x}, \mathbf{x}) \mid \mathbf{x} \in M\} \subset M \times M,\tag{\mathsf{C.433}}$$

the function δ *f* on *F* defined by

$$
\delta f(x, v) = \xi\_{\perp} f(x, x); \tag{C.434}
$$

$$
\delta f(\hbar, \mathbf{x}, \mathbf{y}) = f(\mathbf{x}, \mathbf{y}) / \hbar \ (\hbar > 0),
\tag{\text{C.435}}
$$

where the tangent vector ξ<sub>⊥</sub> ∈ *T*<sub>(*x*,*x*)</sub>(*M* × *M*) has components (−½*v*, ½*v*), is smooth. Indeed, as a function on [0,1] × *TM*, the pullback δ<sup>∗</sup>*f* ≡ δ*f* ∘ φ is given by

$$
\delta^\* f(0, x, v) = \xi\_\perp f(x, x);\tag{C.436}
$$

$$\delta^\* f(\hbar, x, v) = f(x - \frac{1}{2}\hbar v, x + \frac{1}{2}\hbar v) / \hbar,\tag{C.437}$$

which is smooth given our assumptions on *f* .
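Property 2 can be illustrated for *M* = ℝ with one explicit function. In the sketch below, f(x, y) = (y − x)(1 + xy) is a hypothetical choice of smooth function vanishing on the diagonal; its partial derivatives on the diagonal are entered by hand, and the code checks that the difference quotient for ħ > 0 approaches the derivative along ξ_⊥ = (−v/2, v/2) as ħ → 0.

```python
def f(x, y):
    """Smooth function on R x R vanishing on the diagonal (illustrative choice)."""
    return (y - x) * (1.0 + x * y)

def delta_f_pullback(h, x, v):
    """Pullback of delta f to [0,1] x TM for M = R."""
    if h == 0:
        # xi_perp f(x, x) with xi_perp = (-v/2, v/2); for this particular f,
        # d_1 f(x, x) = -(1 + x^2) and d_2 f(x, x) = 1 + x^2 (computed by hand).
        d1 = -(1.0 + x * x)
        d2 = 1.0 + x * x
        return -0.5 * v * d1 + 0.5 * v * d2
    # For hbar > 0: f evaluated at exp^W(hbar * xi), divided by hbar.
    return f(x - 0.5 * h * v, x + 0.5 * h * v) / h

x, v = 0.7, 2.0
limit_value = delta_f_pullback(0, x, v)
gaps = [abs(delta_f_pullback(h, x, v) - limit_value) for h in (1e-1, 1e-2, 1e-3)]
print(limit_value, gaps)
```

For this f the gap equals ħ²|v|³/4 up to rounding, so the extension to ħ = 0 is continuous, in line with the smoothness claim.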

A similar construction works for any (smooth) manifold *M*, except that the smooth structure on *F* may no longer be definable in terms of a single map φ. Instead, we invoke a special case of the well-known *tubular neighbourhood theorem* of Riemannian (or, more generally, affine) geometry, which states that *M*, identified with the zero section in its tangent bundle *TM*, has an open neighbourhood *U* such that the (symmetrized) exponential map exp<sup>*W*</sup> : *U* → *M* × *M* is a diffeomorphism onto its image. Here exp<sup>*W*</sup>(ξ) = (γ(−½), γ(½)), where ξ ∈ *T<sub>x</sub>M* and γ is the unique affinely parametrized geodesic with γ(0) = *x* and γ̇(0) = ξ. We now replace the space [0,1] × *TM* used in the special case *M* = ℝ<sup>*n*</sup> by the pair of spaces

$$V\_1 = \{ (\hbar, \xi) \in [0, 1] \times TM \mid \hbar \xi \in U \};\tag{C.438}$$

$$V\_2 = (0,1] \times M \times M,\tag{C.439}$$

with associated maps φ<sup>1</sup> : *V*<sup>1</sup> → *F* and φ<sup>2</sup> : *V*<sup>2</sup> → *F* defined by


$$
\phi\_1(0,\xi) = \xi;\tag{C.440}
$$

$$\phi\_{1}(\hbar,\xi) = \exp^{W}(\hbar\xi)\ (\hbar > 0);\tag{C.441}$$

$$
\phi\_2(\hbar, \mathbf{x}, \mathbf{y}) = (\hbar, \mathbf{x}, \mathbf{y}) \ (\hbar > 0). \tag{\text{C.442}}
$$

Then φ<sub>1</sub> and φ<sub>2</sub> are injective and, writing *F<sub>i</sub>* = φ<sub>*i*</sub>(*V<sub>i</sub>*), we have *F* = *F*<sub>1</sub> ∪ *F*<sub>2</sub>, which is far from a disjoint union; let *F<sub>ij</sub>* = *F<sub>i</sub>* ∩ *F<sub>j</sub>*. Also, let *V<sub>ij</sub>* = {α ∈ *V<sub>i</sub>* | φ<sub>*i*</sub>(α) ∈ *F<sub>ij</sub>*}, with associated maps φ<sub>*ij*</sub> = φ<sup>−1</sup><sub>*j*</sub> ∘ φ<sub>*i*</sub> : *V<sub>ij</sub>* → *V<sub>ji</sub>*. We now define the smooth structure of *F* by declaring *f* : *F* → ℝ to be smooth iff *f<sub>i</sub>* ∘ φ<sub>*i*</sub> : *V<sub>i</sub>* → ℝ is smooth, *i* = 1,2, where *f<sub>i</sub>* is the restriction of *f* to *F<sub>i</sub>*. These conditions are compatible on the overlap *F<sub>ij</sub>*, since φ<sub>12</sub>(*ħ*, ξ) = exp<sup>*W*</sup>(*ħ*ξ) is a diffeomorphism (with inverse φ<sub>21</sub>). This smooth structure may also be defined by imposing conditions 1 and 2 above, *mutatis mutandis*. In particular, (C.434) should now read

$$\delta f(\xi) = \xi\_{\perp} f(x, x); \ \xi\_{\perp} = (-\frac{1}{2}\xi, \frac{1}{2}\xi) \in T\_{(x, x)}(M \times M) \cong T\_{x}M \oplus T\_{x}M. \tag{C.443}$$

A more general form of the above construction, which will be used to generate a vast class of continuous bundles of C\*-algebras, is as follows. Let *M* be a closed submanifold of another manifold *G* (in the above situation we take *G* = *M* ×*M* and identify *M* with Δ(*M*)), and replace *TM* above by the *normal bundle*

$$N\_M G = T\_M G / T\_M M,\tag{C.444}$$

i.e., the quotient of the restriction *TMG* of the tangent bundle *T G* to *M* ⊂ *G* by its subbundle *TMM* ∼= *TM*; hence the fiber of *NMG* at *x* ∈ *M* ⊂ *G* is *TxG*/*TxM*.

In the above case *G* = *M* × *M*, one therefore has

$$N\_M(M \times M) \cong TM,\tag{C.445}$$

through the isomorphism [(ξ<sub>1</sub>, ξ<sub>2</sub>)] ↦ ½(ξ<sub>2</sub> − ξ<sub>1</sub>), where (ξ<sub>1</sub>, ξ<sub>2</sub>) ∈ *T*<sub>(*x*,*x*)</sub>(*M* × *M*) and [(ξ<sub>1</sub>, ξ<sub>2</sub>)] is its equivalence class in the quotient *T*<sub>(*x*,*x*)</sub>(*M* × *M*)/*T*<sub>(*x*,*x*)</sub>(Δ(*M*)).

Other easy examples are Lie groups *G*, for which *N<sub>M</sub>G* = *T<sub>e</sub>G* = 𝔤 is just the Lie algebra of *G* (at least as a vector space), and *G* = *M*, for which *N<sub>M</sub>G* = *M*.

For the bundle *F*, defined over *I* = [0,1], we take the fibers and total space as

$$F\_0 = N\_M G;\tag{C.446}$$

$$F\_{\hbar} = G\left(\hbar > 0\right);\tag{C.447}$$

$$F = N\_M G \sqcup (0, 1] \times G. \tag{C.448}$$

Once again, there are two equivalent ways to define a smooth structure on *F*. The first uses a more general version of the tubular neighbourhood theorem from differential geometry, which states that *M* ⊂ *N<sub>M</sub>G* (seen as its zero section) has an open neighbourhood *U* that is diffeomorphic to some open neighbourhood *U*′ of *M* ⊂ *G* via a diffeomorphism ϕ : *U* → *U*′ that maps *M* to itself (i.e., pointwise). Then put

$$V\_1 = \{ (\hbar, \xi) \in [0, 1] \times N\_M G \mid \hbar \xi \in U \};\tag{C.449}$$

$$V\_2 = (0,1] \times G,\tag{C.450}$$

again with associated maps φ<sup>1</sup> : *V*<sup>1</sup> → *F* and φ<sup>2</sup> : *V*<sup>2</sup> → *F*, this time defined by

$$\phi\_1(0,\xi) = \xi;\tag{C.451}$$

$$\phi\_{1}(\hbar,\xi) = \varphi(\hbar\xi)\ (\hbar > 0);\tag{C.452}$$

$$
\phi\_2(\hbar, n) = (\hbar, n) \ (\hbar > 0). \tag{C.453}
$$

One then proceeds exactly as above. Equivalently, we impose that:


$$
\delta f(\xi) = \xi f;\tag{C.454}
$$

$$
\delta f(\hbar, n) = f(n) / \hbar \ (\hbar > 0),
\tag{C.455}
$$

is smooth (note that ξ *f* is well defined despite the fact that ξ ∈ *TMG*/*TMM* rather than ξ ∈ *TMG*, since any two representatives of ξ in *TMG* differ by vectors in *TMM*, which vanish on *f* because *f*|*<sup>M</sup>* = 0 by assumption).

After this preparation, we are at last in a position to define tangent groupoids.

Proposition C.117. *Any Lie groupoid G over some base space G*<sub>0</sub> = *M defines an associated* tangent groupoid *G<sup>T</sup>, with total space G<sup>T</sup>* = *F, cf.* (C.448)*, with smooth structure as explained, base space G<sup>T</sup>*<sub>0</sub> = [0,1] × *M, source and target projections*

$$s^{T}(\xi) = t^{T}(\xi) = (0, \pi(\xi)) \ (\hbar = 0);\tag{C.456}$$

$$s^T(\hbar, x) = (\hbar, s(x)) \ (\hbar > 0);\tag{C.457}$$

$$t^T(\hbar, x) = (\hbar, t(x)) \ (\hbar > 0), \tag{C.458}$$

*where* π : *TMG*/*TMM* → *M is the bundle projection, and x* ∈ *G, multiplication*

$$
\xi \cdot \eta = \xi + \eta \ (\hbar = 0); \tag{C.459}
$$

$$(\hbar, \mathbf{x}) \cdot (\hbar, \mathbf{y}) = (\hbar, \mathbf{x}\mathbf{y}) \ (\hbar > 0),\tag{\mathbf{C.460}}$$

*and inverse*

$$
\xi^{-1} = -\xi \ (\hbar = 0); \tag{C.461}
$$

$$(\hbar, \mathbf{x})^{-1} = (\hbar, \mathbf{x}^{-1}) \ (\hbar > 0). \tag{\text{C.462}}$$

In other words, *G<sup>T</sup>*, seen as a bundle over [0,1], is a "bundle of groupoids": the groupoid above *ħ* = 0 is the normal bundle π : *N<sub>M</sub>G* → *M*, as in the vector bundle example above, whereas the fibers above *ħ* > 0 are *G* itself.

#### C.17 C\*-algebras associated to Lie groupoids

One may associate two C\*-algebras to a Lie groupoid *G*, called *C*<sup>∗</sup><sub>*r*</sub>(*G*) and *C*<sup>∗</sup>(*G*), which coincide for abelian Lie groups, and as such generalize the construction in §C.15, cf. (C.332) and (C.336) - (C.337). We first generalize the Haar measure.

Definition C.118. *A* Haar system *on a Lie groupoid G is a family of measures* (μ<sup>*u*</sup>, *u* ∈ *G*<sub>0</sub>)*, where* μ<sup>*u*</sup> *is defined on the t-fiber*

$$G^{u} = t^{-1}(u),\tag{C.463}$$

*where it is locally equivalent to Lebesgue measure, and each function*

$$
u \mapsto \int\_{G^u} d\mu^u f \; (f \in C\_c^\infty(G)) \tag{C.464}
$$

*on G*<sub>0</sub> *is smooth. A Haar system is* left-invariant *if for each f* ∈ *C*<sup>∞</sup><sub>*c*</sub>(*G*) *and x* ∈ *G,*

$$\int\_{G^{t(x)}} d\mu^{t(x)}(y) \, f(y) = \int\_{G^{s(x)}} d\mu^{s(x)}(y) \, f(xy).\tag{C.465}$$

It is sometimes convenient to regard μ<sup>*u*</sup> as a measure on all of *G* but having support in *G<sup>u</sup>*. Either way, *any Lie groupoid possesses a left-invariant Haar system*, briefly called a *left Haar system*. For example, if *G* is a Lie group, *u* ∈ *G*<sub>0</sub> can only be the identity *e* ∈ *G*, so that a left-invariant Haar system is the same as a left-invariant Haar measure on *G* (which exists on any locally compact group). Furthermore:


$$t^{-1}(u) = T\_{u}M,\tag{C.466}$$

to be translation invariant. For *M* = R*<sup>n</sup>* (or, more generally, if *TM* is a trivial bundle), we take all μ*<sup>u</sup>* to be the same and all equal to Lebesgue measure.



$$
\mu^{(0,u)} = \mu\_0^u \ (\hbar = 0); \tag{C.467}
$$

$$
\mu^{(\hbar,u)} = \hbar^{-n} \mu^{u} \ (\hbar > 0),
\tag{C.468}
$$

where *n* = dim(*G*) − dim(*M*), defines a Haar system on *G<sup>T</sup>*; the extra factor *ħ*<sup>−*n*</sup> in (C.468) is necessary and sufficient for this Haar system to satisfy the smoothness condition on (C.464). For example, if *G* = ℝ<sup>*n*</sup> × ℝ<sup>*n*</sup> is the pair groupoid on ℝ<sup>*n*</sup>, where each fiber *G<sup>u</sup>* ≅ ℝ<sup>*n*</sup> is endowed with Lebesgue measure *d<sup>n</sup>x*, then the fibers ℝ<sup>*n*</sup> of the vector bundle *N<sub>M</sub>G* ≅ *T*ℝ<sup>*n*</sup> should carry exactly the same measure. To see this, in (C.464) we substitute *G* ⇝ *G<sup>T</sup>* and *u* ⇝ (*ħ*, *y*) (*y* ∈ ℝ<sup>*n*</sup>), so that for each *f* ∈ *C*<sup>∞</sup><sub>*c*</sub>(*G<sup>T</sup>*) the following function on [0,1] × ℝ<sup>*n*</sup> should be smooth:

$$(0, y) \mapsto \int\_{\mathbb{R}^n} d^n v \, f(0, y, v);\tag{C.469}$$

$$(\hbar, y) \mapsto \hbar^{-n} \int\_{\mathbb{R}^n} d^n x \, f(\hbar, x, y) \ (\hbar > 0). \tag{C.470}$$

To interpret this condition, we put *f* = *f̃* ∘ φ<sup>−1</sup>, where *f̃* is smooth on [0,1] × *T*ℝ<sup>*n*</sup>, and φ<sup>−1</sup> is given by (C.427) - (C.428). This transforms the above function into

$$(0, y) \mapsto \int\_{\mathbb{R}^n} d^n v \, \tilde{f}(0, y, v);\tag{C.471}$$

$$(\hbar, \mathbf{y}) \mapsto \int\_{\mathbb{R}^n} d^n \nu \,\tilde{f}(\hbar, \mathbf{y} - \frac{1}{2}\hbar \mathbf{v}, \mathbf{v}) \,\,(\hbar > 0). \tag{C.472}$$
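The role of the factor ħ⁻ⁿ can be seen numerically for n = 1. The sketch below takes a Gaussian f̃(ħ, q, v) = exp(−q² − v²) as a stand-in for a compactly supported smooth function (an assumption made only to have a closed form to compare with) and evaluates both the x-integral, including the factor ħ⁻¹, and the v-integral by a midpoint rule. The function names are illustrative.

```python
import math

def f_tilde(h, q, v):
    """Smooth function on [0,1] x TR; a Gaussian stands in for compact support."""
    return math.exp(-q * q - v * v)

def f(h, x, y):
    """f = f_tilde o phi^{-1} for hbar > 0, using (C.428)."""
    return f_tilde(h, 0.5 * (x + y), (y - x) / h)

def integral_x(h, y, steps=4000, cutoff=8.0):
    """hbar^{-1} * int dx f(hbar, x, y) for n = 1, midpoint rule on |y - x| <= cutoff*hbar."""
    dx = 2.0 * cutoff * h / steps
    total = sum(f(h, y - cutoff * h + (i + 0.5) * dx, y) for i in range(steps))
    return total * dx / h

def integral_v(h, y, steps=4000, cutoff=8.0):
    """int dv f_tilde(hbar, y - hbar*v/2, v), midpoint rule on |v| <= cutoff."""
    dv = 2.0 * cutoff / steps
    vs = [-cutoff + (i + 0.5) * dv for i in range(steps)]
    return sum(f_tilde(h, y - 0.5 * h * v, v) for v in vs) * dv

def closed_form(h, y):
    """Exact Gaussian integral of exp(-(y - a v)^2 - v^2) over v, with a = hbar/2."""
    a2 = 0.25 * h * h
    return math.sqrt(math.pi / (1.0 + a2)) * math.exp(-y * y / (1.0 + a2))

y = 0.3
values = [integral_x(h, y) for h in (0.5, 0.1, 0.001)]
limit = integral_v(0.0, y)
print(values, limit)
```

Without the factor ħ⁻¹ the x-integral would scale like ħ and tend to zero, so it could not match the value at ħ = 0.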

We now define C\*-algebras *C*<sup>∗</sup>(*G*) and *C*<sup>∗</sup><sub>*r*</sub>(*G*), which depend on the choice of a left Haar system on *G*, although different choices lead to isomorphic C\*-algebras. We start from *C*<sup>∞</sup><sub>*c*</sub>(*G*), on which we define a convolution product and an involution by

$$f \ast g(x) = \int\_{G^{s(x)}} d\mu^{s(x)}(y) f(xy) g(y^{-1});\tag{C.473}$$

$$f^\*(\mathbf{x}) = \overline{f(\mathbf{x}^{-1})}.\tag{\mathbf{C.474}}$$
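For a finite pair groupoid the operations (C.473) - (C.474) reduce to familiar matrix operations. The sketch below assumes the pair groupoid on a three-point set with conventions s(x, y) = y, t(x, y) = x and counting measure on each t-fiber (assumptions for illustration only); functions on the groupoid are stored as dictionaries keyed by arrows, and the values of `f` and `g` are hypothetical.

```python
M = [0, 1, 2]  # base space; arrows are pairs (a, b)

def s(a): return a[1]                     # source (assumed convention)
def inv(a): return (a[1], a[0])           # inverse arrow
def mult(a, b): return (a[0], b[1])       # product, defined when s(a) == t(b)
def t_fiber(w): return [(w, c) for c in M]  # G^w = t^{-1}(w)

def convolve(f, g):
    """(C.473) with counting measure: f*g(x) = sum over y in G^{s(x)} of f(xy) g(y^{-1})."""
    return {x: sum(f[mult(x, y)] * g[inv(y)] for y in t_fiber(s(x))) for x in f}

def star(f):
    """(C.474): f^*(x) = conjugate of f(x^{-1})."""
    return {x: f[inv(x)].conjugate() for x in f}

# Hypothetical test functions with integer real and imaginary parts.
f = {(a, b): complex(a + 2 * b, a - b) for a in M for b in M}
g = {(a, b): complex(a * b, 1 + b) for a in M for b in M}

fg = convolve(f, g)
# The same product computed as a matrix product of (f(a, b)) and (g(a, b)).
matrix_product = {(a, b): sum(f[(a, c)] * g[(c, b)] for c in M) for a in M for b in M}
print(fg == matrix_product)
```

In this finite model the convolution is matrix multiplication and the involution is the conjugate transpose.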

We then define a C\*-algebra *C*<sup>∗</sup>(*G*) as the completion of *C*<sup>∞</sup><sub>*c*</sub>(*G*) in the norm

$$\|f\| = \sup\{\|\pi(f)\|\},\tag{C.475}$$

where the supremum is over all Hilbert space representations π of *C*<sup>∞</sup><sub>*c*</sub>(*G*) that satisfy

$$\|\pi(f)\| \le \|f\|\_1 \equiv \max\{\|f\|\_1^{(s)}, \|f\|\_1^{(t)}\},\tag{C.476}$$

where the canonical *L*<sup>1</sup>-norm on the right-hand side is defined by

$$\|f\|\_{1}^{(s)} = \sup\_{u\in M} \int\_{G\_{u}} d\mu\_{u}(y) |f(y)|; \quad \|f\|\_{1}^{(t)} = \sup\_{u\in M} \int\_{G^{u}} d\mu^{u}(y) |f(y)|. \tag{C.477}$$

A more tractable possibility is to limit these representations to a selected class, such as the following one. Further to the *t*-fiber (C.463), we denote the *s*-fibers of *G* by


$$G\_{u} = s^{-1}(u),\tag{C.478}$$

which carries a canonical measure

$$d\mu\_{u}(x) = d\mu^{u}(x^{-1}).\tag{C.479}$$

This leads to Hilbert spaces

$$H\_{u} = L^2(G\_{u}, \mu\_{u}),\tag{C.480}$$

on which *C*<sup>∞</sup><sub>*c*</sub>(*G*) can be represented through the formula

$$\pi\_{u}(f)\psi(x) = \int\_{G^{u}} d\mu^{u}(y) f(xy) \psi(y^{-1}) \ (\psi \in H\_{u}, x \in G\_{u}, y \in G^{u}).\tag{C.481}$$

Such representations automatically satisfy the bound (C.476); restricting the representations $\pi$ in (C.475) to these $\pi_u$, $u \in M$, gives the *reduced groupoid C\*-algebra* $C^*_r(G)$. In other words, $C^*_r(G)$ is the completion of $C_c^\infty(G)$ in the norm

$$\|f\|_r = \sup\{\|\pi_u(f)\|,\ u \in M\}. \tag{C.482}$$

One often has $C^*_r(G) = C^*(G)$, but if $G$ is, for example, a non-compact semisimple Lie group, then the two differ (in which case $C^*_r(G)$ is a quotient of $C^*(G)$). Deferring groups to the next section, the other examples on our list are as follows.

1. For a space $G = M$, the algebraic operations are

$$f \ast g(x) = f(x) g(x); \tag{C.483}$$

$$f^*(x) = \overline{f(x)}, \tag{C.484}$$

from which we obtain

$$C_r^*(M) = C_0(M). \tag{C.485}$$

Indeed, $G_x = \{x\}$, so with $\mu(x) = 1$ for each $x \in M$, we obtain

$$H_x = \mathbb{C}; \tag{C.486}$$

$$\pi_x(f) = f(x), \tag{C.487}$$

and hence $\|f\|_r = \|f\|_\infty$; the completion of $C_c^\infty(M)$ in this norm is $C_0(M)$.

2. A pair groupoid $G = M \times M$, with left Haar system $\mu^u = \mu$ for all $u \in M$, gives

$$f \ast g(u, v) = \int_M d\mu(w)\, f(u, w)\, g(w, v); \tag{C.488}$$

$$f^*(u, v) = \overline{f(v, u)}, \tag{C.489}$$

which of course is reminiscent of the corresponding operations on matrices. Also,

$$H_u = L^2(M, \mu); \tag{C.490}$$

$$\pi_u(f)\psi(v) = \int_M d\mu(w)\, f(v, w)\, \psi(w), \tag{C.491}$$

where we wrote $x = (v, u)$ and $y = (u, w)$, and identified $\psi(v)$ with $\psi(v, u)$. With this identification, the representations $\pi_u$ are the same for each $u$. Using the fact that $C_c^\infty(M \times M)$ is dense in $L^2(M \times M)$ and that integral operators (C.491) of Hilbert–Schmidt type are dense in the compact operators, we obtain

$$C_r^*(M \times M) \cong B_0(L^2(M)). \tag{C.492}$$

3. For a tangent bundle $G = TM$, we have, identifying $T_uM$ with $\mathbb{R}^n$, $n = \dim(M)$,

$$f \ast g(u, v) = \int_{\mathbb{R}^n} d^n w\, f(u, v + w)\, g(u, -w); \tag{C.493}$$

$$f^*(u, v) = \overline{f(u, -v)}, \tag{C.494}$$

where we used local coordinates $(u, v)$ on $TM$. Furthermore, we have

$$H_u = L^2(T_uM) = L^2(\mathbb{R}^n); \tag{C.495}$$

$$\pi_u(f)\psi(v) = \int_{\mathbb{R}^n} d^n w\, f(u, v + w)\, \psi(-w), \tag{C.496}$$

which is diagonalized by a Fourier transform $f \mapsto \hat{f}$ (cf. Theorem C.109), with

$$\hat{f}(u, p) = \int_{\mathbb{R}^n} d^n v\, f(u, v)\, e^{ipv}. \tag{C.497}$$

This map therefore gives an isomorphism

$$C^*(TM) \cong C_0(T^*M). \tag{C.498}$$

4. The (reduced) C\*-algebra of an action groupoid $G = \Gamma \ltimes M$ has operations

$$f \ast g(\gamma, u) = \int_\Gamma d\delta\, f(\gamma\delta, u)\, g(\delta^{-1}, \delta^{-1}\gamma^{-1}u); \tag{C.499}$$

$$f^*(\gamma, u) = \overline{f(\gamma^{-1}, \gamma^{-1}u)}, \tag{C.500}$$

and the special representations $\pi_u$ are given by

$$H_u = L^2(\Gamma); \tag{C.501}$$

$$\pi_u(f)\psi(\gamma) = \int_\Gamma d\delta\, f(\gamma\delta, \gamma u)\, \psi(\delta^{-1}). \tag{C.502}$$

This gives the (reduced) *transformation group C\*-algebra* (see the end of §C.18)

$$C_r^*(\Gamma \ltimes M) = C_r^*(\Gamma, M). \tag{C.503}$$

5. The C\*-algebra $C^*(G^T)$ of a tangent groupoid will be analyzed in §C.19.
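Examples 2 and 3 have finite analogues that can be checked by direct computation. The sketch below is only an illustration under stated assumptions: $M$ is replaced by a finite set with counting measure (so that the pair-groupoid convolution becomes matrix multiplication), and a single fiber $T_uM \cong \mathbb{R}^n$ is replaced by the cyclic group $\mathbb{Z}_n$ (so that the Fourier transform becomes a discrete Fourier transform). All function names are hypothetical and chosen for this example.

```python
import cmath

# --- Example 2: pair groupoid on M = {0, ..., n-1} with counting measure ---

def pair_convolve(f, g):
    """(f * g)(u, v) = sum_w f(u, w) g(w, v), the finite form of (C.488)."""
    n = len(f)
    return [[sum(f[u][w] * g[w][v] for w in range(n)) for v in range(n)]
            for u in range(n)]

def pair_star(f):
    """f^*(u, v) = conj(f(v, u)), the finite form of (C.489)."""
    n = len(f)
    return [[complex(f[v][u]).conjugate() for v in range(n)] for u in range(n)]

def pair_rep(f, psi):
    """(pi(f) psi)(v) = sum_w f(v, w) psi(w), the finite form of (C.491)."""
    n = len(f)
    return [sum(f[v][w] * psi[w] for w in range(n)) for v in range(n)]

f = [[1, 2j, 0], [0, 1, 1], [3, 0, -1j]]
g = [[0, 1, 0], [1j, 0, 2], [1, 1, 1]]
psi = [1, -1, 2j]

# pi(f * g) = pi(f) pi(g), and (f * g)^* = g^* * f^*
homomorphism_ok = pair_rep(pair_convolve(f, g), psi) == pair_rep(f, pair_rep(g, psi))
involution_ok = pair_star(pair_convolve(f, g)) == pair_convolve(pair_star(g), pair_star(f))

# --- Example 3: one fiber of TM, with R^n replaced by the cyclic group Z_n ---

def fiber_convolve(a, b):
    """(a * b)(v) = sum_w a(v + w) b(-w), the discrete form of (C.493)."""
    n = len(a)
    return [sum(a[(v + w) % n] * b[(-w) % n] for w in range(n)) for v in range(n)]

def fourier(a):
    """a^(p) = sum_v a(v) exp(2 pi i p v / n), the discrete form of (C.497)."""
    n = len(a)
    return [sum(a[v] * cmath.exp(2j * cmath.pi * p * v / n) for v in range(n))
            for p in range(n)]

a = [1, 2, 0, -1]
b = [0.5, 0, 1, 3]

# The Fourier transform turns convolution into pointwise multiplication.
lhs = fourier(fiber_convolve(a, b))
rhs = [x * y for x, y in zip(fourier(a), fourier(b))]
diagonal_ok = all(abs(x - y) < 1e-9 for x, y in zip(lhs, rhs))
```

In the first half the convolution algebra is literally a matrix algebra, which is the finite-dimensional shadow of (C.492); in the second half the Fourier transform converts the convolution algebra into functions under pointwise multiplication, which is the mechanism behind (C.498).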

#### C.18 Group C\*-algebras and crossed product algebras

It can be shown that in cases 1–3 above we have $C^*(G) = C^*_r(G)$. It is useful to give a more direct and general construction of both $C^*(G)$ and $C^*_r(G)$ in the case where $G$ is a group or an action groupoid; although the former is a special case of the latter by taking the trivial $G$-action on a point, we treat the group case separately first.

Let $G$ be a Lie group, or, more generally, a locally compact group, which for simplicity we assume to be unimodular (so that it has a left Haar measure $dx$ that is also right invariant). We turn $C_c^\infty(G)$, or, more generally, $C_c(G)$, into an algebra with involution by specializing (C.473) - (C.474) to groups, i.e. (changing $y \mapsto x^{-1}y$),

$$f \ast g(x) = \int_G dy\, f(y)\, g(y^{-1}x); \tag{C.504}$$

$$f^*(x) = \overline{f(x^{-1})}. \tag{C.505}$$

Any unitary representation $u$ of $G$ on a Hilbert space $H$ (assumed strongly continuous, as always) then gives rise to a representation $u^{\int}$ of this ∗-algebra by

$$u^{\int}(f) = \int_G dx\, f(x)\, u(x), \tag{C.506}$$

in that $u^{\int}(f \ast g) = u^{\int}(f)\, u^{\int}(g)$ and $u^{\int}(f^*) = u^{\int}(f)^*$. Let

$$\|f\| = \sup\{\|u^{\int}(f)\|\}, \tag{C.507}$$

where the supremum is over all continuous unitary representations of $G$.
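For a finite group the integrals in (C.504) - (C.506) become sums, and the two identities $u^{\int}(f \ast g) = u^{\int}(f)u^{\int}(g)$ and $u^{\int}(f^*) = u^{\int}(f)^*$ can be verified directly. The following sketch does this for the symmetric group $S_3$ with counting measure and the left-regular representation $u_L(y)\psi(x) = \psi(y^{-1}x)$; the choice of group, the test functions, and all helper names are assumptions made for illustration.

```python
from itertools import permutations

G = list(permutations(range(3)))          # the six elements of S_3

def mul(p, q):
    """Composition of permutations: (p q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0] * 3
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def convolve(f, g):
    """(f * g)(x) = sum_y f(y) g(y^{-1} x), the finite form of (C.504)."""
    return {x: sum(f[y] * g[mul(inv(y), x)] for y in G) for x in G}

def star(f):
    """f^*(x) = conj(f(x^{-1})), as in (C.505)."""
    return {x: complex(f[inv(x)]).conjugate() for x in G}

def rep(f, psi):
    """Apply sum_y f(y) u_L(y) to psi, the finite form of (C.506)."""
    return {x: sum(f[y] * psi[mul(inv(y), x)] for y in G) for x in G}

def inner(phi, psi):
    """Inner product on l^2(G), conjugate-linear in the first slot."""
    return sum(complex(phi[x]).conjugate() * psi[x] for x in G)

f = dict(zip(G, [1, 2, 0, 1j, -1, 1]))
g = dict(zip(G, [1, 1j, -1, -1j, 2, 0]))
phi = dict(zip(G, [0, 1, 1, 2j, -1, 1]))
psi = dict(zip(G, [1, 0, 2, -1, 1j, 3]))

# Multiplicativity: the operator of f * g equals the product of operators.
left = rep(convolve(f, g), psi)
right = rep(f, rep(g, psi))
multiplicative = all(abs(left[x] - right[x]) < 1e-9 for x in G)

# Involution: <phi, u(f^*) psi> = <u(f) phi, psi>.
adjoint_ok = abs(inner(phi, rep(star(f), psi)) - inner(rep(f, phi), psi)) < 1e-9
```

Since $S_3$ is finite (hence compact), the regular representation already detects the full norm here, so this toy case does not distinguish $C^*(G)$ from $C^*_r(G)$.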

Definition C.119. *The* group C\*-algebra $C^*(G)$ *of $G$ is the closure of $C_c^\infty(G)$ or $C_c(G)$ in the norm* (C.507)*. The* reduced group C\*-algebra $C^*_r(G)$ *of $G$ is the closure of $C_c^\infty(G)$ or $C_c(G)$ in the norm*

$$\|f\|_r = \|u_L^{\int}(f)\|, \tag{C.508}$$

*where $u_L$ is the left-regular representation $u_L(G)$ on $H = L^2(G)$, cf.* (7.52)*.*

The relationship between the two group C\*-algebras is given by

$$C_r^*(G) \cong u_L^{\int}(C^*(G)) \cong C^*(G) / \ker\left(u_L^{\int}\right). \tag{C.509}$$

Definition C.120. *A unitary representation $u_1$ is* weakly contained *in $u_2$ if $\|u_1^{\int}(f)\| \le \|u_2^{\int}(f)\|$ for all $f \in C_c(G)$. If every unitary representation of $G$ is weakly contained in $u_L$, and hence $\ker u_L^{\int} = \{0\}$ and $C^*_r(G) \cong C^*(G)$, we call $G$* amenable*.*

It can be shown that $G$ is amenable iff the commutative C\*-algebra $C_b(G)$ of bounded continuous functions on $G$ with sup-norm has a left-invariant state $\omega$, i.e.,

$$\omega(L_y f) = \omega(f) \quad (y \in G,\ f \in C_b(G)). \tag{C.510}$$

Here $L_y f(x) = f(y^{-1}x)$ as usual. This is the case, for example, for all compact groups, all abelian groups, and all solvable groups (and semi-direct products thereof, like the Euclidean group). Non-compact semi-simple Lie groups, like $SL_n(\mathbb{R})$ or the Lorentz group, are not amenable; the same holds for e.g. the Poincaré group.
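In the simplest case of a finite group, a left-invariant state as in (C.510) is given by averaging over the group. The sketch below checks this for the cyclic group $\mathbb{Z}_5$, written additively; the group, the sample function and the names `L` and `omega` are assumptions chosen for this illustration.

```python
from fractions import Fraction

n = 5
G = range(n)                      # the cyclic group Z_5, written additively

def L(y, f):
    """Left translation: (L_y f)(x) = f(y^{-1} x) = f(x - y mod n)."""
    return [f[(x - y) % n] for x in G]

def omega(f):
    """Normalized counting measure: the average of f over the group."""
    return Fraction(sum(f), n)

f = [3, -1, 4, 1, 5]

# Invariance (C.510), normalization, and positivity on a non-negative function.
invariant = all(omega(L(y, f)) == omega(f) for y in G)
normalized = omega([1] * n) == 1
positive = omega([x * x for x in f]) >= 0
```

For infinite amenable groups no such explicit formula is available in general; the invariant state exists but is typically not given by a measure on $G$.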

By construction, there is a bijective correspondence $u \leftrightarrow u^{\int}$ between unitary representations of $G$ and non-degenerate representations of $C^*(G)$ (which restricts to a bijection between unitary representations of $G$ that are weakly contained in $u_L$ and non-degenerate representations of $C^*_r(G)$). In one direction, this is given by (C.506), whilst in the other, one first decomposes $u^{\int} \equiv \rho$ as a direct sum of cyclic representations with cyclic vectors $\Omega_i$, and then, for each $\Omega_i$ in the sum, puts

$$u(x)\rho(f)\Omega_i = \rho(L_x f)\Omega_i. \tag{C.511}$$

Now take any C\*-algebra $A$ on which $G$ acts, in that there is a continuous group homomorphism $\alpha : G \to \mathrm{Aut}(A)$, i.e., for each $x \in G$ we have an invertible homomorphism $\alpha_x : A \to A$ such that $\alpha_x \circ \alpha_y = \alpha_{xy}$ and $\alpha_e = \mathrm{id}_A$ (or, equivalently, $\alpha_x^{-1} = \alpha_{x^{-1}}$), and for each $a \in A$, the function $x \mapsto \alpha_x(a)$ from $G$ to $A$ is continuous. We turn the space $C_c(G, A)$ into a ∗-algebra by generalizing (C.504) - (C.505) to

$$f \ast g(x) = \int_G dy\, f(y)\, \alpha_y(g(y^{-1}x)); \tag{C.512}$$

$$f^*(x) = \alpha_x(f(x^{-1})^*). \tag{C.513}$$

We construct representations of $C_c(G, A)$ as a ∗-algebra from pairs $(u(G), \pi(A))$, where $u$ is a unitary representation of $G$, and $\pi$ is a representation of $A$ (both defined on the same Hilbert space $H$) that satisfy the *covariance condition*

$$\pi(\alpha_x(a)) = u(x)\pi(a)u(x)^*. \tag{C.514}$$

Writing $\pi \rtimes u^{\int}$ for the associated representation of $C_c(G, A)$, we put

$$\pi \rtimes u^{\int}(f) = \int_G dx\, \pi(f(x))\, u(x), \tag{C.515}$$

and define

$$\|f\| = \sup\{\|\pi \rtimes u^{\int}(f)\|\}, \tag{C.516}$$

where the supremum runs over all pairs $(u(G), \pi(A))$ satisfying (C.514). The closure $C^*(G, A, \alpha)$ of $C_c(G, A)$ in this norm is a C\*-algebra called the *crossed product* or *covariance algebra* defined by $G$, $A$, and $\alpha$. Once again, by construction there is a bijective correspondence $(u, \pi) \leftrightarrow \pi \rtimes u^{\int}$ between pairs $(u, \pi)$ satisfying (C.514) and non-degenerate representations $\pi \rtimes u^{\int} \equiv \rho$ of $C^*(G, A, \alpha)$, in one direction given by (C.515), and in the other by

$$u(x)\rho(f)\Omega_i = \rho(\alpha_x(L_x f))\Omega_i; \tag{C.517}$$

$$\pi(a)\rho(f)\Omega_i = \rho(af)\Omega_i. \tag{C.518}$$

Here $\alpha_x(L_x f) \in C_c(G, A)$ is the function $y \mapsto \alpha_x(f(x^{-1}y))$; similarly, $af \in C_c(G, A)$ is given by $y \mapsto a f(y)$, and the cyclic vectors $\Omega_i$ are defined as in (C.511).

To construct a reduced crossed product, we take any injective representation $\pi_r(A)$ on some Hilbert space $K$, and from it construct a new Hilbert space

$$H = L^2(G, K) \cong L^2(G) \otimes K, \tag{C.519}$$

consisting of all measurable functions $\psi : G \to K$ for which $\int_G dx\, \|\psi(x)\|_K^2 < \infty$, with

$$\langle \varphi, \psi \rangle = \int_G dx\, \langle \varphi(x), \psi(x) \rangle_K \tag{C.520}$$

as the inner product. This Hilbert space $H$ carries a covariant pair $(u(G), \pi(A))$, viz.

$$u(y)\psi(x) = \psi(y^{-1}x); \tag{C.521}$$

$$\pi(a)\psi(x) = \pi_r(\alpha_{x^{-1}}(a))\psi(x), \tag{C.522}$$

and hence an associated representation $\pi \rtimes u^{\int}$ of $C_c(G, A)$ given by (C.515), which by continuity extends to a representation $\rho_r$ of $C^*(G, A, \alpha)$. As in the group case, we define $C^*_r(G, A, \alpha)$ as the closure of $C_c(G, A)$ in the norm $\|f\|_r = \|\rho_r(f)\|$, or as

$$C_r^*(G, A, \alpha) = \rho_r(C^*(G, A, \alpha)). \tag{C.523}$$

If $G$ is amenable, we once again have $C^*_r(G, A, \alpha) = C^*(G, A, \alpha)$, as for $C^*_r(G)$.

The main case of interest to us is given by a group action of $G$ on $Q$, as above, which gives rise to a crossed product $C^*(G, C_0(Q), \alpha) \equiv C^*(G, Q)$ through the choices

$$A = C_0(Q); \tag{C.524}$$

$$\alpha_x(\tilde{f}) = L_x \tilde{f}, \tag{C.525}$$

i.e., $\alpha_x(\tilde{f})(q) = \tilde{f}(x^{-1}q)$. The (reduced) crossed product $C^*_{(r)}(G, Q)$, then, is the same as the (reduced) C\*-algebra of the action groupoid $G \ltimes Q$. Identifying the spaces $C_c(G \times Q)$ and $C_c(G, C_c(Q))$, eqs. (C.512) - (C.513) now become

$$f \ast g(x, q) = \int_G dy\, f(y, q)\, g(y^{-1}x, y^{-1}q); \tag{C.526}$$

$$f^*(x, q) = \overline{f(x^{-1}, x^{-1}q)}. \tag{C.527}$$

The obvious candidate for a faithful representation of $C_0(Q)$ comes from a measure $\nu$ on $Q$ with support $Q$, so that we may take $K = L^2(Q, \nu)$ and $\pi_r(\tilde{f}) = m_{\tilde{f}}$, i.e., $\pi_r(\tilde{f})\psi = \tilde{f}\psi$, $\tilde{f} \in C_0(Q)$. Identifying $L^2(G) \otimes L^2(Q)$ with $L^2(G \times Q)$, this yields

$$u(y)\psi(x, q) = \psi(y^{-1}x, q); \tag{C.528}$$

$$\pi(\tilde{f})\psi(x, q) = \tilde{f}(xq)\psi(x, q); \tag{C.529}$$

$$\rho_r(f)\psi(x, q) = \int_G dy\, f(y, xq)\, \psi(y^{-1}x, q). \tag{C.530}$$
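The covariance condition (C.514) for the pair defined by (C.521) - (C.522), with the action (C.525), can be tested numerically in a finite toy model. The sketch below takes $G = Q = \mathbb{Z}_3$ acting on itself by translation, with counting measures; these choices and all function names are assumptions for illustration only. Note that (C.522) combined with (C.525) gives $\pi(a)\psi(x, q) = a(xq)\psi(x, q)$, which is what `pi` computes.

```python
n = 3
G = range(n)   # the cyclic group Z_3, written additively
Q = range(n)   # the space it acts on, by translation

def inv(x):
    """Group inverse in Z_n."""
    return (-x) % n

def act(x, q):
    """The action of x on the point q."""
    return (x + q) % n

def alpha(x, a):
    """(alpha_x a)(q) = a(x^{-1} q), as in (C.525)."""
    return [a[act(inv(x), q)] for q in Q]

def u(y, psi):
    """(u(y) psi)(x, q) = psi(y^{-1} x, q), as in (C.521)."""
    return {(x, q): psi[((inv(y) + x) % n, q)] for x in G for q in Q}

def pi(a, psi):
    """(pi(a) psi)(x, q) = (alpha_{x^{-1}} a)(q) psi(x, q), as in (C.522)."""
    return {(x, q): alpha(inv(x), a)[q] * psi[(x, q)] for x in G for q in Q}

a = [1, 5, -2]                                        # a function on Q
psi = {(x, q): 3 * x + q + 1 for x in G for q in Q}   # a vector in l^2(G x Q)

# Covariance (C.514): pi(alpha_y(a)) = u(y) pi(a) u(y)^*, with u(y)^* = u(y^{-1}).
covariant = all(
    pi(alpha(y, a), psi) == u(y, pi(a, u(inv(y), psi)))
    for y in G
)
```

Checking the identity on a single vector `psi` with pairwise distinct entries is of course weaker than an operator identity, but it is enough to catch a wrong sign or a misplaced inverse in the formulas.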

#### C.19 Continuous bundles of C\*-algebras

As shown in Chapter 7, continuous bundles of C\*-algebras form a mathematical bridge between the classical and the quantum worlds, but they also form a beautiful structure in their own right. In what follows, $I$ is an arbitrary locally compact Hausdorff space, but in the main text it is a subset of the unit interval $[0,1]$ that always contains 0 as an accumulation point, so one may have e.g. $I = [0,1]$ itself, or

$$I = (1/\mathbb{N}) \cup \{0\} \equiv 1/\dot{\mathbb{N}}, \tag{C.531}$$

where $\mathbb{N} = \{1, 2, \ldots\}$. In physics, $I$ plays the role of the value set for Planck's constant, but also below we generically write $\hbar \in I$, if only to avoid notational confusion with $x \in X$ (as $C_0(X)$ will be a typical fiber of the continuous bundles we study).

Definition C.121. *Let $I$ be a locally compact Hausdorff space. A* continuous bundle of C\*-algebras *over $I$ consists of a C\*-algebra $A$, a collection of C\*-algebras $(A_\hbar)_{\hbar \in I}$, and surjective homomorphisms $\varphi_\hbar : A \to A_\hbar$ for each $\hbar \in I$, such that:*

*1. The function $\hbar \mapsto \|\varphi_\hbar(a)\|_\hbar$ is in $C_0(I)$ for each $a \in A$.*

*2. Writing $\|\cdot\|_\hbar$ for the norm in $A_\hbar$, the norm of any $a \in A$ is given by*

$$\|a\| = \sup_{\hbar \in I} \|\varphi_\hbar(a)\|_\hbar. \tag{C.532}$$

*3. For any $f \in C_0(I)$ and $a \in A$, there is an element $fa \in A$ such that for each $\hbar \in I$,*

$$\varphi_\hbar(fa) = f(\hbar)\varphi_\hbar(a). \tag{C.533}$$

*A* continuous (cross-) section *of the bundle in question is a map $\hbar \mapsto a(\hbar) \in A_\hbar$, $\hbar \in I$, for which there is an $a \in A$ such that $a(\hbar) = \varphi_\hbar(a)$ for each $\hbar \in I$.*

Thus $A$ may be identified with the space of continuous sections of the bundle: if we do so, the homomorphism $\varphi_\hbar$ is just the evaluation map at $\hbar$. The structure of $A$ as a C\*-algebra then corresponds to pointwise operations on sections. The idea is that the family $(A_\hbar)_{\hbar \in I}$ of C\*-algebras is glued together by specifying a topology on the disjoint union $\bigsqcup_{\hbar \in I} A_\hbar$, seen as a fibre bundle over $I$. However, this topology is in fact given rather indirectly, namely via the specification of the space of continuous sections. This is reminiscent of Theorem C.23, which specifies the topology on a locally compact Hausdorff space $X$ via the C\*-algebra $C_0(X)$. More generally (the previous case being the trivial vector bundle $E = X \times \mathbb{C}$), the Serre–Swan Theorem about fiber bundles allows one to reconstruct the topology on a locally trivial vector bundle $E \xrightarrow{\pi} X$ from the (finitely generated projective) $C_0(X)$-module $C_0(X, E)$ of continuous sections of $E$. As in Definition C.121, one has maps $\varphi_x : C_0(X, E) \to E_x$ given by evaluation at $x$, so that (C.533) holds. However, *continuous bundles of C\*-algebras need not be locally trivial*; for us, this is even the whole point!

Another way of looking at continuous bundles of C\*-algebras starts from a *nondegenerate* homomorphism $\varphi$ from $C_0(I)$ to the center $Z(M(A))$ of the multiplier algebra $M(A)$ of $A$ (see §C.10); we simply write $fa$ for $\varphi(f)a$, and similarly $C_0(I)A$. In this notation, nondegeneracy means that $C_0(I)A$ is dense in $A$. Given such a nondegenerate homomorphism $\varphi : C_0(I) \to Z(M(A))$, one may define fiber algebras by

$$A_\hbar = A / (C_0(I; \hbar) \cdot A); \tag{C.534}$$

$$C_0(I; \hbar) = \{ f \in C_0(I) \mid f(\hbar) = 0 \}; \tag{C.535}$$

since $C_0(I; \hbar) \cdot A$ is an ideal in $A$, the quotient $A_\hbar$ is a C\*-algebra. The projections $\varphi_\hbar : A \to A_\hbar$ are then given by the corresponding quotient maps sending $a \in A$ to its equivalence class in $A_\hbar$. In general, the function $\hbar \mapsto \|\varphi_\hbar(a)\|_\hbar$ is merely upper semicontinuous, so that one only obtains a structure equivalent to the one described in Definition C.121 if one explicitly requires the above function to be in $C_0(I)$, in which case clause 2 of Definition C.121 follows, too.

It is easy to find "trivial" examples of continuous bundles of C\*-algebras: fix some C\*-algebra $B$ and take $A = C_0(I, B)$ with pointwise operations. In that case, $A_\hbar = B$ for each $\hbar \in I$, and the map $\varphi_\hbar : A \to B$ is given by $\varphi_\hbar(a) = a(\hbar)$.

It is not so easy to find nontrivial examples, even with isomorphic fibers (these were first given by Dixmier and Douady, who took the fiber algebras to be the compact operators $B_0(H)$). To connect classical to quantum, we need bundles over $I \subseteq [0,1]$ as described above, with non-isomorphic fibers, of which the fiber $A_0$ above $\hbar = 0$ is isomorphic to $C_0(X)$ for some (locally compact) phase space $X$, and hence is commutative, whereas all other fibers are noncommutative. One might say that it is the job of (deformation) quantization theory to construct such fields. Without proof, we now describe the main class of examples relevant to physics.

As we have seen, each Lie groupoid $G$ canonically defines an associated C\*-algebra $C^*_r(G)$, in which the $C_c^\infty$ functions on $G$, endowed with a generalized convolution product (C.473) and involution (C.474), form a dense subspace. In particular,

$$\mathbf{C}\_r^\*(TM) \cong \mathbf{C}\_0(T^\*M);\tag{\text{C.536}}$$

$$\mathcal{C}\_r^\*(M \times M) \cong \mathcal{B}\_0(L^2(M)),\tag{C.537}$$

where $M$ is a manifold (without boundary) with tangent bundle $TM$ and cotangent bundle $T^*M$. More generally, for any given Lie groupoid $G$ one may define

$$A_0 = C_r^*(N_M G) \quad (\hbar = 0); \tag{C.538}$$

$$A_\hbar = C_r^*(G) \quad (\hbar > 0), \tag{C.539}$$

where $N_M G$ is the normal bundle to the embedding $M \hookrightarrow G$, cf. (C.444). Now consider the tangent groupoid $G^T$, which is a bundle over $[0,1]$ with fibers

$$G_0^T = N_M G \quad (\hbar = 0); \tag{C.540}$$

$$G_\hbar^T = G \quad (\hbar > 0). \tag{C.541}$$

The interplay between the differential geometry of the tangent groupoid and the notion of (reduced) Lie groupoid C\*-algebras is described by the following lemma.

Lemma C.122. *The map $C_c^\infty(G^T) \to C_c^\infty(G^T_\hbar)$ that restricts $f$ to $G^T_\hbar \subset G^T$ continuously extends to a surjective homomorphism $\varphi_\hbar : C_r^*(G^T) \to C_r^*(G^T_\hbar)$, $\hbar \in [0,1]$.*

Various special cases and this lemma ultimately led to the key result of the 1990s:

Theorem C.123. *For any Lie groupoid $G$, the fibers* (C.538) *-* (C.539) *merge into a continuous bundle of C\*-algebras over $I = [0,1]$ with total algebra $A = C_r^*(G^T)$ and homomorphisms $\varphi_\hbar : A \to A_\hbar$ as described in Lemma C.122.*

The same result holds for the full groupoid C\*-algebras $C^*(G^T)$ and $C^*(G^T_\hbar)$.

For the pair groupoid $G = \mathbb{R}^n \times \mathbb{R}^n$, as in the argument (C.469) - (C.472) we take some $\check{f} \in C_c^\infty(T\mathbb{R}^n)$, seen as a function $\tilde{f} \in C^\infty([0,1] \times T\mathbb{R}^n)$ that is independent of $\hbar$. This yields a function $\tilde{f} \circ \phi^{-1} \in C_c^\infty(G^T)$, and by construction,

$$\tilde{f} \circ \phi^{-1}(0, x, v) = \check{f}(x, v); \tag{C.542}$$

$$\tilde{f} \circ \phi^{-1}(\hbar, x, y) = \check{f}\left(\frac{x + y}{2}, \frac{y - x}{\hbar}\right) \quad (\hbar > 0). \tag{C.543}$$

By Lemma C.122, the function $\varphi_0(\tilde{f} \circ \phi^{-1})$ is an element of

$$A_0 = C_r^*(T\mathbb{R}^n); \tag{C.544}$$

this element is just the function $\check{f}$. For $\hbar > 0$, we see $\varphi_\hbar(\tilde{f} \circ \phi^{-1})$ as an element of

$$A_\hbar \cong B_0(L^2(\mathbb{R}^n)), \tag{C.545}$$

through (C.490) - (C.491). Calling this element $Q^W_\hbar(\check{f})$, we have

$$Q^W_\hbar(\check{f})\psi(x) = \hbar^{-n} \int_{\mathbb{R}^n} d^n y\, \check{f}\left(\frac{x + y}{2}, \frac{y - x}{\hbar}\right) \psi(y). \tag{C.546}$$

We now use the isomorphism (C.536), implemented through the Fourier transform

$$f(x, p) = \int_{\mathbb{R}^n} d^n v\, \check{f}(x, v)\, e^{ipv}; \tag{C.547}$$

$$\check{f}(x, v) = \int_{\mathbb{R}^n} \frac{d^n p}{(2\pi)^n}\, f(x, p)\, e^{-ipv}. \tag{C.548}$$

Hence as an element of $C_0(T^*\mathbb{R}^n)$, the operator $\varphi_0(\tilde{f} \circ \phi^{-1})$ is $f$. From this perspective, using (C.548), eq. (C.546) may be rewritten in the more familiar form

$$Q^W_\hbar(f)\psi(x) = \int_{T^*\mathbb{R}^n} \frac{d^n p\, d^n y}{(2\pi\hbar)^n}\, e^{ip(x - y)/\hbar}\, \psi(y)\, f\left(\tfrac{1}{2}(x + y), p\right). \tag{C.549}$$

It follows that any $\check{f} \in C_c^\infty(T\mathbb{R}^n)$ defines a continuous cross-section of the continuous bundle of C\*-algebras defined by $A = C^*((\mathbb{R}^n \times \mathbb{R}^n)^T)$, given by (C.547), and

$$0 \mapsto f \in C_0(T^*\mathbb{R}^n); \tag{C.550}$$

$$\hbar \mapsto Q^W_\hbar(f) \in B_0(L^2(\mathbb{R}^n)). \tag{C.551}$$

See also §7.1. These formulae were written down for the special case $M = \mathbb{R}^n$, but similar results (based on the exponential map as defined in Riemannian geometry) apply to any manifold. Moreover, as explained in §§7.2–7.4, Mackey's theory of quantization based on systems of imprimitivity and induced group representations falls squarely under the above umbrella, where $G$ is an action groupoid.

We also employ continuous bundles of C\*-algebras with non-isomorphic fibers even away from $\hbar = 0$. The construction of these fields relies on the following result, which is a special case of a more general claim; we just state the case we need, in which $I = 1/\dot{\mathbb{N}}$; continuity then imposes conditions at $\hbar = 0$ only (as $I$ is discrete elsewhere). We identify the total space $A$ of a (continuous) bundle of C\*-algebras with the space of its (continuous) sections, as explained at the beginning of this section; thus $a \in A \subset \prod_\hbar A_\hbar$ takes the form $a = \{a_\hbar\}_{\hbar \in I}$, $a_\hbar \in A_\hbar$.

Proposition C.124. *Suppose one has a family $\{A_\hbar\}_{\hbar \in I}$ of C\*-algebras over $I = 1/\dot{\mathbb{N}}$, as well as a subset $\tilde{A} \subset \prod_\hbar A_\hbar$ that satisfies the following conditions:*


*Let $A$ consist of all $a \in \prod_\hbar A_\hbar$ for which one has*

$$\lim_{N \to \infty} \|a_{1/N} - \tilde{a}_{1/N}\| = \|a_0 - \tilde{a}_0\| \quad (\tilde{a} \in \tilde{A}). \tag{C.552}$$

*Regard $A$ as a C\*-algebra under pointwise operations and norm* (C.532)*, and define*

$$\varphi_\hbar(a) = a_\hbar. \tag{C.553}$$

*Then $(A, \{A_\hbar, \varphi_\hbar\}_{\hbar \in I})$ is a continuous bundle of C\*-algebras (and is the unique such bundle whose space of sections contains $\tilde{A}$).*

The proof relies on the following lemma (which we state for general compact $I$).

Lemma C.125. *The total C\*-algebra $A$ of (sections of) a continuous bundle of C\*-algebras is* locally uniformly closed*. That is, if $a \in \prod_\hbar A_\hbar$ is such that for every $\hbar_0 \in I$ and every $\varepsilon > 0$, there exists $b^{\hbar_0} \in A$ and a neighborhood $\mathcal{N}$ of $\hbar_0$ in which $\|a_\hbar - b^{\hbar_0}_\hbar\| < \varepsilon$ for all $\hbar \in \mathcal{N}$, then $a \in A$.*

*Equivalently, if $A$ (etc.) is a continuous bundle of C\*-algebras, and $a \in \prod_\hbar A_\hbar$ is such that the function $\hbar \mapsto \|a_\hbar - b_\hbar\|$ lies in $C(I)$ for each $b \in A$, then $a \in A$.*

*Proof.* Since $I$ is compact, it has a finite cover $\{U_1, \ldots, U_n\}$ with associated partition of unity $\{u_i\}$. With $a$ and $\varepsilon$ as in the lemma, take $\hbar_i \in U_i$ and $b^{\hbar_i}$ also as in the lemma, and define $b = \sum_i u_i b^{\hbar_i}$. Then $b$ satisfies $\sup_{\hbar \in I} \|a_\hbar - b_\hbar\| < \varepsilon$, and also $b \in A$, because of Definition C.121.3. Hence $a \in A$ by Definition C.121.2 and completeness of $A$.

As to the equivalent version, given $a \in \prod_\hbar A_\hbar$ and $\hbar_0 \in I$, because $\varphi_{\hbar_0}$ is surjective, there is a $b^{\hbar_0} \in A$ such that $a_{\hbar_0} = b^{\hbar_0}_{\hbar_0}$. The assumption in the second part then implies that the conditions in the first part are satisfied, such that $a \in A$. $\Box$

We are now in a position to prove Proposition C.124.

*Proof.* We first show that $A$ as defined in the proposition is locally uniformly closed. With the notation of Lemma C.125 and its proof, take $\tilde{a} \in \tilde{A}$, and define the functions

$$f_{a\tilde{a}} : \hbar \mapsto \|a_\hbar - \tilde{a}_\hbar\|; \tag{C.554}$$

$$f_{b\tilde{a}} : \hbar \mapsto \|b_\hbar^{\hbar_0} - \tilde{a}_\hbar\|. \tag{C.555}$$

Since $\bigl|\,\|X\| - \|Y\|\,\bigr| \le \|X - Y\|$, one obtains

$$|f_{a\tilde{a}}(\hbar) - f_{b\tilde{a}}(\hbar)| < \varepsilon, \tag{C.556}$$

for all $\hbar$ in the neighborhood $\mathcal{N}$ of $\hbar_0$ provided by Lemma C.125. By assumption, $f_{b\tilde{a}}$ is continuous, so that

$$|f_{b\tilde{a}}(\hbar) - f_{b\tilde{a}}(\hbar_0)| < \varepsilon, \tag{C.557}$$

for all $\hbar$ in some neighborhood $U \subseteq \mathcal{N}$ of $\hbar_0$. Combining the two inequalities (using the first one at both $\hbar$ and $\hbar_0$) yields

$$|f_{a\tilde{a}}(\hbar) - f_{a\tilde{a}}(\hbar_0)| < 3\varepsilon, \tag{C.558}$$

for all $\hbar \in U$. Hence $f_{a\tilde{a}}$ is continuous at any $\hbar_0 \in I$, so that $a \in A$ by Lemma C.125.

Using this property, it is easily shown that $A$ is a C\*-algebra, and that condition 3 in Definition C.121 is satisfied. It is clear from Definition C.121.1 and the definition of $A$ in the proposition that $A$ is maximal. On the other hand, according to the second part of Lemma C.125, $A$ is minimal, so that it is unique. $\Box$

To close, let us explain to what extent we can say that a given section $(a_{1/N})_N$ of either one of our continuous bundles $A^{(c)}$ or $A^{(q)}$ "converges" to its value $a_0$.

Proposition C.126. *Let $(a_0, a_{1/N})$ and $(a'_0, a'_{1/N})$ be continuous cross-sections of some continuous bundle $A$ of C\*-algebras over $I = 1/\dot{\mathbb{N}}$, such that*

$$\lim_{N \to \infty} \|a'_{1/N} - a_{1/N}\| = 0. \tag{C.559}$$

*Then $a'_0 = a_0$. In particular, if $(a_0, a_{1/N})$ is a continuous cross-section, then $a_0$ is uniquely determined by the $(a_{1/N})$ and we may symbolically write*

$$a_0 = \lim_{N \to \infty} a_{1/N}. \tag{C.560}$$

*Proof.* The last part of Lemma C.125 states that the function defined by

$$\begin{aligned} 0 &\mapsto \|a_0 - a'_0\|; \\ 1/N &\mapsto \|a_{1/N} - a'_{1/N}\|, \end{aligned}$$

is continuous on $1/\dot{\mathbb{N}}$ (i.e., continuous at 0). By (C.559) its value at 0 is therefore $\|a_0 - a'_0\| = 0$. $\Box$

#### C.20 von Neumann algebras and the σ-weak topology

In this section and in §C.24 we turn to special classes of C\*-algebras that are occasionally used in quantum (field) theory. Since the arguments tend to become very lengthy and technical, we will only prove some key results (e.g. von Neumann's Double Commutant Theorem), and mention other results without proof (references to which may be found in the Notes). This also applies to the next four sections.

The subject of operator algebras historically started with what we now call *von Neumann algebras*, in honour of the founder of the subject (although, curiously, C\*-algebras are not called "Gelfand–Naimark algebras"; perhaps they should!).

The first result in operator algebras was and is the *Double Commutant Theorem*:

Theorem C.127. *Let $M$ be a unital* ∗*-subalgebra of $B(H)$. Then the following conditions are equivalent (and, if satisfied, define $M$ to be a* von Neumann algebra*):*

*(i) $M'' = M$; (ii) $M$ is closed in the weak operator topology; (iii) $M$ is closed in the strong operator topology.*

Recall that the *commutant* $S'$ of any $S \subset B(H)$ is defined by

$$S' = \{ a \in B(H) \mid ab = ba \ \forall b \in S \}, \tag{C.561}$$

and that the *bicommutant* of $S$ is $S'' = (S')'$. If $S^* = S$, in that $a \in S$ iff $a^* \in S$, then $S'$ is easily seen to be a unital ∗-algebra within $B(H)$. Furthermore, it is obvious that $S \subseteq S''$, so that the passage $S \mapsto S''$ is some sort of a closure operation within $B(H)$, comparable to the closure operation $L \mapsto L^{\perp\perp}$ within $H$ itself. Theorem C.127 shows that if $S$ is a unital ∗-algebra, the algebraic closure operation $S \mapsto S''$ coincides with two topological closure operations. To this effect, recall also that:


*Proof.* The essence of the proof is already contained in the finite-dimensional case $H = \mathbb{C}^n$, where the nontrivial claim in Theorem C.127 is:

$$\text{If } M \text{ is a unital } {}^*\text{-subalgebra of } M_n(\mathbb{C}), \text{ then } M'' = M.$$

In fact, all we need to prove is $M'' \subseteq M$, since the converse inclusion is obvious. The idea is to take $n$ arbitrary (and hence possibly linearly independent) vectors $\upsilon_1, \ldots, \upsilon_n$ in $H$, and, given $a \in M''$, find some $b \in M$ such that $a\upsilon_i = b\upsilon_i$ for all $i = 1, \ldots, n$. Hence $a = b$, so $a \in M$. To this end, we start with a single vector $\upsilon \in H$.

Form the linear subspace $M\upsilon = \{m\upsilon \mid m \in M\}$ of $H$, with associated projection $e$ (i.e. $ew = w$ if $w \in M\upsilon$ and $ew = 0$ if $w \in (M\upsilon)^\perp$). Then $e \in M'$, and hence $a \in M''$ commutes with $e$. Since $1_H \in M$, we have $\upsilon \in M\upsilon$, so $\upsilon = e\upsilon$, and we compute $a\upsilon = ae\upsilon = ea\upsilon \in M\upsilon$. Hence $a\upsilon = b\upsilon$, for some $b \in M$.

Now run the same argument with the following substitutions:

• *H* ⇝ *H*<sup>*n*</sup> = *H* ⊕ ··· ⊕ *H* (the direct sum of *n* copies of *H*);

• *M* ⇝ *M*<sub>*n*</sub> = {diag(*m*,...,*m*) | *m* ∈ *M*} ⊂ *B*(*H*<sup>*n*</sup>);

• υ ⇝ (υ<sub>1</sub>,...,υ<sub>*n*</sub>) ∈ *H*<sup>*n*</sup>.


We then have (*M*<sub>*n*</sub>)′′ = (*M*′′)<sub>*n*</sub>, so for any matrix **a** = diag(*a*,...,*a*) in (*M*′′)<sub>*n*</sub>, the previous argument yields a matrix **b** = diag(*b*,...,*b*) ∈ *M*<sub>*n*</sub> such that **a**υ = **b**υ, where now υ = (υ<sub>1</sub>,...,υ<sub>*n*</sub>). But this is *a*υ<sub>*i*</sub> = *b*υ<sub>*i*</sub> for all *i* = 1,...,*n*, so that *a* = *b* and hence *M*′′ ⊆ *M*.

If *H* is infinite-dimensional, the above proof may be adapted by taking the closure of *M*υ in *H*, which gives (iii) ⇒ (i). Finally, (i) ⇒ (ii) ⇒ (iii) is trivial. □
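
The finite-dimensional statement *M*′′ = *M* can be checked by machine in small cases. The following is a minimal sketch, not part of the proof: it computes commutants in *M*<sub>*n*</sub>(ℂ) by exact linear algebra over the rationals. The helper names `null_space`, `commutant` and `bicommutant` are our own, and the two test algebras are chosen for illustration only. Since all matrix entries are rational, the dimension of the rational solution space equals the complex dimension of the commutant.

```python
from fractions import Fraction
from itertools import product


def null_space(rows, ncols):
    """Basis of {x : row . x = 0 for every row}, via exact Gauss-Jordan elimination."""
    rows = [list(map(Fraction, r)) for r in rows]
    pivots = []  # pivots[k] is the pivot column of reduced row k
    r = 0
    for c in range(ncols):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column: it is a free variable
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(ncols) if c not in pivots):
        vec = [Fraction(0)] * ncols
        vec[free] = Fraction(1)
        for k, pc in enumerate(pivots):
            vec[pc] = -rows[k][free]
        basis.append(vec)
    return basis


def commutant(mats, n):
    """Basis of S' = {X in M_n : XA = AX for all A in mats}."""
    rows = []
    for A in mats:
        for i, j in product(range(n), repeat=2):
            # Entry (i, j) of XA - AX, with the unknown X[i][k] stored at index i*n + k.
            row = [Fraction(0)] * (n * n)
            for k in range(n):
                row[i * n + k] += A[k][j]
                row[k * n + j] -= A[i][k]
            rows.append(row)
    return [[v[i * n:(i + 1) * n] for i in range(n)] for v in null_space(rows, n * n)]


def bicommutant(mats, n):
    return commutant(commutant(mats, n), n)


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
p = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
M = [identity, p]   # spans the unital *-algebra {diag(a, a, b)}, of dimension 2
dim_M_commutant = len(commutant(M, 3))
dim_M_bicommutant = len(bicommutant(M, 3))

N = [[0, 1], [0, 0]]
S = [N]             # neither self-adjoint nor unital, spans a space of dimension 1
dim_S_bicommutant = len(bicommutant(S, 2))

print(dim_M_commutant, dim_M_bicommutant, dim_S_bicommutant)  # prints: 5 2 2
```

For the unital ∗-algebra *M* the bicommutant has the same dimension as *M*, in line with the theorem, whereas for the single nilpotent matrix *N* the bicommutant (spanned by 1 and *N*) is strictly larger than the span of *S*, which illustrates the role of *S* → *S*′′ as a closure operation.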

Corollary C.128. *Let M be a unital* ∗*-subalgebra of B*(*H*)*. Then the closures of M in the strong and weak topologies coincide with each other and with M*′′*.*

Corollary C.129. *A von Neumann algebra is norm-closed, i.e., is a C\*-algebra.*

Since *S*′′′ = *S*′, the commutant of any self-adjoint set *S*<sup>∗</sup> = *S* ⊂ *B*(*H*) is a von Neumann algebra. As a case in point, take a (strongly continuous) unitary group representation *u* : *G* → *B*(*H*). Then *u*(*x*)<sup>∗</sup> = *u*(*x*<sup>−1</sup>), so *u*(*G*)′ is a von Neumann algebra. In fact, any von Neumann algebra *M* takes this form, since one may take *G* to be the group of all unitaries in *M*′ (and *u* its defining representation). Furthermore, the bicommutant *A*′′ of any C\*-algebra *A* ⊂ *B*(*H*) is a von Neumann algebra. An important example of this construction is the abelian von Neumann algebra *W*<sup>∗</sup>(*a*) = *C*<sup>∗</sup>(*a*)′′ generated by a self-adjoint operator *a* = *a*<sup>∗</sup> ∈ *B*(*H*), cf. (B.320).

Although the weak and strong topologies on *M* appear in the fundamental double commutant theorem, the most important topology on a von Neumann algebra (besides the norm topology) is the so-called σ*-weak topology* (sometimes called the *ultraweak topology*). This topology corresponds to the following convergence:

• One has *a*<sub>λ</sub> → *a* σ-weakly iff Tr(*b*(*a*<sub>λ</sub> − *a*)) → 0 for each *b* ∈ *B*<sub>1</sub>(*H*).

To begin with, as far as Theorem C.127 is concerned this topology is at least on a par with the weak and the strong ones:

Theorem C.130. *Let M be a unital* ∗*-subalgebra of B*(*H*)*. Then M*′′ = *M (i.e. M is a von Neumann algebra) iff M is closed in the* σ*-weak operator topology.*

This one is a bit more technical, so we just sketch the proof.

*Proof.* Define a new Hilbert space *H*<sup>∞</sup> = *H* ⊗ ℓ<sup>2</sup>, whose elements **v** are infinite sequences of vectors (υ<sub>1</sub>,υ<sub>2</sub>,...) in *H* with ∑<sub>*i*</sub> ‖υ<sub>*i*</sub>‖<sup>2</sup> < ∞. The inner product is

$$
\langle \mathbf{v}, \mathbf{v}' \rangle\_{H^{\infty}} = \sum\_{i} \langle \upsilon\_{i}, \upsilon\_{i}' \rangle\_{H}.\tag{C.562}
$$

The obvious (diagonal) embedding of *B*(*H*) in *B*(*H*<sup>∞</sup>), whose image is denoted by *B*(*H*)<sup>∞</sup>, restricts to *M* ⊂ *B*(*H*), with image *M*<sup>∞</sup> ⊂ *B*(*H*<sup>∞</sup>). Then the σ-weak topology on *B*(*H*) is the relative weak topology on *B*(*H*)<sup>∞</sup> (i.e., the weak topology on *B*(*H*<sup>∞</sup>) restricted to *B*(*H*)<sup>∞</sup>), so that Theorem C.130 follows from Theorem C.127. □

This brings us to an important refinement of Theorem C.127, called *Kaplansky's Density Theorem* (which should actually be seen as a lemma for numerous results):

Theorem C.131. *Let A* ⊂ *B*(*H*) *be a C\*-algebra (or a* ∗*-algebra). Then the unit ball of A is dense in the unit ball of A*′′ *in the weak, strong, and* σ*-weak topologies.*

The real significance of the σ-weak topology comes from *Sakai's Theorem*:

Theorem C.132. *A C\*-algebra M* ⊂ *B*(*H*) *is a von Neumann algebra iff M is the (Banach) dual of a unique Banach space M*<sub>∗</sub> *(called the* predual *of M).*

We turn to the proof below. For example, by Theorem B.146, the predual of *B*(*H*) is

$$B(H)\_\* \cong B\_1(H).\tag{C.563}$$

In the commutative case, entry 10 in Table B.1 in §B.9 gives

$$L^{\infty}(X,\mu)\_\* \cong L^1(X,\mu);\tag{C.564}$$

the fact that *L*<sup>∞</sup>(*X*,μ), acting on *H* = *L*<sup>2</sup>(*X*,μ) as multiplication operators, is a von Neumann algebra was established in §B.16. In the first example, the σ-weak topology on *B*(*H*) obviously coincides with the weak∗-topology defined by *B*(*H*)<sub>∗</sub>.

In general, there is a canonical embedding *M*<sub>∗</sub> ↪ *M*<sup>∗</sup>, ϕˇ ↦ ϕ, with ϕ(*a*) = *a*(ϕˇ), cf. §B.9. Proposition B.46 then shows that the image of *M*<sub>∗</sub> in *M*<sup>∗</sup> consists precisely of the weak∗-continuous functionals on *M* (recall that the weak∗-topology on *M* is the topology of pointwise convergence, seeing *M* as the dual of *M*<sub>∗</sub>). If we now identify ϕˇ with ϕ, we have the following generalization of the observation just made:

Theorem C.133. *Let M* ⊂ *B*(*H*) *be a von Neumann algebra. The predual M*<sub>∗</sub> *of M (seen as a subspace of M*<sup>∗</sup>*) coincides with the space of* σ*-weakly continuous functionals on M, and hence the* σ*-weak topology on M coincides with the weak*∗ *topology in its role as the dual Banach space of M*<sub>∗</sub>*.*

σ-weakly continuous functionals on a von Neumann algebra *M* are called *normal*.

*Proof.* Identifying ϕˇ with ϕ, we introduce the following spaces:

$$M^{\perp} = \{ \varphi \in B(H)\_\* \mid \varphi(a) = 0 \ \forall a \in M \};$$

$$M^{\perp \perp} = \{ a \in B(H) \mid \varphi(a) = 0 \ \forall \varphi \in M^{\perp} \}.$$

Having proved the theorem for *M* = *B*(*H*), i.e., (C.563), the key is to show that

$$M^{\perp \perp} = M;\tag{C.565}$$

$$M\_\* \cong B(H)\_\*/M^\perp,\tag{C.566}$$

where (C.566) denotes an isometric isomorphism of normed spaces. Since the right-hand side of (C.566) is a Banach space, so is the left-hand side. This yields the first claim. Combining (C.566) with (C.565) and the duality *B*(*H*) = *B*<sub>1</sub>(*H*)<sup>∗</sup>, we have

$$M\_\*^\* \cong (B(H)\_\*/M^\perp)^\* = M^{\perp \perp} = M.$$

This is the second claim. The first equality sign is true, because if *Y* is a closed subspace of a Banach space *X*, then (*X*/*Y*)<sup>∗</sup> = {ϕ ∈ *X*<sup>∗</sup> | ϕ|<sub>*Y*</sub> = 0}.

For the remainder of the theorem, recall that *a*<sub>λ</sub> → *a* σ-weakly in *M* whenever ϕ(*a*<sub>λ</sub> − *a*) → 0 for all ϕ ∈ *B*(*H*)<sub>∗</sub>. By (C.566), this is equivalent to *a*<sub>λ</sub> → *a* in the weak∗-topology, since a possible component of ϕ in *M*<sup>⊥</sup> drops out.

We next prove (C.565). The inclusion *M* ⊂ *M*<sup>⊥⊥</sup> is trivial. For the converse, pick *a* ∉ *M*; since *M* is a von Neumann algebra, it is σ-weakly closed, so its complement *M*<sup>*c*</sup> in *B*(*H*) is σ-weakly open. Hence there are ϕ ∈ *B*(*H*)<sub>∗</sub> and ε > 0 such that the open neighbourhood O(*a*) = {*b* ∈ *B*(*H*) : |ϕ(*a*) − ϕ(*b*)| < ε} of *a* lies entirely in *M*<sup>*c*</sup>. So |ϕ(*a*) − ϕ(*b*)| ≥ ε for all *b* ∈ *M*. This implies ϕ(*b*) = 0 for all *b* ∈ *M* by linearity in *b* (if ϕ(*b*) ≠ 0 for some *b* ∈ *M*, a suitable multiple of *b* would lie in O(*a*)), i.e., ϕ ∈ *M*<sup>⊥</sup>. Hence |ϕ(*a*)| ≥ ε, so *a* ∉ *M*<sup>⊥⊥</sup>, hence *M*<sup>⊥⊥</sup> ⊂ *M*.

For (C.566), first note that *M*<sup>⊥</sup> is a norm-closed subspace of *B*(*H*)<sub>∗</sub> = *B*<sub>1</sub>(*H*), which is a Banach space in the trace-norm (which coincides with the norm inherited from *B*(*H*)<sup>∗</sup>, since the injection *B*<sub>1</sub>(*H*) ↪ *B*(*H*)<sup>∗</sup> is an isometry). Hence the quotient *B*(*H*)<sub>∗</sub>/*M*<sup>⊥</sup> is a Banach space in the canonical norm ‖ϕ˙‖ = inf{‖ϕ + ψ‖ | ψ ∈ *M*<sup>⊥</sup>}, where ϕ˙ is the image of ϕ ∈ *B*(*H*)<sub>∗</sub> under the canonical projection, and the norm is the one in *B*(*H*)<sub>∗</sub>. Let ϕ↾ = ϕ|<sub>*M*</sub> be the restriction of ϕ ∈ *B*(*H*)<sub>∗</sub> to *M*. It is clear that the map ϕ↾ ↦ ϕ˙ is well defined and is a linear bijection from *M*<sub>∗</sub> to *B*(*H*)<sub>∗</sub>/*M*<sup>⊥</sup>. In fact, this map is isometric. First, writing *M*<sub>1</sub> for the unit ball of *M*, one trivially has

$$\|\varphi^{\upharpoonright}\| = \sup\{ |\varphi(a)| \mid a \in M\_1 \} = \inf\_{\psi \in M^{\perp}} \sup \{ |\varphi(a) + \psi(a)| \mid a \in M\_1 \}, \tag{C.567}$$

since ψ(*a*) = 0 for *a* ∈ *M*. But this is clearly majorized by

$$\|\dot{\varphi}\| = \inf\_{\psi \in M^\perp} \sup \{ |\varphi(a) + \psi(a)| \mid a \in B(H)\_1 \}, \tag{C.568}$$

since now the supremum is taken over a larger set. Hence ‖ϕ↾‖ ≤ ‖ϕ˙‖.

Conversely, for any ϕ ∈ *B*(*H*)<sub>∗</sub> with ‖ϕ˙‖ = 1, by Corollary B.41 there exists an *a* ∈ *B*(*H*) with *ǎ* ∈ *M*<sup>⊥⊥</sup>, ϕ(*a*) = 1 and ‖*a*‖ = 1. From (C.565), one then has ‖ϕ↾‖ ≥ |ϕ(*a*)| = 1 = ‖ϕ˙‖. This finishes the proof of Theorem C.133. □

Half of Theorem C.132 evidently follows from Theorem C.133. The converse ('if') implication uses a refinement of the GNS-construction, where the state ω is assumed to be σ-weakly continuous. In that case, using the theory of σ-weakly closed ideals of von Neumann algebras, it can be shown that π<sub>ω</sub>(*M*) coincides with π<sub>ω</sub>(*M*)′′ and hence is a von Neumann algebra. Since normal pure states on a von Neumann algebra may not exist (for example, take *M* = *L*<sup>∞</sup>(0,1)), the 'crazy' Hilbert space *H*<sub>*c*</sub> in the proof of Theorem C.87 must be replaced by the perhaps even crazier direct sum *H*<sub>*ec*</sub> = ⊕<sub>ω∈*S*<sub>*n*</sub>(*M*)</sub> *H*<sub>ω</sub>, where this time the sum is over all *normal* states on *M*. Similarly, in Lemma C.15 one should now have a normal state instead of a pure state. Otherwise, the proof that *M* has a faithful representation as a von Neumann algebra on a Hilbert space essentially follows the proof of Theorem C.87.

Finally, uniqueness of the predual follows from Corollary C.139 below. □

Corollary C.134. *Let M* ⊂ *B*(*H*) *be a von Neumann algebra. Each normal functional* ϕ ∈ *M*<sub>∗</sub> *on M is of the form* ϕ(*a*) = Tr(*ba*)*, for some b* ∈ *B*<sub>1</sub>(*H*)*. In particular,* ϕ *is a normal state iff b is a density operator.*

#### C.21 Projections in von Neumann algebras

General C\*-algebras need not have any nontrivial projections; think of *C*0([0,1]). On the other hand, von Neumann algebras are generated by their projections:

Theorem C.135. *Let* P(*M*) = {*e* ∈ *M* | *e*<sup>2</sup> = *e*<sup>∗</sup> = *e*}*, where M is a von Neumann algebra. Then M is the norm-closure of the linear span of* P(*M*)*, and M* = P(*M*)′′*.*

This is Corollary B.105. In addition, P(*M*) is not just a set.

Proposition C.136. *The set* P(*M*) *of projections in a von Neumann algebra M is a complete lattice under the partial ordering e* ≤ *f iff ef* = *fe* = *e.*

*Proof.* Since *e* ≤ *f* in *M* ⊂ *B*(*H*) iff *eH* ⊆ *fH*, the supremum *e* ∨ *f* is the projection on (the closure of) *eH* + *fH*, whilst the infimum *e* ∧ *f* is the projection on *eH* ∩ *fH*. For arbitrary families (*e*<sub>λ</sub>)<sub>λ∈Λ</sub> of projections, ∨<sub>λ</sub> *e*<sub>λ</sub> equals the projection on the closure of the linear span of all subspaces *H*<sub>λ</sub> ≡ *e*<sub>λ</sub>*H*, whereas ∧<sub>λ</sub> *e*<sub>λ</sub> ≡ *e* is the projection on their intersection. To show that the latter lies in *M* (provided all the *e*<sub>λ</sub> do, of course), note that each unitary *u* ∈ *M*′ satisfies *uH*<sub>λ</sub> = *H*<sub>λ</sub> for all λ, so that also *u*(∩<sub>λ</sub>*H*<sub>λ</sub>) = ∩<sub>λ</sub>*H*<sub>λ</sub>. Hence *eu* = *ue* and so *e* ∈ *M*′′ = *M* (since each element of a von Neumann algebra, here *M*′, is a linear combination of at most four unitaries in it; the proof is similar to Lemma B.145). Finally, by de Morgan's Law we have ∨<sub>λ</sub> *e*<sub>λ</sub> = (∧<sub>λ</sub> *e*<sub>λ</sub><sup>⊥</sup>)<sup>⊥</sup>, with *f*<sup>⊥</sup> = 1 − *f* for any *f* ∈ P(*M*). Hence also ∨<sub>λ</sub> *e*<sub>λ</sub> ∈ *M*. □

This is nice in itself, but it also implies a very important result about maps between von Neumann algebras. Recall that a (purely algebraic) isomorphism between C\*-algebras (seen as ∗-algebras) is automatically isometric and hence norm-continuous; see Theorem C.62. An even better result holds for von Neumann algebras:

Theorem C.137. *A (purely algebraic) isomorphism* ϕ : *M* → *N between von Neumann algebras (seen as* ∗*-algebras) is an isomorphism of Banach spaces as well as a homeomorphism with respect to the* σ*-weak topologies on M and N.*

This theorem only seems to have rather difficult proofs. One of these, which relies on Proposition C.136, is based on the following result. First, we say that a map ϕ : *M* → *N* of von Neumann algebras is *completely additive* if for any family (*e*<sub>λ</sub>) in P(*M*),

$$
\varphi(\vee\_{\lambda} e\_{\lambda}) = \vee\_{\lambda} \varphi(e\_{\lambda}).\tag{C.569}
$$

Lemma C.138. *Let* ϕ : *M* → *N be a homomorphism of von Neumann algebras.*


The proof of claim 2 is easy, as is the implication from σ-weak continuity to complete additivity in claim 1. The converse implication, however, is quite difficult. In any case, Theorem C.137 now follows, so that we may speak of isomorphisms between von Neumann algebras without any ambiguity.

Corollary C.139. *If two von Neumann algebras M and N are algebraically isomorphic, then their preduals M*<sub>∗</sub> *and N*<sub>∗</sub> *are isomorphic as Banach spaces. In particular (take M* = *N), the predual of a von Neumann algebra is unique (up to isometric isomorphism).*

A second proof of Theorem C.137 uses Theorem C.132 (and hence provides no non-circular proof of Corollary C.139), as follows.

*Proof.* Since ϕ is isometric by Corollary C.129 and Theorem C.62, it induces a dual isomorphism (of Banach spaces) ϕ<sup>∗</sup> : *N*<sup>∗</sup> → *M*<sup>∗</sup>, with the property that *M* ≅ (ϕ<sup>∗</sup>(*N*<sub>∗</sub>))<sup>∗</sup> under the map

$$a \mapsto \left(\varphi^\*(\omega) \mapsto \omega(\varphi(a))\right) \quad (a \in M, \omega \in N\_\*). \tag{C.570}$$

Uniqueness of the predual then yields ϕ<sup>∗</sup>(*N*<sub>∗</sub>) ≅ *M*<sub>∗</sub>, which in turn implies that ϕ preserves pointwise convergent nets: if ω′(*a*<sub>λ</sub>) → ω′(*a*) for all ω′ ∈ *M*<sub>∗</sub>, then ω(ϕ(*a*<sub>λ</sub>)) → ω(ϕ(*a*)) for all ω ∈ *N*<sub>∗</sub>. Hence ϕ is σ-weakly continuous. □

Theorem C.137 shows that the notion of isomorphism to be used in the classification of von Neumann algebras *M* is unambiguous. There are two totally different cases of von Neumann algebras (only *M* = C falls in both classes):

• *Abelian* von Neumann algebras, for which *M* ⊆ *M*′;

• *Factors*, i.e., von Neumann algebras with trivial center *M* ∩ *M*′ = ℂ · 1<sub>*H*</sub>.


A factor has no nontrivial decomposition *M* = *M*<sub>1</sub> ⊕ *M*<sub>2</sub>, whereas an abelian von Neumann algebra (except *M* = ℂ) does have such a decomposition (typically even many of them). Using von Neumann's technique of *direct integrals*, which generalizes direct sums (and will not be reviewed here), the classification of general von Neumann algebras may be reduced to these two cases. We start with the first class.

We know that if (*X*,Σ,μ) is some σ-finite Borel space with associated Hilbert space *L*<sup>2</sup>(*X*,μ), then the commutative C\*-algebra *L*<sup>∞</sup>(*X*,μ) is mapped isometrically into *B*(*L*<sup>2</sup>(*X*,μ)) via *f* ↦ *m*<sub>*f*</sub>, see Proposition B.73 and especially (B.240). If we denote the image of this map by *L*<sup>∞</sup>(*X*,μ) also, then *L*<sup>∞</sup>(*X*,μ)′ = *L*<sup>∞</sup>(*X*,μ) by (B.346), so *L*<sup>∞</sup>(*X*,μ) ⊂ *B*(*L*<sup>2</sup>(*X*,μ)) is an abelian von Neumann algebra. In general:

Theorem C.140. *Let M* ⊂ *B*(*H*) *be an abelian von Neumann algebra. Then*

$$M \cong L^{\infty}(X, \mu),\tag{C.571}$$

*for some locally compact space X and probability measure* μ *on X.*

If *H* is separable, this follows from Theorems B.116 (including the remarks after its proof) and B.117 in §B.16. The proof for arbitrary Hilbert space is quite technical and will be omitted, but the idea is to find an abelian C\*-algebra *A* for which *M* = *A*′′, upon which *X* = Σ(*A*), and the measure μ is constructed such that μ(Δ) = 0 iff μ<sub>ψ</sub>(Δ) = 0 for all unit vectors ψ ∈ *H*, with μ<sub>ψ</sub> defined similarly to (B.304). In general, one cannot take *A* = *M*, since Σ(*M*) may not support such measures. Thus we have a complete and satisfactory characterization of abelian von Neumann algebras, including their projections: these are simply the (equivalence classes of) characteristic functions 1<sub>*A*</sub>, where *A* ∈ Σ is a Borel set in *X* (modulo null sets).

The advantage of this approach is that there are often simple models for *X*; we know from the classification of maximal abelian von Neumann algebras on separable Hilbert space in §B.17 that *X* = [0,1] (with Lebesgue measure) and *X* = ℕ (with counting measure) are enough in that case. However, the pair (*X*,μ) lacks intrinsic uniqueness properties. Thus it also makes sense to apply Theorem C.8 to abelian von Neumann algebras, so that *M* ≅ *C*(*X*). Since by Theorem C.135, *M* has plenty of projections, which as elements of *C*(*X*) are realized by characteristic functions 1<sub>*A*</sub>, where *A* ⊂ *X*, the space *X* must have lots of *clopen* (i.e. closed and open) sets.

It can be shown that *X* arises as the Gelfand spectrum of some abelian von Neumann algebra iff it is *hyperstonean*, where we say that a compact Hausdorff *X* is:


This replaces the classification of abelian von Neumann algebras up to isomorphism by the classification of hyperstonean spaces up to homeomorphism, which is hardly an improvement (the only other area of mathematics where such wacky spaces appear is algebraic logic). However, we do obtain a nice relationship between the projection lattice of an abelian von Neumann algebra and its Gelfand spectrum (at this point please recall Theorem D.5 and surrounding text in Appendix D).

Theorem C.141. *The projection lattice* P(*M*) *of a von Neumann algebra M is Boolean iff M is abelian, in which case there is a homeomorphism*

$$
\Sigma(M) \cong \mathcal{S}(\mathcal{P}(M))\tag{C.572}
$$

*between the Gelfand spectrum of M (as a commutative C\*-algebra) and the Stone spectrum of* P(*M*) *(as a Boolean lattice). Hence we have isomorphisms*

$$M \cong C(\mathcal{S}(\mathcal{P}(M)));\tag{C.573}$$

$$\mathcal{O}(\Sigma(M)) \cong \text{Idl}(\mathcal{P}(M)),\tag{C.574}$$

*as (commutative) C\*-algebras and as frames, respectively.*

*Proof.* In the commutative case, the lattice operations in P(*M*) are given by

$$e \wedge f = ef;\tag{\text{C.575}}$$

$$e \lor f = e + f - ef;\tag{C.576}$$

$$e^{\perp} = 1\_M - e,\tag{C.577}$$

as may be verified by embedding *M* ⊂ *B*(*H*) and using the proof of Proposition C.136; eq. (C.577) is true for any *M*. One then finds that P(*M*) is distributive, since

$$e \wedge (f \vee g) = ef + eg - efg = (e \wedge f) \vee (e \wedge g),\tag{C.578}$$

and similarly with ∨ and ∧ swapped. Since P(*M*) is orthomodular for arbitrary von Neumann algebras *M* and is distributive if *M* is abelian, it follows that P(*M*) is Boolean. Conversely, if P(*M*) is Boolean, we may compute

$$(e \wedge (e \wedge f)^\perp)^\perp = (e \wedge (e^\perp \vee f^\perp))^\perp = ((e \wedge e^\perp) \vee (e \wedge f^\perp))^\perp = (e \wedge f^\perp)^\perp = e^\perp \vee f,$$

and since *f* ≤ *g*∨ *f* for any *g*, this implies *f* ≤ (*e*∧(*e*∧ *f*)⊥)⊥. Now *f* ≤ *g*<sup>⊥</sup> implies *f g* = *g f* = 0, so

$$f(e \wedge (e \wedge f)^\perp) = (e \wedge (e \wedge f)^\perp)f = 0. \tag{C.579}$$

If *g* ≤ *e*, then *e*∧*g* = *g*, hence *e*∧*g*<sup>⊥</sup> +*g* = *e*∧(1*<sup>M</sup>* −*g*) +*g* = *e*. So *g* = *e*∧ *f* gives

$$e - (e \wedge f) = e \wedge (e \wedge f)^\perp. \tag{C.580}$$

Using (C.579)–(C.580) finally yields

$$ef = ((e \wedge f) + e - (e \wedge f))f = (e \wedge f)f + (e \wedge (e \wedge f)^\perp)f = e \wedge f,\qquad \text{(C.581)}$$

and since *e*∧ *f* = *f* ∧*e*, we find *e f* = *f e* for any two projections *e*, *f* ∈ P(*M*). Hence *M* is abelian by Theorem C.135.

If we now realize the Gelfand spectrum Σ(*M*) as the multiplicative state space of *M*, and realize the Stone spectrum S(P(*M*)) as the space Pt(P(*M*)) of points of P(*M*), then a homeomorphism Σ(*M*) ≅ Pt(P(*M*)) arises as follows:


Finally, (C.573) and (C.574) follow from (C.572) and the Gelfand isomorphism (Theorem C.8) and eq. (D.35), respectively. See also Theorem C.168 below. □
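
The lattice operations (C.575)–(C.577) and the distributive law (C.578) are easy to test in the smallest abelian example. The sketch below assumes *M* = ℂ<sup>3</sup> (diagonal 3×3 matrices), whose projections are the eight 0/1 tuples; the function names `meet`, `join` and `perp` are our own. It also shows, for two non-commuting projections in *M*<sub>2</sub>(ℂ), that the product *ef* is no longer a projection, so that (C.575) cannot be used outside the commutative case.

```python
from fractions import Fraction
from itertools import product

# Projections in the abelian algebra C^3 (diagonal 3x3 matrices) are 0/1 tuples.
projections = list(product((0, 1), repeat=3))


def meet(e, f):   # (C.575): e ∧ f = ef
    return tuple(a * b for a, b in zip(e, f))


def join(e, f):   # (C.576): e ∨ f = e + f - ef
    return tuple(a + b - a * b for a, b in zip(e, f))


def perp(e):      # (C.577): e^⊥ = 1 - e
    return tuple(1 - a for a in e)


# Both distributive laws, checked over all triples of projections.
distributive = all(
    meet(e, join(f, g)) == join(meet(e, f), meet(e, g))
    and join(e, meet(f, g)) == meet(join(e, f), join(e, g))
    for e, f, g in product(projections, repeat=3)
)
# Each projection has e^⊥ as a complement.
complemented = all(
    meet(e, perp(e)) == (0, 0, 0) and join(e, perp(e)) == (1, 1, 1)
    for e in projections
)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Two non-commuting projections in M_2(C).
half = Fraction(1, 2)
e = [[1, 0], [0, 0]]               # projection onto the first basis vector
f = [[half, half], [half, half]]   # projection onto the line spanned by (1, 1)
ef, fe = matmul(e, f), matmul(f, e)
commute = ef == fe
ef_is_projection = matmul(ef, ef) == ef

print(distributive, complemented, commute, ef_is_projection)
```

In the commutative model both checks succeed, whereas in *M*<sub>2</sub>(ℂ) the two projections neither commute nor multiply to a projection.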

Note that (C.574) is a special case of Corollary C.84, for if *M* is a commutative von Neumann algebra, and *H*(*M*) its frame of hereditary subalgebras, we have

$$H(M) \cong \text{Idl}(\mathcal{P}(M));\tag{C.582}$$

$$J \mapsto \{ e \in \mathcal{P}(M) \mid Me \subseteq J \}, \tag{C.583}$$

whose inverse maps an ideal *I* ⊂ P(*M*) to the norm-closure of ∑<sub>*e*∈*I*</sub> *Me* in *M*. In particular, if *J* is σ-weakly closed, then *J* = *Me* for a unique projection *e* ∈ P(*M*), in which case the right-hand side of (C.583) is just the principal ideal ↓ *e*. To see this special case, we quote a useful result about arbitrary von Neumann algebras:

Proposition C.142. *Let I be a* σ*-weakly closed left (right) ideal in a von Neumann algebra M. Then there is a unique projection e* ∈ P(*M*) *such that I* = *Me (I* = *eM).*

Indeed, *e* is the σ-weak limit of any approximate identity in *I*.

#### C.22 The Murray–von Neumann classification of factors

After this analysis of abelian von Neumann algebras, we now turn to their opposites, viz. factors. The main tool in the classification of factors, introduced by Murray and von Neumann, is a new partial ordering ≲ on the projection lattice P(*M*), which is defined for general von Neumann algebras *M*. Unlike the familiar partial ordering ≤ (see Proposition C.136), ≲ gives a total ordering on P(*M*) if *M* is a factor.

Definition C.143. *Let* P(*M*) *be the projection lattice of a von Neumann algebra M. We say that e* ∼ *f in* P(*M*) *iff there exists u* ∈ *M such that u*<sup>∗</sup>*u* = *e and uu*<sup>∗</sup> = *f. Subsequently, we write e* ≲ *f if there is e*′ ∈ P(*M*) *with e* ∼ *e*′ *and e*′ ≤ *f.*

It is easy to show that ∼ is an equivalence relation. The operator *u* in this definition is unitary from *eH* to *f H*, vanishes on (*eH*)⊥, and has range *f H*. Such an operator is therefore a *partial isometry* (cf. Definition A.27), with initial projection *e* and final projection *f* . It follows that a *necessary* condition for *e* ∼ *f* is that dim(*eH*) = dim(*f H*), but (unless *M* = *B*(*H*)) this is by no means *sufficient*, since the unitary *u* that maps *eH* to *f H is required to lie in M*. For example, if *H* = C ⊕ C, then *e* = diag(1,0) is equivalent to *f* = diag(0,1) with respect to *M* = *M*2(C), but not with respect to *M* = *D*2(C) = C⊕C (i.e., the diagonal 2×2 matrices).
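
The two-dimensional example just given can be verified directly. The following sketch (with our own helper names `matmul` and `adjoint`) exhibits the partial isometry in *M*<sub>2</sub>(ℂ) implementing *e* ∼ *f*, and checks on a few sample diagonal matrices that *u*<sup>∗</sup>*u* = *uu*<sup>∗</sup> for every diagonal *u*, which is why no *u* ∈ *D*<sub>2</sub>(ℂ) can have *e* as initial and *f* ≠ *e* as final projection.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adjoint(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


e = [[1, 0], [0, 0]]
f = [[0, 0], [0, 1]]

# Partial isometry in M_2(C) sending the first basis vector to the second.
u = [[0, 0], [1, 0]]
initial = matmul(adjoint(u), u)   # u*u
final = matmul(u, adjoint(u))     # uu*
equivalent_in_M2 = (initial == e and final == f)

# For diagonal matrices d one always has d*d = dd*, so d*d = e forces dd* = e.
samples = [[[a, 0], [0, b]]
           for a in (0, 1, 1j, 2 - 1j)
           for b in (0, 1, -1j, 1 + 3j)]
diagonal_initial_equals_final = all(
    matmul(adjoint(d), d) == matmul(d, adjoint(d)) for d in samples
)

print(equivalent_in_M2, diagonal_initial_equals_final)
```

The sample entries are small Gaussian integers, so the floating-point arithmetic behind Python's `complex` type is exact here.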

To see how natural this definition is, consider a unitary representation *u* of a group *G* on *H*. If *H*<sub>*i*</sub> ⊂ *H* is stable under *u*(*G*), *i* = 1,2, then the restrictions *u*<sub>*i*</sub> of *u* to *H*<sub>*i*</sub> are unitarily equivalent precisely when *e*<sub>1</sub> ∼ *e*<sub>2</sub> with respect to *M* = *u*(*G*)′ (where *e*<sub>*i*</sub> is the projection onto *H*<sub>*i*</sub>). Furthermore, *u*<sub>1</sub> is unitarily equivalent to a subrepresentation of *u*<sub>2</sub> iff *e*<sub>1</sub> ≲ *e*<sub>2</sub>. More generally, if *N* ⊂ *B*(*H*) is a von Neumann algebra, with stable subspaces *H*<sub>*i*</sub>, *i* = 1,2, then the restrictions *N*<sub>*i*</sub> to *H*<sub>*i*</sub> are unitarily equivalent iff *e*<sub>1</sub> ∼ *e*<sub>2</sub> with respect to *M* = *N*′, et cetera.

One may compare projections in *M* with sets and compare ≤, ∼, and ≲ with ⊆ (inclusion), ≅ (isomorphism), and ↪ (the existence of an injective map), respectively. The Schröder–Bernstein Theorem of set theory (which von Neumann knew well) states that if *X* ↪ *Y* and *Y* ↪ *X*, then *X* ≅ *Y*. Similarly, it can be shown that:

Proposition C.144. *If e* ≲ *f and f* ≲ *e, then e* ∼ *f.*

The special role of factors with respect to the partial ordering ≲ now emerges.

Proposition C.145. *If M is a factor, then* ≲ *is a total ordering (i.e., e* ≲ *f or f* ≲ *e).*

The property of a factor that leads to this result is:

Lemma C.146. *Let M be a factor. For any nonzero projections e*, *f* ∈ P(*M*)*, there are nonzero projections e*′, *f*′ ∈ P(*M*) *such that e*′ ≤ *e, f*′ ≤ *f, and e*′ ∼ *f*′*.*

The first step in the Murray–von Neumann classification of factors is as follows:

Definition C.147. *A projection e in M is called* finite *if f* ∼ *e and f* ≤ *e for some f* ∈ P(*M*) *implies f* = *e, and* minimal *if f* ≤ *e, f* ∈ P(*M*)*, implies f* = *e or f* = 0*. Accordingly, a factor M is called* finite *iff* 1<sub>*M*</sub> *is finite,* semifinite *iff* 1<sub>*M*</sub> *majorizes a nonzero finite projection, and* purely infinite *iff all nonzero projections are infinite.*

For *M* = *B*(*H*), which is evidently a factor, a projection *e* is (in)finite iff dim(*eH*) is (in)finite, so that *B*(*H*) is finite iff *H* is finite-dimensional, and semifinite otherwise. Surprisingly, we will see that finite factors different from *M*<sub>*n*</sub>(ℂ) exist, as do semifinite factors different from *B*(ℓ<sup>2</sup>). Even purely infinite factors (initially defined as what was left out from the previous two cases) turn out to exist (even in physics).

We first rephrase Definition C.147 in terms of generalized traces.

Definition C.148. *A* trace *on a von Neumann algebra M is a map*

$$\text{tr}: M\_{+} \to [0, \infty] \tag{C.584}$$

*satisfying*

$$\operatorname{tr}(\lambda \cdot a + b) = \lambda \cdot \operatorname{tr}(a) + \operatorname{tr}(b) \quad (a, b \in M\_+, \lambda \ge 0); \tag{C.585}$$

$$\text{tr}(aa^\*) = \text{tr}(a^\*a) \quad (a \in M). \tag{C.586}$$

*Equivalently,* tr(*uau*<sup>∗</sup>) = tr(*a*) *for all a* ∈ *M*<sub>+</sub> *and unitary u* ∈ *M (so that uau*<sup>∗</sup> ∈ *M*<sub>+</sub>*).*

*A trace is* finite *if* tr(*a*) < ∞ *for all a* ∈ *M*<sub>+</sub>*,* semifinite *if for any a* ∈ *M*<sub>+</sub> *there is a nonzero b* ≤ *a in M*<sub>+</sub> *for which* tr(*b*) < ∞*, and* infinite *otherwise.*
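
As a quick sanity check of Definition C.148 in the simplest case, the sketch below verifies (C.586) and the equivalent unitary invariance for the ordinary matrix trace on *M*<sub>2</sub>(ℂ). The matrices `a` and `u` are arbitrary choices with Gaussian-integer entries (so that the arithmetic is exact), and the helper names are our own.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def adjoint(a):
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


a = [[1 + 2j, 3], [0, 4 - 1j]]   # arbitrary element of M_2(C)
u = [[0, 1j], [1j, 0]]           # a unitary in M_2(C)
pos = matmul(adjoint(a), a)      # a*a is a positive element

u_is_unitary = matmul(u, adjoint(u)) == [[1, 0], [0, 1]]
tracial = trace(matmul(a, adjoint(a))) == trace(pos)                       # (C.586)
invariant = trace(matmul(matmul(u, pos), adjoint(u))) == trace(pos)

print(u_is_unitary, tracial, invariant, trace(pos))
```

Since every positive matrix has finite trace, this trace is finite in the sense of the definition; the interesting phenomena (semifinite and non-standard finite traces) only arise in infinite dimension.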

The usual trace Tr is a trace tr on *B*(*H*) in this new sense, which is finite iff dim(*H*) is finite. As we will see, other factors admit other traces. The following result could have been used as a definition of (semi)finite and purely infinite factors.

Proposition C.149. *A factor is (semi)finite iff it admits a faithful* σ*-weakly continuous (semi)finite trace, and is purely infinite otherwise.*

It can be shown that a finite trace on a factor is automatically σ-weakly continuous, so a factor is finite iff it admits a faithful finite trace. Hence we recover the fact that *B*(*H*) is finite iff dim(*H*) < ∞, and semifinite otherwise. For a completely different kind of trace, defined on factors remote from *B*(*H*), we turn to discrete groups *G*. For these, Haar measure is simply the counting measure, so that *L*<sup>2</sup>(*G*) = ℓ<sup>2</sup>(*G*), and convolution (C.504) and involution (C.505), initially defined on *C*<sub>*c*</sub>(*G*), are given by

$$f \ast g(x) = \sum\_{y \in G} f(x y^{-1}) g(y); \quad f^\*(x) = \overline{f(x^{-1})}.\tag{C.587}$$

According to Definition C.119, the reduced group C\*-algebra *C*<sup>∗</sup><sub>*r*</sub>(*G*) is the norm-closure of the ∗-algebra in *B*(*L*<sup>2</sup>(*G*)) containing all operators

$$
\mu\_L^f(f)\Psi(\mathbf{x}) = \sum\_{\mathbf{y}\in G} f(\mathbf{y})\Psi(\mathbf{y}^{-1}\mathbf{x}) \ (f \in C\_c(G)).\tag{C.588}
$$

Thus *C*<sup>∗</sup><sub>*r*</sub>(*G*) is realized as a concrete C\*-algebra of operators on ℓ<sup>2</sup>(*G*), so that, following von Neumann himself, we may form the *group von Neumann algebra*

$$W^\*(G) = C\_r^\*(G)''.\tag{C.589}$$

Theorem C.150. *The group von Neumann algebra W*∗(*G*) *of a countable group is a factor iff all nontrivial conjugacy classes in G (i.e., all except* {*e*}*) are infinite.*

In that case, we say that *G* has (or "is") *icc*, i.e., has *infinite conjugacy classes*.

*Proof.* From (C.587), for *f* ∈ *C*<sub>*c*</sub>(*G*) we have *f* ∗ *g* = *g* ∗ *f* for each *g* ∈ *C*<sub>*c*</sub>(*G*) iff *f*(*yxy*<sup>−1</sup>) = *f*(*x*) for all *x*, *y* ∈ *G*. In other words, *f* lies in the center of *C*<sub>*c*</sub>(*G*) ⊂ *W*<sup>∗</sup>(*G*) iff *f* is constant on each conjugacy class of *G*. If *G* is icc, this implies that *f* can have support only at *e* (a finitely supported function that is constant on an infinite class must vanish there), i.e., *f* = λ · δ<sub>*e*</sub>, λ ∈ ℂ. Noting that δ<sub>*e*</sub> is the unit in the algebra *C*<sub>*c*</sub>(*G*), this proves the claim, except for the fact that we should extend this argument from *C*<sub>*c*</sub>(*G*) to *W*<sup>∗</sup>(*G*), which by Theorem C.127 is its strong closure.

The key to this extension is the fact that one has *f* ∗ *g*(*x*) = ⟨*R*<sub>*x*<sup>−1</sup></sub> *f*<sup>∗</sup>, *g*⟩ for *f* and *g* in *C*<sub>*c*</sub>(*G*), where *R*<sub>*x*</sub> *f*(*y*) = *f*(*yx*) and the inner product is in ℓ<sup>2</sup>(*G*). Hence

$$|f \ast g(x)| = |\langle R\_{x^{-1}} f^\*, g \rangle| \le \|R\_{x^{-1}} f^\*\|\_{2} \|g\|\_{2} = \|f\|\_{2} \|g\|\_{2},\tag{C.590}$$

so that the sum in (C.587) is actually defined and converges (absolutely) for *f*,*g* ∈ ℓ<sup>2</sup>(*G*). This also shows that if *f*<sub>*n*</sub> strongly converges to some *a* ∈ *B*(ℓ<sup>2</sup>(*G*)), i.e., ‖*f*<sub>*n*</sub> ∗ ψ − *a*ψ‖ → 0 for each ψ ∈ ℓ<sup>2</sup>(*G*), then *a*ψ = *f* ∗ ψ, where *f* ∈ ℓ<sup>2</sup>(*G*) is the limit of (*f*<sub>*n*</sub>) seen as a sequence in ℓ<sup>2</sup>(*G*). Hence *W*<sup>∗</sup>(*G*) ⊂ ℓ<sup>2</sup>(*G*), and the above computation of the center of *W*<sup>∗</sup>(*G*) remains valid: we have *f* ∈ *W*<sup>∗</sup>(*G*) ∩ *W*<sup>∗</sup>(*G*)′ iff *f* is constant on each conjugacy class of *G*, and a square-summable *f* must vanish on every infinite class. Conversely, any *f* that is constant on some finite conjugacy class (different from {*e*}) and zero elsewhere is central without being a multiple of the unit. □

Whether or not *G* has icc, we have a map tr : *W*∗(*G*) → C, defined by

$$\text{tr}(f) = f(e),\tag{C.591}$$

which satisfies (C.585)–(C.586) and hence defines a finite trace on *W*<sup>∗</sup>(*G*). Also,

$$\text{tr}(f) = \langle \delta\_{e}, f \ast \delta\_{e} \rangle,\tag{C.592}$$

so this trace is σ-weakly continuous.
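
The formulas (C.587) and (C.591) can be tried out on a finite group, where the group algebra is finite-dimensional and no closures are needed. The sketch below uses *G* = S<sub>3</sub>, which is of course *not* icc; it therefore illustrates the trace property and the "converse" part of the proof of Theorem C.150 (the indicator function of a finite nontrivial conjugacy class is central), not the factor property. All function names are our own.

```python
from itertools import permutations

G = list(permutations(range(3)))   # the symmetric group S_3; p[i] is the image of i
E = (0, 1, 2)                      # identity element


def mul(p, q):                     # composition: first q, then p
    return tuple(p[q[i]] for i in range(3))


def inv(p):
    r = [0] * 3
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)


def conv(f, g):                    # convolution, first formula in (C.587)
    return {x: sum(f[mul(x, inv(y))] * g[y] for y in G) for x in G}


def star(f):                       # involution, second formula in (C.587)
    return {x: complex(f[inv(x)]).conjugate() for x in G}


def tr(f):                         # (C.591)
    return f[E]


def delta(x):
    return {y: int(y == x) for y in G}


f = {x: i + 1 for i, x in enumerate(G)}
g = {x: i + (i - 2) * 1j for i, x in enumerate(G)}

trace_property = tr(conv(f, g)) == tr(conv(g, f))
positivity = tr(conv(star(f), f)) == sum(abs(f[y]) ** 2 for y in G)
unit = conv(delta(E), f) == f and conv(f, delta(E)) == f

# Indicator function of the conjugacy class of a transposition.
t = (1, 0, 2)
cls = {mul(mul(h, t), inv(h)) for h in G}
c = {x: int(x in cls) for x in G}
class_function_central = all(conv(c, delta(x)) == conv(delta(x), c) for x in G)
single_delta_central = all(conv(delta(t), delta(x)) == conv(delta(x), delta(t)) for x in G)

print(trace_property, positivity, unit, len(cls), class_function_central, single_delta_central)
```

The class indicator commutes with everything although it is not a multiple of δ<sub>*e*</sub>, so the group algebra of S<sub>3</sub> has a nontrivial center, whereas a single δ<sub>*t*</sub> is not central.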

Corollary C.151. *If G has icc, W*∗(*G*) *is a finite factor non-isomorphic to any B*(*H*)*.*

Since *G* must obviously be infinite for it to have icc, *W*<sup>∗</sup>(*G*) is infinite-dimensional, and hence *W*<sup>∗</sup>(*G*) ≇ *M*<sub>*n*</sub>(ℂ) for any *n* ∈ ℕ. Furthermore, if *H* is infinite-dimensional, then *B*(*H*) does not admit any σ-weakly continuous finite faithful trace:

Proposition C.152. *Any two nonzero* σ*-weakly continuous (semi)finite traces* tr, tr′ *on a (semi)finite factor are proportional, i.e.,* tr′ = λ tr *for some* λ ∈ ℝ<sub>+</sub>*.*

See also Theorem C.155 below. Consequently, since Tr and tr are both σ-weakly continuous, and Tr(1<sub>*H*</sub>) = ∞ on *B*(*H*), whereas tr(1<sub>ℓ<sup>2</sup>(*G*)</sub>) = 1 on *W*<sup>∗</sup>(*G*), we conclude that *W*<sup>∗</sup>(*G*) ≇ *B*(*H*) for any *H*. Note also that (still assuming that *G* has icc), all projections in *W*<sup>∗</sup>(*G*) are finite, and *W*<sup>∗</sup>(*G*) has no minimal projections (see below), whereas *B*(*H*) has both finite and infinite projections, and also has plenty of minimal projections, namely those with one-dimensional range.

Do such "icc" groups actually exist? In fact, there are infinitely many of them: each free group on *n* > 1 generators is an example. Another example is the group S<sub>∞</sub> of finite permutations of ℕ. A *j*-*cycle* is a cyclic permutation of *j* objects (called the *carrier* of the cycle in question). Any element *p* of S<sub>∞</sub> = ∪<sub>*n*</sub>S<sub>*n*</sub> is a finite product of *j*-cycles with disjoint carriers, and for each *j* ∈ ℕ, the number of *j*-cycles in such a decomposition of *p* is uniquely determined by *p*. Two permutations in S<sub>∞</sub>, then, are conjugate iff they have the same number of *j*-cycles, for all *j* ∈ ℕ. Hence the conjugacy class of any *p* ≠ *e* is infinite, since the carriers of its cycles may be moved anywhere in ℕ.
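
The cycle-type criterion for conjugacy can be tested by brute force in a finite symmetric group. The sketch below is restricted to S<sub>4</sub> for the exhaustive check, and only counts cycles of length *j* ≥ 2 (fixed points are ignored, as is natural inside S<sub>∞</sub>); the function names are our own. It also counts the transpositions in S<sub>*n*</sub> for *n* = 2,...,6, to illustrate how the conjugacy class of a transposition grows without bound along S<sub>∞</sub> = ∪<sub>*n*</sub>S<sub>*n*</sub>.

```python
from collections import Counter
from itertools import permutations


def mul(p, q):                      # composition: first q, then p
    return tuple(p[q[i]] for i in range(len(q)))


def inv(p):
    r = [0] * len(p)
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)


def cycle_type(p):
    """Counter {j: number of j-cycles of p}, for cycle lengths j >= 2 only."""
    seen, counts = set(), Counter()
    for start in range(len(p)):
        length, i = 0, start
        while i not in seen:        # follow the cycle through `start`
            seen.add(i)
            i = p[i]
            length += 1
        if length > 1:
            counts[length] += 1
    return counts


S4 = list(permutations(range(4)))


def conjugate(p, q):
    return any(mul(mul(h, p), inv(h)) == q for h in S4)


# Exhaustive check in S_4: conjugate iff same cycle type.
criterion_holds = all(
    conjugate(p, q) == (cycle_type(p) == cycle_type(q)) for p in S4 for q in S4
)

# Size of the conjugacy class of a transposition in S_n, for n = 2, ..., 6.
transposition = Counter({2: 1})
class_sizes = [
    sum(1 for p in permutations(range(n)) if cycle_type(p) == transposition)
    for n in range(2, 7)
]

print(criterion_holds, class_sizes)  # prints: True [1, 3, 6, 10, 15]
```

The class sizes are the binomial coefficients *n*(*n* − 1)/2, which diverge as *n* → ∞; the same happens for every nontrivial cycle type.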

We present the *type classification* of factors due to Murray and von Neumann.

Definition C.153. *A factor M is said to be of type:*

	- *Type* I*n (n* ∈ N*) if M is finite and* 1*M is the sum of n minimal projections.*
	- *Type* I∞ *if M is type* I *and semifinite but not finite.*
	- *Type* II1 *if M is type* II *and finite.*
	- *Type* II∞ *if M is type* II *and semifinite but not finite.*

A nice understanding of these types arises from a construction similar to the trace.

Definition C.154. *A* dimension function *on a von Neumann algebra M is a function d* : P(*M*) → [0,∞] *such that d*(*e*) < ∞ *iff e is finite, d*(*e*+ *f*) = *d*(*e*) +*d*(*f*) *if e f* = 0 *(i.e., eH* ⊥ *f H), and d*(*e*) = *d*(*f*) *if e* ∼ *f .*

Paraphrasing results in Murray and von Neumann's great series of papers, we have:

Theorem C.155. *For any von Neumann algebra M, the restriction of a trace to* P(*M*) *is a dimension function. If M* ⊂ *B*(*H*) *is a factor, with H separable, then:*

	- {0,1,2,...,*n*}*, for some n* ∈ N *(type* I*n).*
	- N ∪ {∞} *(type* I∞*).*
	- [0,1] *(type* II1*).*
	- [0,∞] *(type* II∞*).*
	- {0,∞} *(type* III*).*

We may now strengthen the few examples we had so far in the following way:

Corollary C.156. • *If* dim(*H*) = *n, then B*(*H*) *is a factor of type* I*n.*


#### C.23 Classification of hyperfinite factors

Throughout this section we assume that our von Neumann algebras *M* ⊆ *B*(*H*) act on a *separable* Hilbert space *H*. We say that *M* is *hyperfinite* if *M* = (∪*nMn*)′′, for a family of finite-dimensional von Neumann subalgebras *Mn* ⊂ *M* with *Mn* ⊂ *Mn*+1. For example, *M* = *B*(*H*) is hyperfinite. If *G* is a group such that *G* = ∪*nGn* for finite subgroups *Gn* ⊂ *Gn*+1, as is the case e.g. for the (icc) group S∞ = ∪*n*S*n* of finite permutations of N, then the associated von Neumann algebra *W*∗(*G*) is hyperfinite.

Murray and von Neumann partly classified hyperfinite factors, as follows:

Theorem C.157. *Let M* ⊂ *B*(*H*) *be a hyperfinite factor.*


The unique hyperfinite II1-factor *W*∗(S∞), which turns out to be isomorphic to *W*∗(*G*) for *any* countable amenable icc group *G*, is usually called *R*. Similarly (and trivially), *B*(ℓ²) ≅ *B*(*H*) for any separable infinite-dimensional Hilbert space *H*.

An example of a hyperfinite II∞ factor is also quickly found, viz. *M* = *R*⊗*B*(ℓ²), but Murray and von Neumann were unable to *classify* such factors. About type III, they knew almost nothing, except for a couple of examples from ergodic theory. Between 1971 and 1975, Connes made two decisive steps forward in this area:

	- There is a unique hyperfinite II∞ factor, namely *R*⊗*B*(ℓ²).
	- There is a unique hyperfinite III1 factor (Connes and Haagerup).
	- There is a unique hyperfinite IIIλ factor for each λ ∈ (0,1).
	- There is an infinite family of hyperfinite III0 factors, completely classified by the so-called *flow of weights* introduced by Connes and Takesaki.

We list III1 separately from IIIλ for λ ∈ (0,1) for two reasons: first, "hyperfinite III1" turns out to be *the* factor occurring in quantum field theory and quantum statistical mechanics of infinite systems (whereas IIIλ for λ ∈ (0,1) seems artificial), and second, the proof of uniqueness of the hyperfinite III1 factor is much more difficult.

An important technical tool of Connes was his own profound discovery that a von Neumann algebra *M* ⊂ *B*(*H*) is hyperfinite iff it is *injective*, in that there exists a *conditional expectation E* : *B*(*H*) → *M*, that is, a linear map *E* : *B*(*H*) → *B*(*H*) such that *E*(*a*) ∈ *M* and *E*(*a*∗) = *E*(*a*)∗ for all *a* ∈ *B*(*H*), *E*² = *E*, and ‖*E*‖ = 1. It follows that *E*(*abc*) = *aE*(*b*)*c* for all *a*, *c* ∈ *M*, *b* ∈ *B*(*H*). The equivalence of hyperfiniteness and injectivity implies, for example, that if *M* = *N* ⊗ *B*(ℓ²) is hyperfinite, then so is *N*. Another crucial tool was the *Tomita–Takesaki theory*, which we briefly summarize (this theory was paralleled by simultaneous and independent work by the German-Dutch mathematical physics trio Haag–Hugenholtz–Winnink, which among other things allowed a direct definition of thermal equilibrium states in infinite volume; see §9.6).

Definition C.158. *A von Neumann algebra M* ⊂ *B*(*H*) *is in* standard form *if H contains a unit vector* Ω *that is cyclic and separating for M.*

Recall that Ω is *separating* for *M* if *a*Ω ≠ 0 for all nonzero *a* ∈ *M*, and that Ω is *cyclic* for *M* iff it is separating for the commutant *M*′. Any von Neumann algebra can be brought into standard form. For separable *H*, this follows by picking an injective density operator ρ on *H*, whose associated state ω(*a*) = Tr(ρ*a*) is faithful (in that ω(*a*∗*a*) > 0 for all nonzero *a* ∈ *M*), and passing to the GNS-representation πω(*M*) ≅ *M*. For example, *M* = *B*(*H*) acting on *H* is not in standard form, but acting on *B*2(*H*) by *left* multiplication it is, where *B*2(*H*) is the Hilbert space of Hilbert–Schmidt operators on *H* with the familiar inner product ⟨*a*,*b*⟩ = Tr(*a*∗*b*). If ρ ∈ *B*1(*H*) is an injective density operator on *H*, then Ω = √ρ ∈ *B*2(*H*) brings *M* into standard form. In this case, *M*′ ≅ *B*(*H*)*op* (where the suffix "*op*" means that multiplication is done in the opposite order, i.e. *ab* in *B*(*H*)*op* is equal to *ba* in *B*(*H*)), which acts on *B*2(*H*) by *right* multiplication. If *H* = C<sup>*n*</sup>, one simply has *B*(*H*) = *B*2(*H*) = *Mn*(C).

Let *M* ⊂ *B*(*H*) be in standard form. Tomita introduced the (unbounded) *antilinear* operator *S* as the closure of the operator *S*0 having domain *D*(*S*0) = *M*Ω and action

$$S\_0(a\Omega) = a^\*\Omega.\tag{C.593}$$

This domain is dense because Ω is cyclic for *M*, the action is well defined since Ω is separating for *M*, and *S*0 indeed turns out to be closable, with closure *S*. Any closed operator *a* has a polar decomposition *a* = *v*|*a*|, where *v* is a partial isometry and |*a*| = √(*a*∗*a*). We write the polar decomposition of the above operator *S* as

$$S = J \Delta^{1/2},\tag{C.594}$$

where *J* is an *antilinear* partial isometry, and Δ = *S*∗*S*. Since *S* is injective with dense range, *J* is actually anti-unitary, satisfying *J*∗ = *J* and *J*² = 1. Furthermore, Δ ≥ 0, so that Δ<sup>*it*</sup> is well defined for *t* ∈ R: writing Δ = exp(*h*) for the self-adjoint operator *h* = logΔ, we have Δ<sup>*it*</sup> = exp(*ith*). We then have the *Tomita–Takesaki Theorem*:

Theorem C.159. *Let M* ⊂ *B*(*H*) *be a von Neumann algebra in standard form. Then:*


The image of R in Aut(*M*) by α is called the *modular group* of *M* associated with the cyclic and separating vector Ω (or rather, with the associated σ-weakly continuous faithful state ω). Simple examples show that the modular group explicitly depends on the vector Ω. In his thesis, Connes analyzed the dependence of α on Ω, and showed it was innocent. To state the simplest version of his result, assume that *H* contains two different vectors Ω1 and Ω2, each of which is cyclic and separating for *M*. We write $\alpha\_t^{(i)}$ for the modular group derived from Ω*i*, *i* = 1,2.

Theorem C.160. *There is a family Ut of unitary operators in M (t* ∈ R*), such that*

$$\alpha\_{t}^{(1)}(a) = U\_{t} \alpha\_{t}^{(2)}(a) U\_{t}^{\*};\tag{C.595}$$

$$U\_{t+s} = U\_s \alpha\_s^{(2)}(U\_t). \tag{C.596}$$

*Proof.* The proof of this theorem is Connes's favourite (as he declared in an interview), so we present it in some detail. It is based on the following idea. Extend *M* to Mat2(*M*), i.e., the von Neumann algebra of 2×2 matrices with entries in *M*, and let Mat2(*M*) act on *H*<sup>2</sup> = *H* ⊕ *H* in the obvious way. Subsequently, let Mat2(*M*) act on *H*<sup>4</sup> = *H* ⊕ *H* ⊕ *H* ⊕ *H* = *H*<sup>2</sup> ⊕ *H*<sup>2</sup> by simply doubling the action on *H*<sup>2</sup>. The vector Ω = (Ω1,0,0,Ω2) ∈ *H*<sup>4</sup> is then cyclic and separating for Mat2(*M*), with corresponding modular operator Δ = diag(Δ1,Δ4,Δ3,Δ2). Here Δ1 and Δ2 are just the operators on *H* originally defined by Ω1 and Ω2, respectively, and Δ3 and Δ4 are certain operators on *H*. Denoting elements of Mat2(*M*) by

$$\mathbf{a} = \begin{pmatrix} a\_{11} & a\_{12} \\ a\_{21} & a\_{22} \end{pmatrix},\tag{C.597}$$

we then have

$$
\Delta^{it} \begin{pmatrix} \mathbf{a} & 0 \\ 0 & \mathbf{a} \end{pmatrix} \Delta^{-it} = \begin{pmatrix} \tilde{\alpha}\_{t}^{(1)}(\mathbf{a}) & 0 \\ 0 & \tilde{\alpha}\_{t}^{(2)}(\mathbf{a}) \end{pmatrix}; \tag{C.598}
$$

$$
\tilde{\alpha}\_{t}^{(1)}(\mathbf{a}) = \begin{pmatrix}
\Delta\_1^{it} a\_{11} \Delta\_1^{-it} & \Delta\_1^{it} a\_{12} \Delta\_4^{-it} \\
\Delta\_4^{it} a\_{21} \Delta\_1^{-it} & \Delta\_4^{it} a\_{22} \Delta\_4^{-it}
\end{pmatrix};\tag{C.599}
$$

$$\tilde{\alpha}\_{t}^{(2)}(\mathbf{a}) = \begin{pmatrix} \Delta\_3^{it} a\_{11} \Delta\_3^{-it} & \Delta\_3^{it} a\_{12} \Delta\_2^{-it} \\ \Delta\_2^{it} a\_{21} \Delta\_3^{-it} & \Delta\_2^{it} a\_{22} \Delta\_2^{-it} \end{pmatrix}. \tag{C.600}$$

But by Theorem C.159, the right-hand side of (C.598) must be of the form diag(b,b) for some b ∈ Mat2(*M*), so that $\tilde{\alpha}\_t^{(1)}(\mathbf{a}) = \tilde{\alpha}\_t^{(2)}(\mathbf{a})$. This allows us to replace $\Delta\_4^{it} a\_{22} \Delta\_4^{-it}$ in (C.599) by $\Delta\_2^{it} a\_{22} \Delta\_2^{-it}$. We then put $U\_t = \Delta\_1^{it} \Delta\_4^{-it}$, which, unlike either $\Delta\_1^{it}$ or $\Delta\_4^{-it}$, lies in *M*, because each entry in $\tilde{\alpha}\_t^{(1)}(\mathbf{a})$ must lie in *M* if all the $a\_{ij}$ do, and here we have taken $a\_{12} = 1$. All claims of the theorem may then be verified using elementary computations with 2×2 matrices. For example, combining

$$
\begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & a \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \tag{C.601}
$$

with the property $\tilde{\alpha}\_t^{(1)}(\mathbf{a}\mathbf{b}) = \tilde{\alpha}\_t^{(1)}(\mathbf{a})\,\tilde{\alpha}\_t^{(1)}(\mathbf{b})$, we recover (C.595). Using the identity

$$
\begin{pmatrix} 0 & U\_t \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & U\_t \end{pmatrix}, \tag{C.602}
$$

evolving each side to time *s* yields (C.596). A proof from *The Book*! □
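For a matrix algebra, both identities of Theorem C.160 can be verified numerically. The sketch below rests on standard finite-dimensional formulas that are assumptions here, since the text does not derive them: for the faithful state Tr(ρ ·) on *M*2(C), with ρ an invertible density matrix, the modular group is *a* ↦ ρ<sup>*it*</sup>*a*ρ<sup>−*it*</sup>, and the unitaries are *Ut* = ρ1<sup>*it*</sup>ρ2<sup>−*it*</sup>. The two density matrices are chosen not to commute, and all function names are illustrative.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def rho_it(u, p, t):
    """rho^{it} = u diag(p_k^{it}) u^* for rho = u diag(p) u^* with p_k > 0."""
    d = [[cmath.exp(1j * t * math.log(p[0])), 0], [0, cmath.exp(1j * t * math.log(p[1]))]]
    return matmul(matmul(u, d), dagger(u))

def modular(u, p, t, a):
    """Assumed modular group of the state Tr(rho .): a -> rho^{it} a rho^{-it}."""
    return matmul(matmul(rho_it(u, p, t), a), rho_it(u, p, -t))

# Two non-commuting faithful density matrices rho_k = u_k diag(p_k) u_k^*.
U1, P1 = rotation(0.3), (0.7, 0.3)
U2, P2 = rotation(1.1), (0.6, 0.4)

def cocycle(t):
    """Assumed form of the unitaries: U_t = rho_1^{it} rho_2^{-it}."""
    return matmul(rho_it(U1, P1, t), rho_it(U2, P2, -t))

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

a = [[1 + 2j, 0.5], [-1j, 3]]      # arbitrary element of M_2(C)
t, s = 0.8, -1.7

Ut = cocycle(t)
# (C.595): alpha^(1)_t(a) = U_t alpha^(2)_t(a) U_t^*
holds_595 = close(modular(U1, P1, t, a),
                  matmul(matmul(Ut, modular(U2, P2, t, a)), dagger(Ut)))
# (C.596): U_{t+s} = U_s alpha^(2)_s(U_t)
holds_596 = close(cocycle(t + s),
                  matmul(cocycle(s), modular(U2, P2, s, Ut)))
```

Both flags evaluate to `True` up to floating-point tolerance, which illustrates that the two modular groups differ only by the inner perturbation *Ut*.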

We say that an automorphism γ : *M* → *M* is *inner* if there exists a unitary element *u* ∈ *M* such that γ(*a*) = *uau*<sup>∗</sup> for all *a* ∈ *M*. The inner automorphisms of *M* form a normal subgroup Inn(*M*) of the group Aut(*M*) of all automorphisms, with quotient Out(*M*) = Aut(*M*)/Inn(*M*). Theorem C.160 shows that the image π(α(R)) of the modular group in Out(*M*) under the canonical projection π : Aut(*M*) → Out(*M*) is independent of Ω, and invariants of this image will be invariants of *M* itself.

Such invariants are trivial if *M* is a factor of type I or II, since in that case π(α(R)) = {*e*}; to see this in the finite case (i.e., type I*n* or type II1), take a finite trace τ on *M* and check that Δ = 1 for πτ(*M*) ≅ *M*. For the semifinite but not finite case (i.e., type I∞ or type II∞), a slight generalization of the GNS-construction leads to the same conclusion. To find invariants for type III factors, we therefore need to extract information from the modular group *t* ↦ α*t* up to inner automorphisms.
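The finite case can be made concrete for *M* = *M*2(C) acting by left multiplication on the Hilbert–Schmidt operators, with Ω = √ρ. The sketch below assumes the standard formulas Δ<sup>1/2</sup>*x* = ρ<sup>1/2</sup>*x*ρ<sup>−1/2</sup> and *Jx* = *x*∗, which are not derived in the text, and takes ρ diagonal for simplicity. It checks the polar decomposition (C.594) on vectors *a*Ω and confirms that Δ is the identity when ρ is the normalized trace.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def rho_power(p, r):
    """rho^r for the diagonal density matrix rho = diag(p[0], p[1]) with p[k] > 0."""
    return [[p[0] ** r, 0.0], [0.0, p[1] ** r]]

def delta_half(x, p):
    """Assumed formula: Delta^{1/2} x = rho^{1/2} x rho^{-1/2}."""
    return matmul(matmul(rho_power(p, 0.5), x), rho_power(p, -0.5))

def delta(x, p):
    """Assumed formula: Delta x = rho x rho^{-1}."""
    return matmul(matmul(rho_power(p, 1), x), rho_power(p, -1))

def J(x):
    """Assumed formula: the modular conjugation is the adjoint, J x = x*."""
    return dagger(x)

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

a = [[1 + 2j, 0.5], [-1j, 3.0]]            # arbitrary element of M_2(C)

p = (0.7, 0.3)                              # faithful but non-tracial state
omega = rho_power(p, 0.5)                   # Omega = sqrt(rho)
s_of_a_omega = matmul(dagger(a), omega)     # S(a Omega) = a* Omega, as in (C.593)
polar_ok = close(J(delta_half(matmul(a, omega), p)), s_of_a_omega)

tracial = (0.5, 0.5)                        # rho = 1/2: the normalized trace
delta_trivial = close(delta(a, tracial), a)
delta_nontrivial = not close(delta(a, p), a)
```

In the tracial case Δ acts trivially, so the modular group is trivial as well, whereas for the non-tracial ρ it is not.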

Definition C.161. *Let* α : R → Aut(*M*) *be a continuous action of* R *on M, defining:*

$$M^{\alpha} = \{ x \in M \mid \alpha\_{t}(x) = x \,\forall t \in \mathbb{R} \};\tag{C.603}$$

$$M\_e = \{ x \in M \mid x e = e x = x \} \text{ (} e \in \mathcal{P}(M^\alpha) \text{)}. \tag{C.604}$$


The Connes spectrum Γ(α) is a closed subgroup of $\mathbb{R}\_\*^+$, which has the great virtue that if π(α(R)) = π(α′(R)), then Γ(α) = Γ(α′). So if α is the modular group of *M* with respect to some state ω, then Γ(α) is independent of ω, and may therefore be called Γ(*M*). This invariant can also be defined through the usual spectrum of self-adjoint operators on Hilbert space. To this effect, Connes defined and proved

$$S(M) = \bigcap\_{\omega} \sigma(\Delta\_{\omega}) = \bigcap\_{0 \neq e \in \mathcal{P}(M^{\alpha})} \sigma(\Delta\_{\varphi\_{e}}),\tag{C.605}$$

where the first intersection is over all σ-weakly continuous faithful states ω on *M*, whereas in the second one takes *a fixed* σ-weakly continuous faithful state ϕ on *M*, and restricts it to ϕ*e* = ϕ|*Me*. Furthermore, Δω denotes the operator Δ on *H*ω, defined with respect to the usual cyclic unit vector Ωω of the GNS-construction, etc. If *M* is a type I or II factor, one has *S*(*M*) = {1}, whereas 0 ∈ *S*(*M*) iff *M* is type III.

Connes showed that Γ(*M*) = *S*(*M*) ∩ $\mathbb{R}\_\*^+$, and the known classification of closed subgroups of $\mathbb{R}\_\*^+$ yields his path-breaking parametrization of type III factors:

Definition C.162. *Let M be a type* III *factor. Then M is said to be of type:*


The unique hyperfinite III1 factor appears throughout algebraic quantum field theory, where it plays the role of a universal algebra of localized observables.

#### C.24 Other special classes of C\*-algebras

There are many other special classes of C\*-algebras apart from von Neumann algebras and commutative C\*-algebras. The classes we consider here contain both commutative and non-commutative C\*-algebras; in the spirit of (exact) Bohrification, whenever possible we try to characterize them through properties of their (maximal) commutative subalgebras. Like the von Neumann algebras already studied, each class in this section is sandwiched between the *finite-dimensional* C\*-algebras, i.e. those C\*-algebras that are finite-dimensional as a vector space (which it contains), and the *real rank zero* C\*-algebras defined below (in which it is contained).

Finite-dimensional C\*-algebras admit a straightforward classification:

Theorem C.163. *Every finite-dimensional C\*-algebra A is isomorphic to a direct sum of matrix algebras, i.e., A* ∼= ⊕*kMnk* (C)*, where nk* ∈ N*, and the sum is finite.*

*Proof.* Let *A* be a finite-dimensional C\*-algebra, and take the injective representation $\pi = \bigoplus\_{\omega \in P(A)} \pi\_\omega$ on $H\_c = \bigoplus\_{\omega \in P(A)} H\_\omega$, where *P*(*A*) is the pure state space of *A*; cf. the last stage of the proof of Theorem C.87. The proof now unfolds:


The real rank of a C\*-algebra *A* is a non-commutative generalization of the (Lebesgue) *covering dimension* of a non-empty space *X*, defined as follows. First say that dim(*X*) ≤ *n* iff every open cover of *X* has an open refinement U for which every *x* ∈ *X* is contained in at most *n* + 1 elements of U. We then say that dim(*X*) = *n* iff dim(*X*) ≤ *n* but dim(*X*) ≰ *n*−1 (such *n* need not exist).

If *X* is a compact Hausdorff space, then dim(*X*) = *n* iff *n* is the smallest integer such that for every *f* ∈ *C*(*X*,R<sup>*n*+1</sup>) and ε > 0, there is *g* ∈ *C*(*X*,R<sup>*n*+1</sup>) such that *g*(*x*) ≠ 0 for all *x* and ‖*f* − *g*‖∞ < ε, where ‖*f*‖∞ = sup*x*∈*X* {‖*f*(*x*)‖}. If no such *n* exists, we say that dim(*X*) = ∞. If *g* : *X* → R<sup>*n*+1</sup> is described by its coordinates (*g*1,...,*gn*+1), then *g*(*x*) ≠ 0 iff $\sum\_{k=1}^{n+1} g\_k(x)^2 > 0$, so that *g* vanishes nowhere iff $\sum\_k g\_k^2$ is invertible in *C*(*X*). We may replace the usual norm ‖v‖ in R<sup>*n*+1</sup> by the equivalent max-norm, i.e., ‖v‖ = max*i*{|*vi*|}, where v = (*v*1,..., *vn*+1). If we do so, we may generalize the covering dimension to possibly noncommutative unital C\*-algebras, as follows.

Let *A*<sup>*n*</sup> = *A* ⊕ ··· ⊕ *A* (with *n* terms) be the C\*-algebra *A* × ··· × *A* with pointwise operations and norm ‖(*a*1,...,*an*)‖ = max*i*{‖*ai*‖}. Let *Q*(*A*<sup>*n*</sup>) be the set of all *self-adjoint* elements (*a*1,...,*an*) in *A*<sup>*n*</sup> for which $\sum\_i a\_i^2$ is invertible (i.e. in *A*). The *real rank* rr(*A*) of a unital C\*-algebra *A* is defined as the smallest integer *n* for which *Q*(*A*<sup>*n*+1</sup>) is dense in $A^{n+1}\_{\mathrm{sa}}$, i.e., for every a ∈ $A^{n+1}\_{\mathrm{sa}}$ and ε > 0, there is b ∈ *Q*(*A*<sup>*n*+1</sup>) such that ‖a − b‖ < ε. If no such *n* exists, we define rr(*A*) = ∞. If *A* has no unit, we define its real rank as rr(*A*) = rr(*Ȧ*), i.e., as the real rank of its unitization.

Taking *A* = *C*(*X*), it follows from the previous paragraph that

$$\text{rr}(C(X)) = \text{dim}(X). \tag{C.606}$$

Now dim(*X*) = 0 iff *X* has a basis of clopen sets, and if *X* is compact Hausdorff, then dim(*X*) = 0 iff *X* is a Stone space. Hence from (C.606) we immediately have:

Proposition C.164. *If A is a commutative C\*-algebra,* rr(*A*) = 0 *iff* Σ(*A*) *is Stone.*

This makes dimension zero somewhat pathological. On the other hand, for noncommutative C\*-algebras real rank zero is ubiquitous. Note that if *a* = *a*∗ and *a*² is invertible, then *a* itself is invertible, with inverse *a*(*a*²)<sup>−1</sup>. Thus *A* has real rank zero iff its invertible self-adjoint elements are dense in *A*sa.

Proposition C.165. *Any von Neumann algebra has real rank zero.*

*Proof.* For *a* ∈ *A*sa and ε > 0, with *A* ⊆ *B*(*H*), use Theorem B.102 to define

$$b = (\mathrm{id}\_{\sigma(a)} + (\frac{1}{2}\varepsilon \cdot 1\_{\sigma(a)} - \mathrm{id}\_{\sigma(a)}) \cdot 1\_{[-\varepsilon/2, \varepsilon/2]})(a). \tag{C.607}$$

Using (B.322), we may then compute

$$\begin{split} \|a - b\| &\le \|(\frac{1}{2}\varepsilon \cdot 1\_{\sigma(a)} - \mathrm{id}\_{\sigma(a)}) \cdot 1\_{[-\varepsilon/2, \varepsilon/2]} \|\_{\infty} \\ &\le \|\frac{1}{2}\varepsilon \cdot 1\_{\sigma(a)} \|\_{\infty} + \| \mathrm{id}\_{\sigma(a)} \cdot 1\_{[-\varepsilon/2, \varepsilon/2]} \|\_{\infty} \\ &\le \frac{1}{2}\varepsilon + \frac{1}{2}\varepsilon = \varepsilon. \end{split} \tag{C.608}$$

Writing (C.607) as *b* = *f*(*a*), the function *f* ∈ B(σ(*a*)) satisfies *f*(*x*) = *x* if *x* ∉ [−ε/2, ε/2] and *f*(*x*) = ½ε if *x* ∈ [−ε/2, ε/2]; either way, *f*(*x*) ≠ 0 (indeed |*f*(*x*)| ≥ ε/2). Hence *f* is invertible in B(σ(*a*)), and therefore *b* = *f*(*a*) is invertible in *W*∗(*a*) and in *B*(*H*). □
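The function *f* from this proof is easy to tabulate. The sketch below applies it to a hypothetical finite spectrum, as one would have for a self-adjoint matrix *a* = *u* diag(σ) *u*∗, in which case *b* = *u* diag(*f*(σ)) *u*∗ and ‖*a* − *b*‖ is the largest value of |*x* − *f*(*x*)| over the spectrum. The sample eigenvalues and the value of ε are illustrative assumptions.

```python
def f(x, eps):
    """Borel function of (C.607): the identity outside [-eps/2, eps/2],
    and the constant eps/2 on that interval."""
    return eps / 2 if -eps / 2 <= x <= eps / 2 else x

eps = 0.1
spectrum = [-2.0, -0.04, 0.0, 0.01, 0.05, 1.5]    # hypothetical eigenvalues of a
perturbed = [f(x, eps) for x in spectrum]         # eigenvalues of b = f(a)

distance = max(abs(x - y) for x, y in zip(spectrum, perturbed))   # equals ||a - b||
gap = min(abs(y) for y in perturbed)                               # distance of sigma(b) from 0
```

Here `distance` does not exceed `eps`, in line with (C.608), and `gap` is at least `eps / 2`, so that *b* is invertible with ‖*b*<sup>−1</sup>‖ ≤ 2/ε.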

We now turn to classes of C\*-algebras that are sandwiched between the finite-dimensional ones at the lower end and those with real rank zero at the upper end.

Definition C.166. *Let A be a unital C\*-algebra. Then A is said to be:*


Here a subset *S* of a poset *P* is *upward directed* if for each *x*, *y* ∈ *S* there is *z* ∈ *S* such that *x* ≤ *z* and *y* ≤ *z* (for example, this is true in a complete lattice).

Furthermore, the *right-annihilator R*(*S*) of *S* ⊂ *A* is defined as

$$R(S) = \{a \in A \mid ba = 0 \,\forall b \in S\},\tag{C.609}$$

and *R*(*a*) ≡ *R*({*a*}); in the presence of an involution, equivalent definitions may be given in terms of the left-annihilator. In all cases, the projection *e* is unique. Since Rickart himself already showed that *A* is Rickart iff for each nonempty *countable* subset *S* ⊂ *A* there is *e* ∈ P(*A*) so that *R*(*S*) = *eA*, the difference between Rickart and AW\* lies in the countability assumption on *S* in the former but not in the latter.

It is known that if a C\*-algebra *A* has a faithful representation on a separable Hilbert space, then it is a Rickart C\*-algebra iff it is an AW\*-algebra, but otherwise these classes are different. Similarly, an AW\*-algebra is a W\*-algebra iff it has a separating family of normal states, where normality of functionals on AW\*-algebras is defined as in Definition 4.11, i.e. through complete additivity on orthogonal families of projections, which always have an upper bound (cf. Theorem C.169 below). This is the case in all examples relevant to mathematical physics, but set-theoretically the class of AW\*-algebras has higher cardinality than the class of W\*-algebras it contains. It is generally believed that a C\*-algebra is Rickart iff it is monotone σ-complete, and that it is AW\* iff it is monotone complete, but there are neither proofs of nor counterexamples to these claims. We have the inclusions:

*W*<sup>∗</sup> ⊂ monotone complete ⊆ *AW*<sup>∗</sup> ⊂ Rickart ⊂ real rank zero; *AF* ⊂ real rank zero; scattered ⊂ real rank zero.

Scattered C\*-algebras may alternatively be characterized as those C\*-algebras on which every state is a *w*∗-convergent convex sum of pure states; this condition is far stronger than what the Krein–Milman theorem gives, namely that every state is a *w*∗-limit of some net consisting of finite convex sums of pure states. For example, for any Hilbert space the compact operators *B*0(*H*) form a scattered C\*-algebra (extending the definition of the latter to the non-unital case as appropriate).

Two kinds of results are of interest for Bohrification: one is the topological characterization of the commutative case of each class, the other is the characterization of the class itself through properties of its commutative subalgebras. Without proof we state what is known in this respect.

Theorem C.167. *Let A be a commutative unital C\*-algebra. Then A is:*


Here we used the convention that a *Stone space* is a zero-dimensional compact Hausdorff space (equivalently, it is compact Hausdorff and totally disconnected in the sense that the only connected subsets are points). A *(*σ*-) stonean space* is a Stone space with the additional property that Clopen(Σ(*A*)) is a (σ-) complete lattice (equivalently, a stonean space is a compact Hausdorff space that is extremally disconnected in that the closure of each open set is open). Furthermore, a space is *hyperstonean* if it is stonean, and for any nonzero *f* ∈ *C*(*X*,R+) there exists a completely additive positive measure μ such that μ(*f*) > 0. In particular, in the commutative case the classes *AF* and real rank zero coincide, as do AW\* and monotone complete algebras. A space *X* is called *scattered* if each non-empty closed subset *C* ⊂ *X* contains an isolated point (i.e., a point *x* ∈ *C* with an open neighbourhood *U* such that *U* ∩ *C* = {*x*}). If *X* is scattered, then it is totally disconnected. An example of a compact scattered space is (1/N) ∪ {0} with the relative topology from R.

This leads to the following generalization and extension of Theorem C.141.

Theorem C.168. *Let A be a commutative unital C\*-algebra. The projections* P(*A*) *in A form a Boolean lattice, which is related to the Gelfand spectrum* Σ(*A*) *through*

$$
\mathcal{P}(A) \cong \text{Clopen}(\Sigma(A)).\tag{C.610}
$$

*If A is also AF, then its Gelfand spectrum* Σ(*A*) *is a Stone space, and we have*

$$
\Sigma(A) \cong \mathcal{S}(\mathcal{P}(A));\tag{C.611}
$$

$$\mathcal{O}(\Sigma(A)) \cong \text{Idl}(\mathcal{P}(A));\tag{C.612}$$

$$A \cong C(\mathcal{S}(\mathcal{P}(A))),\tag{C.613}$$

*as topological spaces, frames, and (commutative) C\*-algebras, respectively. Conversely, for any Boolean lattice L the C\*-algebra* $C(\mathcal{S}(L))$ *is AF, and*

$$L \cong \mathcal{P}(C(\mathcal{S}(L))).\tag{C.614}$$

*Proof.* Using the Gelfand isomorphism *A* ∼= *C*(Σ(*A*)), eq. (C.610) follows from

$$\mathcal{P}(C(X)) \cong \text{Clopen}(X), \tag{C.615}$$

where *X* is some compact Hausdorff space. Indeed, if *e*² = *e*∗ = *e* ∈ *C*(*X*), then *e* must be {0,1}-valued, so it must be *e* = 1*U* for some *U* ⊂ *X*, viz. *U* = *e*<sup>−1</sup>({1}). Since *e* ∈ *C*(*X*) is continuous, *U* must be clopen. Conversely, for each *U* ∈ Clopen(*X*), the function 1*U* ∈ *C*(*X*) is a projection, and the maps *U* ↦ 1*U* and *e* ↦ *e*<sup>−1</sup>({1}) are each other's inverse. Theorem D.5 then implies that P(*A*) is Boolean.

If *A* ≅ *C*(*X*) is *AF*, then *C*(*X*) = (∪λ*A*λ)<sup>−</sup> is the norm-closure of the union of finite-dimensional C\*-algebras *A*λ, which union by the Stone–Weierstrass theorem separates points of *X*. Since each *A*λ is the linear span of its projections, the projections ∪λP(*A*λ) of these finite-dimensional subalgebras already separate points in *X*, and this in turn implies that *X* is *totally separated*, i.e., for each *x* ≠ *y* ∈ *X*, there is *U* ∈ Clopen(*X*) such that *x* ∈ *U* and *y* ∉ *U*. Since a compact Hausdorff space is zero-dimensional (and hence Stone) iff it is totally separated, *X* is a Stone space.

Again using Theorem C.8, we only need to prove (C.611) in the special case

$$X \cong \mathcal{S}(\mathcal{P}(C(X))),\tag{C.616}$$

where *X* is a Stone space; this follows from (C.615) and Theorem D.5. Eq. (C.612) follows from (D.35), whilst (C.613) is immediate from (C.611) and Theorem C.8.

Finally, using Theorem D.5 we see that (C.614) reduces to (C.615), so we only need to prove that *C*(*X*) is *AF* for any Stone space *X*. This is just the above proof of the converse run backwards: since *X* is totally separated, for each *x* ≠ *y* we find *U* ∈ Clopen(*X*) separating *x* and *y*, so that also the associated projection 1*U* separates *x* and *y*, and hence P(*C*(*X*)) separates *X*. Taking Λ to label the finite subsets of P(*C*(*X*)), and *A*λ to be the finite-dimensional C\*-algebra generated by λ ∈ Λ, by Stone–Weierstrass we have *C*(*X*) = (∪λ*A*λ)<sup>−</sup>. Hence *C*(*X*) is *AF*. □
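The correspondence (C.615) between projections and clopen sets can be enumerated by hand for a finite discrete space, where every subset is clopen and *C*(*X*) consists of all functions on *X*. The sketch below takes a three-point space and a small hypothetical sample of real values for the functions; the names `POINTS`, `VALUES`, `support` and `indicator` are illustrative only.

```python
from itertools import product

POINTS = (0, 1, 2)               # a three-point discrete space X
VALUES = (-1, 0, 0.5, 1, 2)      # hypothetical sample of real function values

def is_projection(e):
    """e is a real-valued function on X (tuple of values); it is a projection iff e*e = e."""
    return all(v * v == v for v in e)

def support(e):
    """The set U = e^{-1}({1})."""
    return frozenset(x for x in POINTS if e[x] == 1)

def indicator(U):
    """The function 1_U."""
    return tuple(1 if x in U else 0 for x in POINTS)

def meet(e, f):
    """Lattice meet of commuting projections: the product e f."""
    return tuple(a * b for a, b in zip(e, f))

def join(e, f):
    """Lattice join of commuting projections: e + f - e f."""
    return tuple(a + b - a * b for a, b in zip(e, f))

candidates = list(product(VALUES, repeat=len(POINTS)))
projections = [e for e in candidates if is_projection(e)]
subsets = {support(e) for e in projections}

round_trip = all(indicator(support(e)) == e for e in projections)
boolean_ops = all(
    support(meet(e, f)) == support(e) & support(f)
    and support(join(e, f)) == support(e) | support(f)
    for e in projections for f in projections
)
```

Of the 125 sampled functions exactly 8 are projections, one for each subset of *X*, and meets and joins of projections correspond to intersections and unions of the associated sets.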

Theorem C.169. *The claim that a unital C\*-algebra lies in class* X *iff each of its maximal abelian* ∗*-subalgebras lies in class* X *is true for the following classes:*


The claim is false for *AF*-algebras, true for monotone complete C\*-algebras iff these coincide with AW\*-algebras, false for real rank zero C\*-algebras, and unknown for W\*-algebras, which we therefore state as a *conjecture*:

*A C\*-algebra is a W\*-algebra iff each maximal abelian* ∗*-subalgebra is a W\*-algebra.*

Proposition C.170. *For any C\*-algebra A and any projections e*, *f* ∈ P(*A*)*, we have e f* = *e iff e* ≤ *f (with partial ordering* ≤ *as defined in A*sa *via A*+*, cf.* §*C.7).*

*Proof.* As explained above (C.93), if *a*1 ≤ *a*2, then *b*∗*a*1*b* ≤ *b*∗*a*2*b*, so *e* ≤ *f* implies

$$(1\_A - f)e(1\_A - f) \le (1\_A - f)f(1\_A - f) = 0.\tag{C.617}$$

However, since *e*² = *e*∗ = *e*, with *c* = *e*(1*A* − *f*) we have (1*A* − *f*)*e*(1*A* − *f*) = *c*∗*c*. Since also *c*∗*c* ≥ 0, it follows that *c*∗*c* = 0, so that *c* = 0 and hence (1*A* − *f*)*e* = *c*∗ = 0, or *e* = *f e*. Taking adjoints gives *e f* = *e*, and consequently *e f* = *f e*. Conversely, if *e f* = *e*, we have *e f* = *f e* and hence

$$\left(f - e\right)^{2} = f - 2e + e = f - e.\tag{C.618}$$

Of course, *f* − *e* = (*f* − *e*)∗, so that (C.618) makes *f* − *e* a projection. Since any projection lies in *A*+, we have *f* − *e* ≥ 0, and hence *e* ≤ *f*. □
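Proposition C.170 can be illustrated with real symmetric projections in *M*3(C). The sketch below uses exact rational arithmetic; the three projections *e*, *f*, *g* and the helper names are chosen for illustration only. For *e* ≤ *f* the product *e f* equals *e* and *f* − *e* is again a projection, whereas for the pair (*e*, *g*) both properties fail and *g* − *e* takes a negative value on a test vector.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def sub(a, b):
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def is_projection(p):
    """Real symmetric and idempotent, hence a projection in M_n(C)."""
    n = len(p)
    symmetric = all(p[i][j] == p[j][i] for i in range(n) for j in range(n))
    return symmetric and matmul(p, p) == p

def quadratic_form(a, v):
    """<v, a v> for a real vector v."""
    n = len(a)
    return sum(v[i] * a[i][j] * v[j] for i in range(n) for j in range(n))

half = Fraction(1, 2)
e = [[half, half, 0], [half, half, 0], [0, 0, 0]]   # projection onto span{(1,1,0)}
f = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]               # projection onto span{(1,0,0),(0,1,0)}
g = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]               # projection onto span{(1,0,0)}

e_below_f = matmul(e, f) == e and is_projection(sub(f, e))
e_below_g = matmul(e, g) == e
witness = quadratic_form(sub(g, e), (0, 1, 0))      # negative, so g - e is not positive
```

Since the range of *e* lies in the range of *f* but not in that of *g*, one finds `e_below_f` true and `e_below_g` false.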

The set of projections P(*A*) in a C\*-algebra is always a poset in the order ≤, but it is not automatically a lattice. It is a σ-complete lattice if *A* is Rickart, and hence also in all "lower" classes, including von Neumann algebras (cf. Proposition C.136), where P(*A*) is even a complete lattice.

#### C.25 Jordan algebras and (pure) state spaces of C\*-algebras

Let *A* be a unital C\*-algebra. As we know, the *state space S*(*A*) is the set of all states on *A*, seen as a compact convex set in the *w*∗-topology inherited from the embedding *S*(*A*) ⊂ *A*<sup>∗</sup> (note that *S*(*A*) fails to be compact if *A* lacks a unit). To see which information *S*(*A*) carries about *A*, we need to impoverish *A* as follows.

Definition C.171. *A* Jordan algebra *is a real* commutative *(but generally* non-associative*) algebra A whose product* ◦ *satisfies (writing a*² = *a* ◦ *a):*

$$a \circ (b \circ a^2) = (a \circ b) \circ a^2. \tag{C.619}$$

*A* JB-algebra *is a Jordan algebra that is also a (real) Banach space such that:*

$$\|a \circ b\| \le \|a\| \|b\|;\tag{C.620}$$

$$\|a\|^2 \le \|a^2 + b^2\|. \tag{C.621}$$

Given (C.620), axiom (C.621) is equivalent to ‖*a*²‖ ≤ ‖*a*² + *b*²‖ and ‖*a*²‖ = ‖*a*‖².

It is easy to see that the self-adjoint part *A*sa of any C\*-algebra *A* is a JB-algebra if we put *a* ◦ *b* = ½(*ab* + *ba*), cf. (5.14). If *A* and *B* are unital C\*-algebras, we say that a linear map ϕ : *A*sa → *B*sa is a *Jordan homomorphism* if it preserves ◦; to this effect it clearly suffices that ϕ(*a*²) = ϕ(*a*)² for each *a*. If ϕ in addition is bijective, then it is called a *Jordan isomorphism*; in that case its inverse is necessarily linear and also preserves the Jordan product ◦. A *Jordan automorphism* of a C\*-algebra *A* is a Jordan isomorphism *A*sa → *A*sa. Of course, we may complexify ϕ : *A*sa → *B*sa so as to obtain a C-linear map ϕC : *A* → *B* that equally well satisfies ϕC(*a*²) = ϕC(*a*)², this time for all *a* ∈ *A* (rather than all *a* ∈ *A*sa). However, the conceptual point here is that quantum-mechanical observables are supposed to be self-adjoint, and that the Jordan product (but not the ordinary associative product) always preserves self-adjointness. Generalizing Proposition 5.19, we then have the key result:
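The algebraic claims about the Jordan product on *A*sa can be checked for *A* = *M*2(C). The following sketch uses three sample self-adjoint matrices (chosen arbitrarily for illustration) and verifies that *a* ◦ *b* is self-adjoint while *ab* is not, that the Jordan identity (C.619) holds, and that ◦ is not associative. It does not test the norm axioms (C.620) and (C.621).

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def jordan(a, b):
    """Jordan product a o b = (ab + ba)/2."""
    return scale(0.5, add(matmul(a, b), matmul(b, a)))

def close(x, y, tol=1e-12):
    return all(abs(x[i][j] - y[i][j]) < tol for i in range(2) for j in range(2))

def is_self_adjoint(a):
    return close(a, dagger(a))

# Three sample self-adjoint elements of M_2(C).
a = [[1, 2 - 1j], [2 + 1j, -3]]
b = [[0, 1j], [-1j, 2]]
c = [[2, 1], [1, 0]]

a2 = jordan(a, a)                                    # a o a = a^2
jordan_preserves_sa = is_self_adjoint(jordan(a, b))
product_preserves_sa = is_self_adjoint(matmul(a, b))
jordan_identity = close(jordan(a, jordan(b, a2)), jordan(jordan(a, b), a2))   # (C.619)
associative = close(jordan(jordan(a, b), c), jordan(a, jordan(b, c)))
```

For these matrices the associator (*a* ◦ *b*) ◦ *c* − *a* ◦ (*b* ◦ *c*) equals ¼[*b*, [*a*, *c*]], which is nonzero, so `associative` is false although `jordan_identity` is true.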

Theorem C.172. *Let A and B be unital C\*-algebras. There is a bijective correspondence between Jordan isomorphisms* ϕ : *A*<sub>sa</sub> → *B*<sub>sa</sub> *and affine homeomorphisms f* : *S*(*B*) → *S*(*A*)*, given by f* = ϕ<sup>∗</sup> *(i.e. f*(ω)(*a*) = ω(ϕ(*a*))*). In particular, each affine homeomorphism of S*(*A*) *is induced by a Jordan automorphism of A.*

The proof is similar to that of Proposition 5.19; generalizing Lemma 5.20 we now have:

Lemma C.173. *Let A and B be unital C\*-algebras. Then f* = ϕ<sup>∗</sup> *gives a bijective correspondence between affine bijections f* : *S*(*B*) → *S*(*A*) *and unital positive linear bijections* ϕ : *A*<sub>sa</sub> → *B*<sub>sa</sub>*. Moreover, if* ϕ : *A*<sub>sa</sub> → *B*<sub>sa</sub> *is a unital linear bijection, then* ϕ *is positive iff* ϕ *is isometric iff* ϕ *is a Jordan isomorphism.*

Most of the proof is practically the same as for Lemma 5.20 (so we omit it), except for the last equivalence between invertible unital isometries and Jordan isomorphisms, which is deeper and relies on *Kadison's inequality* ϕ(*a*<sup>∗</sup>*a*) ≥ ϕ(*a*)<sup>∗</sup>ϕ(*a*) for positive unital linear maps ϕ between C\*-algebras and normal operators *a*.
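For readers who like to experiment, the Jordan structure on *A*<sub>sa</sub> may be checked by hand in the simplest noncommutative case *A* = *M*<sub>2</sub>(ℂ). The following Python sketch is only an illustration under our own assumptions (matrices stored as nested lists, and helper names such as `jordan` and `close` that do not occur in the text). On sample matrices it confirms that *a* ◦ *b* = ½(*ab* + *ba*) is commutative, preserves self-adjointness (unlike the ordinary product), and satisfies the Jordan identity (C.619); two Pauli matrices then witness non-associativity.

```python
# Illustrative sketch: 2x2 complex matrices as nested lists (our own convention).

def mul(a, b):
    """Ordinary (associative) matrix product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def scale(t, a):
    return [[t * a[i][j] for j in range(2)] for i in range(2)]

def adjoint(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def jordan(a, b):
    """Jordan product a o b = (ab + ba)/2."""
    return scale(0.5, add(mul(a, b), mul(b, a)))

def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Two self-adjoint test matrices (arbitrary choices).
a = [[1.0, 2 - 1j], [2 + 1j, -3.0]]
b = [[0.5, 1j], [-1j, 2.0]]
a2 = jordan(a, a)  # a^2 = a o a

commutative = close(jordan(a, b), jordan(b, a))
stays_selfadjoint = close(jordan(a, b), adjoint(jordan(a, b)))
plain_product_selfadjoint = close(mul(a, b), adjoint(mul(a, b)))
jordan_identity = close(jordan(a, jordan(b, a2)), jordan(jordan(a, b), a2))

# Non-associativity, witnessed by the Pauli matrices sigma_x and sigma_z.
sx = [[0, 1], [1, 0]]
sz = [[1, 0], [0, -1]]
left = jordan(jordan(sx, sx), sz)    # (sx o sx) o sz = 1 o sz = sz
right = jordan(sx, jordan(sx, sz))   # sx o (sx o sz) = sx o 0 = 0
associative_here = close(left, right)
```

Here `associative_here` comes out false, since σ<sub>x</sub> and σ<sub>z</sub> anticommute, so that σ<sub>x</sub> ◦ σ<sub>z</sub> = 0 whereas σ<sub>x</sub> ◦ σ<sub>x</sub> = 1.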

A similar result is Hamhalter's generalization of *Dye's Theorem* to AW\*-algebras:

Theorem C.174. *Let A and B be AW\*-algebras and let* N : P(*A*) → P(*B*) *be an isomorphism of the corresponding orthocomplemented projection lattices that in addition preserves arbitrary suprema. If A has no summand isomorphic to either* ℂ<sup>2</sup> *or M*<sub>2</sub>(ℂ)*, then there is a unique Jordan isomorphism* J : *A*<sub>sa</sub> → *B*<sub>sa</sub> *that extends* N *(and hence Jordan isomorphisms are characterized by their values on projections).*

This generalizes Corollary 5.22 in the main text, but has a much more difficult proof.

*Proof.* If *e*, *f* ∈ P(*A*) are orthogonal, then so are N(*e*) and N(*f*), so that

$$
\mathbb{N}(e+f) = \mathbb{N}(e) + \mathbb{N}(f). \tag{C.622}
$$

Gleason's Theorem for AW\*-algebras then gives a Jordan homomorphism

$$\mathsf{J}\_{(e,f)}: AW^\*(e,f)\_{\mathsf{sa}} \to B\_{\mathsf{sa}},\tag{C.623}$$

where *AW*<sup>∗</sup>(*e*, *f*) is the AW\*-algebra generated by *e*, *f*, and the unit 1<sub>*A*</sub>, which in particular preserves all *Jordan triple products*

$$\{a, b, c\} = (a \circ b) \circ c + a \circ (b \circ c) - b \circ (a \circ c), \tag{C.624}$$

which in terms of the usual operator product equals ½(*abc* + *cba*). This implies

$$\mathbb{N}((1\_A - 2e)f(1\_A - 2e)) = (1\_B - 2\mathbb{N}(e))\mathbb{N}(f)(1\_B - 2\mathbb{N}(e)),\tag{C.625}$$

which (in the second major step of the proof, after the application of Gleason's Theorem) is necessary and sufficient for N to extend to a Jordan isomorphism. □
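The two algebraic facts used in this argument are easy to test on 2×2 matrices. The sketch below is an illustration under our own assumptions (nested-list matrices, hypothetical helper names `comb`, `triple`, `close`). It checks that the Jordan triple product (C.624) equals ½(*abc* + *cba*), and that for projections *e*, *f* the element (1 − 2*e*)*f*(1 − 2*e*) is a triple product and is again a projection, so that N may be applied to it.

```python
def mul(a, b):
    """Ordinary matrix product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def comb(s, a, t, b):
    """Entrywise linear combination s*a + t*b."""
    return [[s * a[i][j] + t * b[i][j] for j in range(2)] for i in range(2)]

def jordan(a, b):
    """Jordan product a o b = (ab + ba)/2."""
    return comb(0.5, mul(a, b), 0.5, mul(b, a))

def triple(a, b, c):
    """Jordan triple product {a,b,c} = (a o b) o c + a o (b o c) - b o (a o c)."""
    first = comb(1, jordan(jordan(a, b), c), 1, jordan(a, jordan(b, c)))
    return comb(1, first, -1, jordan(b, jordan(a, c)))

def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

one = [[1.0, 0.0], [0.0, 1.0]]
e = [[1.0, 0.0], [0.0, 0.0]]            # projection onto the first axis
f = [[0.5, 0.5], [0.5, 0.5]]            # projection onto (1,1)/sqrt(2)
g = [[2.0, 1 - 1j], [1 + 1j, -1.0]]     # a generic self-adjoint matrix

# {a,b,c} = (abc + cba)/2 in terms of the operator product.
rhs = comb(0.5, mul(mul(e, f), g), 0.5, mul(mul(g, f), e))
triple_ok = close(triple(e, f, g), rhs)

u = comb(1, one, -2, e)                 # u = 1 - 2e, a self-adjoint unitary
ufu = mul(mul(u, f), u)
triple_special = close(triple(u, f, u), ufu)    # {u,f,u} = ufu
ufu_is_projection = close(mul(ufu, ufu), ufu)   # ufu is idempotent
```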

The structure of Jordan isomorphisms may be inferred from the following remarkable result, in which a linear map ϕ : *A* → *B* between C\*-algebras is called an *anti-homomorphism* if ϕ(*a*<sup>∗</sup>) = ϕ(*a*)<sup>∗</sup> as usual, but ϕ(*ab*) = ϕ(*b*)ϕ(*a*).

Theorem C.175. *If* ϕ : *A*<sub>sa</sub> → *B*(*H*)<sub>sa</sub> *is a Jordan homomorphism (where A is a C\*-algebra and H is a Hilbert space), there exist three mutually orthogonal projections e*<sub>1</sub>*, e*<sub>2</sub>*, e*<sub>3</sub> *in the center* ϕ(*A*)′′ ∩ ϕ(*A*)′ *of the von Neumann algebra* ϕ(*A*)′′*, such that:*


*If in addition a* ↦ ϕ<sub>ℂ</sub>(*a*)*e*<sub>1</sub> *is not an anti-homomorphism and a* ↦ ϕ<sub>ℂ</sub>(*a*)*e*<sub>2</sub> *is not a homomorphism, then e*<sub>1</sub>*, e*<sub>2</sub>*, and e*<sub>3</sub> *are uniquely determined by these conditions.*

Like the previous theorem, the proof of this one exceeds the scope of this book.

Corollary C.176. *Let* J : *B*(*H*)<sub>sa</sub> → *B*(*H*)<sub>sa</sub> *be a Jordan isomorphism. Then* J<sub>ℂ</sub> : *B*(*H*) → *B*(*H*) *is either a homomorphism or an anti-homomorphism of C\*-algebras.*

*Proof.* The center of *B*(*H*) is trivial, so either *e*<sub>1</sub> = 1<sub>*H*</sub> or *e*<sub>2</sub> = 1<sub>*H*</sub>. □
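The standard example of the second alternative is transposition on *B*(ℂ<sup>2</sup>) = *M*<sub>2</sub>(ℂ). The following sketch (our own illustration, with hypothetical helper names and arbitrarily chosen test matrices) checks that *a* ↦ *a*<sup>T</sup> maps self-adjoint matrices to self-adjoint matrices and preserves the Jordan product, while it reverses the order of ordinary products.

```python
def mul(a, b):
    """Ordinary matrix product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def jordan(a, b):
    """Jordan product a o b = (ab + ba)/2."""
    ab, ba = mul(a, b), mul(b, a)
    return [[(ab[i][j] + ba[i][j]) / 2 for j in range(2)] for i in range(2)]

def transpose(a):
    """Candidate Jordan automorphism a -> a^T (no complex conjugation)."""
    return [[a[j][i] for j in range(2)] for i in range(2)]

def adjoint(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-9):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Two non-commuting self-adjoint matrices (arbitrary choices).
a = [[1.0, 2 - 1j], [2 + 1j, -3.0]]
b = [[0.5, 1j], [-1j, 2.0]]
ta, tb = transpose(a), transpose(b)

still_selfadjoint = close(ta, adjoint(ta))
preserves_jordan = close(transpose(jordan(a, b)), jordan(ta, tb))
reverses_products = close(transpose(mul(a, b)), mul(tb, ta))   # anti-homomorphism
preserves_products = close(transpose(mul(a, b)), mul(ta, tb))  # fails: ab != ba
```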

The *pure state space P*(*A*) = ∂<sub>e</sub>*S*(*A*) is the extreme boundary of the state space *S*(*A*). According to the Krein–Milman Theorem B.50, *P*(*A*) is not empty, and

$$S(A) = \mathrm{co}(P(A))^{-},\tag{C.626}$$

see (B.165) for notation. In order to recover *S*(*A*) from *P*(*A*), the latter obviously needs more structure than just that of a set. First, it inherits the *w*<sup>∗</sup>-topology from *A*<sup>∗</sup>, but it turns out that we need to equip *P*(*A*) with the more refined *w*<sup>∗</sup>*-uniformity*.

In general, a *uniform structure* on a set *X* (also called an *entourage uniformity*) is a nonempty filter U on *X* × *X* (i.e., a collection U ⊂ P(*X* × *X*) of subsets of *X* × *X* such that *U* ∈ U and *U* ⊂ *V* imply *V* ∈ U, and *U* ∈ U and *V* ∈ U imply *U* ∩ *V* ∈ U) satisfying the following conditions:

1. Each *U* ∈ U contains the diagonal {(*x*, *x*) | *x* ∈ *X*};
2. If *U* ∈ U, then *U*<sup>−1</sup> = {(*y*, *x*) | (*x*, *y*) ∈ *U*} ∈ U;
3. For each *U* ∈ U there is some *V* ∈ U such that *V*<sup>2</sup> ⊆ *U*, where

$$V^2 = \{(x, z) \mid \exists y \in X : (x, y) \in V, (y, z) \in V\}.\tag{C.627}$$

A set with a uniformity is called a *uniform space*. If *X* and *Y* are uniform spaces, a function *f* : *X* → *Y* is *uniformly continuous* if *f*<sup>−1</sup>(*V*) ∈ U<sub>*X*</sub> whenever *V* ∈ U<sub>*Y*</sub> (where *f*<sup>−1</sup>(*V*) denotes the preimage of *V* ⊂ *Y* × *Y* under *f* × *f*).

The *w*<sup>∗</sup>-uniformity U<sub>*w*<sup>∗</sup></sub> on *A*<sup>∗</sup>, where *A* is any Banach space, is the smallest one containing all subsets of the type

$$\{ (\varphi, \varphi') \in A^\* \times A^\* : |\varphi(a) - \varphi'(a)| < \varepsilon \},\tag{C.628}$$

where *a* ∈ *A* and ε > 0; this implies that *U* ∈ U<sub>*w*<sup>∗</sup></sub> iff *U* contains some such subset.

Second, *P*(*A*) carries a natural *transition probability*, cf. Definition 1.17 and (2.43). For ω, ω′ ∈ *P*(*A*), this function τ : *P*(*A*) × *P*(*A*) → [0, 1] is defined by

$$\tau(\omega,\omega') = \inf\{\omega(a) \mid a \in A, 0 \le a \le 1\_A, \omega'(a) = 1\}.\tag{C.629}$$

This definition, and the following result, are valid even if *A* has no unit.

Proposition C.177. *Let A be a C\*-algebra and define* τ *by* (C.629)*. Then*

$$
\tau(\omega, \omega') = 1 - \frac{1}{4}\|\omega - \omega'\|^2,\tag{C.630}
$$

*and the following dichotomy applies:*

*1. If* ω *and* ω′ *are equivalent (in the sense that the corresponding* GNS*-representations* π<sub>ω</sub> *and* π<sub>ω′</sub> *are unitarily equivalent), so that we may assume that the associated cyclic vectors* Ω<sub>ω</sub> *and* Ω<sub>ω′</sub> *lie in the same Hilbert space, we have*

$$\tau(\omega,\omega') = \operatorname{Tr}\left(e\_{\Omega\_{\omega}}e\_{\Omega\_{\omega'}}\right) = \left|\langle\Omega\_{\omega},\Omega\_{\omega'}\rangle\right|^2. \tag{C.631}$$

*2. If* ω *and* ω′ *are inequivalent (in that* π<sub>ω</sub> *and* π<sub>ω′</sub> *are inequivalent), then*

$$
\tau(\omega, \omega') = 0.\tag{C.632}
$$

*Proof.* We first show that (C.630) yields (C.631) and (C.632). In the first case,

$$\begin{split} \|\omega - \omega'\| &= \sup\{ |\omega(a) - \omega'(a)|, a \in A, \|a\| = 1 \} \\ &= \sup\{ |\langle \Omega\_{\omega}, \pi\_{\omega}(a) \Omega\_{\omega} \rangle - \langle \Omega\_{\omega'}, \pi\_{\omega}(a) \Omega\_{\omega'} \rangle |, a \in A, \|a\| = 1 \} \\ &= \sup\{ |\mathrm{Tr}\,( (e\_{\Omega\_{\omega}} - e\_{\Omega\_{\omega'}}) \pi\_{\omega}(a)) |, a \in A, \|a\| = 1 \} \\ &= \sup\{ |\mathrm{Tr}\,( (e\_{\Omega\_{\omega}} - e\_{\Omega\_{\omega'}}) a) |, a \in \pi\_{\omega}(A), \|a\| = 1 \} \\ &= \sup\{ |\mathrm{Tr}\,( (e\_{\Omega\_{\omega}} - e\_{\Omega\_{\omega'}}) a) |, a \in B(H\_{\omega}), \|a\| = 1 \} \\ &= \|e\_{\Omega\_{\omega}} - e\_{\Omega\_{\omega'}}\|\_1, \end{split} \tag{C.633}$$

where ‖ · ‖<sub>1</sub> is the trace norm on *B*<sub>1</sub>(*H*<sub>ω</sub>). In the fifth step we used the fact that the map *a* ↦ Tr(*ba*) is σ-weakly continuous for any *b* ∈ *B*<sub>1</sub>(*H*<sub>ω</sub>), so that we may replace the supremum over *a* ∈ π<sub>ω</sub>(*A*) by the supremum over *a* in the σ-weak closure of π<sub>ω</sub>(*A*), which by Theorem C.130 is π<sub>ω</sub>(*A*)′′, which in turn is *B*(*H*<sub>ω</sub>) because π<sub>ω</sub>(*A*) is irreducible (since ω is pure, cf. Theorem C.90). The last step then follows from Theorem B.146. To compute the last expression in (C.633), we assume that Ω<sub>ω</sub> and Ω<sub>ω′</sub> are not proportional (if they are, then ω = ω′, so that (C.630) reduces to 1 = 1, and hence holds). We may then work in the 2-dimensional Hilbert space spanned by Ω<sub>ω</sub> ≡ (1, 0) and Ω<sub>ω′</sub> = (*c*<sub>1</sub>, *c*<sub>2</sub>), with |*c*<sub>1</sub>|<sup>2</sup> + |*c*<sub>2</sub>|<sup>2</sup> = 1. In that case,

$$(e\_{\Omega\_{\omega}} - e\_{\Omega\_{\omega'}})^2 = |c\_2|^2 \cdot 1\_2; \tag{C.634}$$

$$|e\_{\Omega\_{\omega}} - e\_{\Omega\_{\omega'}}| = \sqrt{(e\_{\Omega\_{\omega}} - e\_{\Omega\_{\omega'}})^2} = |c\_2| \cdot 1\_2;\tag{C.635}$$

$$\|e\_{\Omega\_{\omega}} - e\_{\Omega\_{\omega'}}\|\_1 = \text{Tr}\left(|e\_{\Omega\_{\omega}} - e\_{\Omega\_{\omega'}}|\right) = 2|c\_2|. \tag{C.636}$$

Using (C.633), this gives

$$1 - \frac{1}{4}\|\omega - \omega'\|^2 = 1 - \frac{1}{4}\|e\_{\Omega\_{\omega}} - e\_{\Omega\_{\omega'}}\|\_1^2 = 1 - |c\_2|^2 = |c\_1|^2 = |\langle \Omega\_{\omega}, \Omega\_{\omega'} \rangle|^2. \tag{C.637}$$
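The two-dimensional computation (C.634)–(C.637) can be confirmed numerically. The sketch below is an illustration only: the sample values of *c*<sub>1</sub> and *c*<sub>2</sub> and the helper names `projector` and `trace_norm_hermitian` are our own choices. It computes the trace norm of the difference of the two rank-one projections from the eigenvalues of a Hermitian 2×2 matrix and compares 1 − ¼‖ · ‖<sub>1</sub><sup>2</sup> with the squared overlap of the two unit vectors.

```python
import math

def projector(v):
    """Rank-one projection e_v = |v><v| for a unit vector v in C^2."""
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]

def trace_norm_hermitian(h):
    """Trace norm of a Hermitian 2x2 matrix: sum of the absolute eigenvalues."""
    p, r, q = h[0][0].real, h[1][1].real, h[0][1]
    mean = (p + r) / 2
    radius = math.sqrt(((p - r) / 2) ** 2 + abs(q) ** 2)
    return abs(mean + radius) + abs(mean - radius)

# Omega_omega = (1, 0) and Omega_omega' = (c1, c2) with |c1|^2 + |c2|^2 = 1.
omega = (1 + 0j, 0j)
c1 = 0.6 * complex(math.cos(0.3), math.sin(0.3))
c2 = 0.8j
omega_prime = (c1, c2)

e, e_prime = projector(omega), projector(omega_prime)
diff = [[e[i][j] - e_prime[i][j] for j in range(2)] for i in range(2)]

norm1 = trace_norm_hermitian(diff)       # (C.636) predicts 2|c2|
tau_from_norm = 1 - norm1 ** 2 / 4       # right-hand side of (C.630)
overlap = abs(sum(x.conjugate() * y for x, y in zip(omega, omega_prime))) ** 2
```

With these sample values one finds `norm1` ≈ 1.6 and both `tau_from_norm` and `overlap` ≈ 0.36, up to rounding.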

To deal with the second case, we use the following version of Schur's Lemma:

Lemma C.178. *Let* π<sub>ω</sub> *and* π<sub>ω′</sub> *be irreducible representations of some C\*-algebra A, and let w* : *H*<sub>ω</sub> → *H*<sub>ω′</sub> *be an intertwiner, i.e., a bounded linear map that satisfies*

$$
w\pi\_{\omega}(a) = \pi\_{\omega'}(a)w \quad (a \in A). \tag{C.638}
$$


*Proof.* The proof is the same as for group representations: taking the adjoint of (C.638), it follows that *w*<sup>∗</sup>*w* ∈ π<sub>ω</sub>(*A*)′ and *ww*<sup>∗</sup> ∈ π<sub>ω′</sub>(*A*)′, so by Theorem C.90 (i.e. the mother of all Schur's lemmas) we have *w*<sup>∗</sup>*w* = λ · 1<sub>*H*<sub>ω</sub></sub> and *ww*<sup>∗</sup> = μ · 1<sub>*H*<sub>ω′</sub></sub>, for some λ, μ ∈ ℝ<sup>+</sup> (since *w*<sup>∗</sup>*w* and *ww*<sup>∗</sup> are positive operators). Moreover, since λ*w* = *ww*<sup>∗</sup>*w* = μ*w*, in fact we have λ = μ whenever *w* ≠ 0. If λ > 0, then the operator λ<sup>−1/2</sup>*w* : *H*<sub>ω</sub> → *H*<sub>ω′</sub> is a unitary intertwiner, so π<sub>ω</sub> and π<sub>ω′</sub> are equivalent. If λ = 0, then *w*<sup>∗</sup>*w* = 0 and hence *w* = 0, since ‖*w*<sup>∗</sup>*w*‖ = ‖*w*‖<sup>2</sup>. □

Continuing the proof of (C.632), we form the direct sum

$$
\pi(A) = \pi\_{\omega}(A) \oplus \pi\_{\omega'}(A);\tag{C.639}
$$

$$H = H\_{\omega} \oplus H\_{\omega'}.\tag{C.640}$$

The second case of Lemma C.178 then gives

$$(\pi\_{\omega}(A) \oplus \pi\_{\omega'}(A))' = \pi\_{\omega}(A)' \oplus \pi\_{\omega'}(A)',\tag{C.641}$$

whose right-hand side consists of operators λ · 1<sub>*H*<sub>ω</sub></sub> ⊕ μ · 1<sub>*H*<sub>ω′</sub></sub> (λ, μ ∈ ℂ), so that

$$\left(\pi\_{\omega}(A)\oplus\pi\_{\omega'}(A)\right)^{\prime\prime}=\pi\_{\omega}(A)^{\prime\prime}\oplus\pi\_{\omega'}(A)^{\prime\prime}=B(H\_{\omega})\oplus B(H\_{\omega'}).\tag{C.642}$$

Once again using Theorem C.130, a computation à la (C.633) therefore gives

$$\begin{split} \|\omega - \omega'\| &= \sup \{ |\mathrm{Tr}\,((e\_{\Omega\_{\omega}} - e\_{\Omega\_{\omega'}})a)|, a \in B(H\_{\omega}) \oplus B(H\_{\omega'}), \|a\| = 1 \} \\ &= \sup \{ |\mathrm{Tr}\,(e\_{\Omega\_{\omega}}a) - \mathrm{Tr}\,(e\_{\Omega\_{\omega'}}a')|, a \in B(H\_{\omega}), a' \in B(H\_{\omega'}), \|a \oplus a'\| = 1 \} \\ &= \sup \{ |\mathrm{Tr}\,(e\_{\Omega\_{\omega}}a)|, a \in B(H\_{\omega}), \|a\| = 1 \} \\ &\quad + \sup \{ |\mathrm{Tr}\,(e\_{\Omega\_{\omega'}}a')|, a' \in B(H\_{\omega'}), \|a'\| = 1 \} \\ &= \|e\_{\Omega\_{\omega}}\|\_{1} + \|e\_{\Omega\_{\omega'}}\|\_{1} = 1 + 1 = 2, \end{split} \tag{C.643}$$

since the trace may be computed in a basis of *H*<sub>ω</sub> ⊕ *H*<sub>ω′</sub> consisting of a basis of *H*<sub>ω</sub> and a basis of *H*<sub>ω′</sub>, and ‖*a* ⊕ *a*′‖ = max{‖*a*‖, ‖*a*′‖} for *a* ∈ *B*(*H*<sub>ω</sub>) and *a*′ ∈ *B*(*H*<sub>ω′</sub>).
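The mechanism behind (C.643) is that on a direct sum the two blocks of the test operator may be chosen independently. As a small illustration (with two arbitrarily chosen unit vectors, each living in its own copy of ℂ<sup>2</sup>, and helper names of our own), the choice *a* = 1 and *a*′ = −1 has ‖*a* ⊕ *a*′‖ = 1 and already attains the maximal value 2, so that (C.630) yields τ = 0.

```python
import math

def projector(v):
    """Rank-one projection |v><v| for a unit vector v."""
    n = len(v)
    return [[v[i] * v[j].conjugate() for j in range(n)] for i in range(n)]

def mul(a, b):
    """Matrix product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

# Unit vectors in two *different* summands (arbitrary sample choices).
omega = (0.6 + 0j, 0.8j)
omega_prime = (1 / math.sqrt(2), 1 / math.sqrt(2))
e, e_prime = projector(omega), projector(omega_prime)

# Test operator a (+) a' with a = 1 and a' = -1, of norm max(1, 1) = 1.
a = [[1, 0], [0, 1]]
a_prime = [[-1, 0], [0, -1]]

value = abs(trace(mul(e, a)) - trace(mul(e_prime, a_prime)))  # |Tr(e a) - Tr(e' a')|
tau = 1 - value ** 2 / 4                                      # (C.630)
```

In the equivalent case, by contrast, one and the same operator *a* has to serve both vector states, which is why the supremum there is in general smaller than 2.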

Finally, we prove that (C.629) and (C.630) coincide. If ω and ω′ are equivalent,

$$\tau(\omega,\omega') = \inf \{ \operatorname{Tr} \left( e\_{\Omega\_{\omega}} \pi\_{\omega}(a) \right) \mid a \in A, 0 \le a \le 1\_A, \operatorname{Tr} \left( e\_{\Omega\_{\omega'}} \pi\_{\omega}(a) \right) = 1 \} \tag{C.644}$$

and, as in (C.633), Theorem C.130 allows us to replace the infimum over *a* ∈ *A* by the one over *a* ∈ *B*(*H*<sub>ω</sub>). The claim then follows from Theorem 2.12 and eq. (C.631).

Similarly, if ω and ω′ are inequivalent, eq. (C.642) and Theorem C.130 give

$$\tau(\omega, \omega') = \inf \{ \operatorname{Tr} \left( e\_{\Omega\_{\omega}} a \right) \mid a \in B(H\_{\omega}) \oplus B(H\_{\omega'}), 0 \le a \le 1\_H, \operatorname{Tr} \left( e\_{\Omega\_{\omega'}} a \right) = 1 \},$$

and notice that the infimum zero is reached by *a* = 0 · 1<sub>*H*<sub>ω</sub></sub> ⊕ 1<sub>*H*<sub>ω′</sub></sub>. □
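In the equivalent case the infimum in (C.629) can be made completely explicit for *H* = ℂ<sup>2</sup>. If 0 ≤ *a* ≤ 1 and ⟨Ω′, *a*Ω′⟩ = 1, then (1 − *a*)Ω′ = 0, so that *a* = *e*′ + *t* · *e*<sub>⊥</sub> with *e*′ the projection onto Ω′, *e*<sub>⊥</sub> the projection onto its orthogonal complement, and 0 ≤ *t* ≤ 1. The following sketch (sample vectors and helper names are our own assumptions) evaluates ω on this one-parameter family and confirms that the minimum is attained at *t* = 0, where it equals the squared overlap (C.631).

```python
def projector(v):
    """Rank-one projection |v><v| for a unit vector v in C^2."""
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]

def expectation(v, a):
    """The number <v, a v> for a self-adjoint 2x2 matrix a."""
    av = [sum(a[i][j] * v[j] for j in range(2)) for i in range(2)]
    return sum(v[i].conjugate() * av[i] for i in range(2)).real

omega = (1 + 0j, 0j)
omega_prime = (0.6 + 0j, 0.8j)
perp = (0.8 + 0j, -0.6j)          # unit vector orthogonal to omega_prime

e_prime, e_perp = projector(omega_prime), projector(perp)

def feasible(t):
    """a_t = e' + t e_perp: 0 <= a_t <= 1 and <Omega', a_t Omega'> = 1 for 0 <= t <= 1."""
    return [[e_prime[i][j] + t * e_perp[i][j] for j in range(2)] for i in range(2)]

values = [expectation(omega, feasible(k / 10)) for k in range(11)]
tau_inf = min(values)             # infimum over the sampled feasible family
overlap = abs(sum(x.conjugate() * y for x, y in zip(omega, omega_prime))) ** 2
```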

The final result of this appendix, then, is the "pure" counterpart of Theorem C.172:

Theorem C.179. *Let A and B be unital C\*-algebras. There is a bijective correspondence f* = ϕ<sup>∗</sup> *between Jordan isomorphisms* ϕ : *A*<sub>sa</sub> → *B*<sub>sa</sub> *and bijections f* : *P*(*B*) → *P*(*A*) *that preserve transition probabilities and are w*<sup>∗</sup>*-uniformly continuous along with their inverse. In particular,* ϕ : *A*<sub>sa</sub> → *A*<sub>sa</sub> *is a Jordan automorphism of A iff* ϕ<sup>∗</sup> : *P*(*A*) → *P*(*A*) *has the properties just stated for f.*

The proof of this theorem is far more difficult than that of Theorem C.172, so we omit it.

If *A* ≅ *C*(*X*) and *B* ≅ *C*(*Y*) are commutative, we obtain a variation on Corollary C.22 featuring *uniform* homeomorphisms. Also, we see from Wigner's Theorem 5.4.1 that for *A* = *B*(*H*) it is enough to consider normal pure states, in which case also the (uniform) continuity condition on *f* is superfluous.

#### Notes

As already mentioned in the Introduction, the theory of operator algebras on Hilbert spaces was created by von Neumann, partly in collaboration with his assistant Murray (von Neumann, 1930, 1931, 1938, 1940, 1949; Murray & von Neumann, 1936, 1937, 1943, reprinted in von Neumann, 1961). His motivation for doing so certainly included quantum mechanics, but also functional analysis, measure theory, ergodic theory, and representation theory, all of which fields in turn benefited from their interaction with operator algebras. Von Neumann (and Murray) studied what they called "rings of operators", which are now deservedly called von Neumann algebras. John von Neumann (1903–1957) was one of the greatest mathematicians in history, especially considering the totality of his oeuvre in pure and applied mathematics (including numerical mathematics, computer science, and mathematical economics). His work in mathematical physics, notably on the mathematical structure of quantum mechanics, in some sense forms a bridge between the two.

Von Neumann was a Hungarian prodigy; he wrote his first mathematical paper at the age of seventeen. Except for this first paper, his early work was in set theory and the foundations of mathematics. In the Fall of 1926, he moved to Göttingen to work with Hilbert. Around 1920, Hilbert had initiated his *Beweistheorie*, an approach to the foundations of mathematics whose specific technical goals were not achieved because of Gödel's work, but whose overall view of mathematics (i.e. as an activity whose correctness is to be established purely syntactically and whose meaning is a semantic matter to be distinguished from its syntax) still reigns. However, at the time that von Neumann arrived, Hilbert was also interested in quantum mechanics. Apart from his broad interest in general (mathematical) physics (for example, his Sixth Problem from 1900 called for the mathematical axiomatization of physics), Hilbert was specifically attracted to quantum mechanics because Göttingen was, next to Copenhagen, a leading center for research in this area. Indeed, Heisenberg's (1925) paper initiating quantum mechanics (at least in its preliminary guise of "matrix mechanics") was followed by the *Dreimännerarbeit* of Born, Heisenberg, and Jordan (1926), and all three were in Göttingen at the time. Born was one of the few physicists of his day to be familiar with the concept of a matrix; in previous research he had even used infinite matrices. Born turned to his former teacher Hilbert for mathematical advice. Aided by his assistants Nordheim and von Neumann, Hilbert thus ran a seminar on the mathematical structure of quantum mechanics, and the three wrote a joint paper on the subject (which is now exclusively of historical value).

It was von Neumann (1927ab) who, at the age of 23, discovered the mathematical structure of quantum mechanics. In this process, he defined the abstract concept of a Hilbert space, which previously had only appeared in examples that went back to the work of Hilbert and his pupils on integral equations, spectral theory, and infinite-dimensional quadratic forms. Hilbert's famous memoirs on integral equations had appeared between 1904 and 1906; in 1908, his student Schmidt had defined the space ℓ<sup>2</sup> in the modern sense, and F. Riesz had studied the space of all continuous linear maps on ℓ<sup>2</sup> in 1912. Various examples of *L*<sup>2</sup>-spaces had emerged around the same time (with hindsight, Hilbert himself mainly worked with the unit ball of ℓ<sup>2</sup>).


However, the abstract notion of a Hilbert space was missing until von Neumann provided it. In particular, von Neumann saw that Schrödinger's wave functions were unit vectors in a Hilbert space of *L*<sup>2</sup> type, and that Heisenberg's observables were linear operators on a different Hilbert space, of ℓ<sup>2</sup> type. A unitary transformation between these spaces provided the mathematical equivalence between wave mechanics and matrix mechanics. Moreover, von Neumann developed the spectral theory of bounded as well as unbounded normal operators on a Hilbert space. This work culminated in his book *Mathematische Grundlagen der Quantenmechanik* (1932).

Despite the tremendous prestige of von Neumann, initially few mathematicians recognized the importance of his subsequent theory of operator algebras. For example, after a lecture by von Neumann on operator algebras in the weekly mathematics colloquium at Harvard sometime in the 1930s, G. H. Hardy, one of the leading mathematicians of his time, is reported to have said:<sup>1</sup>

"He is quite clearly a brilliant man, but why does he waste his time on this stuff?"

Fortunately, among those who did study operator algebras were Gelfand & Naimark (1943), who linked the subject to Gelfand's earlier work on (commutative) Banach algebras and in doing so created the theory of C\*-algebras. This, in turn, was picked up by Segal (1947ab), who thereby also restored the link with quantum theory.

A survey of von Neumann's mathematical work is given in Oxtoby et al (1958), which contains a biographical introduction by von Neumann's friend and colleague Ulam, and some of von Neumann's correspondence is collected in Rédei (2005b), which also contains a short mathematical biography. One of the most insightful documents about von Neumann is the rare manuscript Vonneumann (1987) by his brother Nicholas, of which the author got a copy from von Neumann's only PhD student Israel Halperin, who visited Cambridge on a peace mission in the early 1990s.<sup>2</sup> Politically, von Neumann was a controversial figure because of his enthusiastic contributions to nuclear weapons and the arms race between the USA and the Soviet Union; see Heims (1980) and Macrae (1992) for different perspectives on this. A substantial scholarly scientific biography of von Neumann remains to be written.

The history of operator algebras (i.e. von Neumann algebras and C\*-algebras, which terms were probably introduced by Dieudonné and Segal, respectively) has been described in Kadison (1982), Doran & Belfi (1986), and Doran (1994).

Leading textbooks on operator algebras, written by some of the original contributors, are Neumark (1968), Sakai (1971), Dixmier (1977, 1981), Pedersen (1979), Kadison & Ringrose (1983, 1986), and Takesaki (2002, 2003a, 2003b). See also Murphy (1990), Li (1992), Davidson (1996), Blackadar (2006), and the remarkable lectures on von Neumann algebras by algebraic topologist Lurie (2011). Connes (1994), written by arguably the greatest contemporary mathematician working in operator algebras, also provides innumerable fascinating insights into the subject.

<sup>1</sup> Reported by G.D. Birkhoff (who overheard Hardy saying this) to his son, Garrett Birkhoff, who in turn mentioned it to G.C. Rota, who wrote it down in the Introduction to Stern (1991).

<sup>2</sup> According to Rhodes (1996, pp. 245–246), Halperin was a spy for the Soviet Union, although his evidence seems limited to the fact Halperin was arrested in 1946 *suspected* of espionage, having Klaus Fuchs in his address book.

#### §C.1. Basic definitions and examples

As in the notes to the previous appendix, we only comment on results whose origins are less well known or which are less standard by themselves, the rest belonging to the foundations of the field as described in the textbooks just mentioned. Once again, for this reason not all sections in this appendix come with notes.

#### §C.2. Gelfand isomorphism

The implication ω ∈ Σ(*A*) ⇒ ω(*a*) ∈ σ(*a*) (*a* ∈ *A*) in the proof of Lemma C.9 also holds in the opposite direction (given that *A* is a Banach algebra with unit and ω : *A* → ℂ is linear); this is the *Gleason-Kahane-Zelazko Theorem* (Sourour, 1994). A recent monograph about *C*(*X*) is Groenewegen & van Rooij (2016), following up on earlier books like Semadeni (1971) and Gillman & Jerison (1976).

#### §C.3. Gelfand duality

Proposition C.19 is due to Gelfand & Kolmogorov (1939). In the spirit of the proof of the Stone–Weierstrass Theorem B.51 in §B.10, let us give an alternative proof of this proposition (Simon, 2011), which is based on Proposition C.14 and Corollary B.17. These identify Σ(*C*(*X*)) with the set ∂<sub>e</sub>*M*<sup>+</sup><sub>1</sub>(*X*) of extreme completely regular probability measures on *X*, provided we identify the latter with the corresponding functionals on *C*(*X*), as in (B.39). That is, we must prove that the map *x* ↦ δ<sub>*x*</sub> (i.e., the Dirac measure at *x*, which, seen as a functional on *C*(*X*), is just the evaluation map ev<sub>*x*</sub>) is a bijection.

*Proof.* We first show that a measure μ ∈ ∂<sub>e</sub>*M*<sup>+</sup><sub>1</sub>(*X*) must satisfy μ(*A*) = 1 or μ(*A*) = 0 for any *A* ∈ Σ. For if there is some *C* ∈ Σ for which 0 < μ(*C*) < 1, we have a nontrivial convex decomposition μ = *t*μ<sub>1</sub> + (1 − *t*)μ<sub>2</sub>, namely *t* = μ(*C*), μ<sub>1</sub>(*A*) = μ(*A*|*C*) (i.e., μ(*A* ∩ *C*)/μ(*C*)), and μ<sub>2</sub>(*A*) = μ(*A* ∖ *C*)/μ(*X* ∖ *C*). From this, we show that supp(μ) is a point. Indeed, if both *x* and *y* ≠ *x* were to lie in supp(μ), we could separate these with disjoint open sets *x* ∈ *U* and *y* ∈ *V*. This would leave four (im)possibilities:


Thus supp(μ) = {*x*} for some *x* ∈ *X*, i.e., μ = δ<sub>*x*</sub>, so that ∂<sub>e</sub>*M*<sup>+</sup><sub>1</sub>(*X*) ⊆ *X*. Finally, we also have *X* ⊆ ∂<sub>e</sub>*M*<sup>+</sup><sub>1</sub>(*X*), since δ<sub>*x*</sub> = *t*μ<sub>1</sub> + (1 − *t*)μ<sub>2</sub> forces

$$\text{supp}(\mu\_1) = \text{supp}(\mu\_2) = \{x\},\tag{C.645}$$

and hence μ<sub>1</sub> = μ<sub>2</sub> = δ<sub>*x*</sub>. □
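The first step of this proof is already visible on a two-point space. In the following sketch (a toy example of our own, with exact rational arithmetic and a hypothetical helper `conditional`), a probability measure μ with 0 < μ(*C*) < 1 is split into the conditional measures μ<sub>1</sub> = μ( · |*C*) and μ<sub>2</sub> = μ( · |*X* ∖ *C*), which recombine to μ with weight *t* = μ(*C*); hence such a μ is not extreme.

```python
from fractions import Fraction

def conditional(mu, subset):
    """Conditional measure mu(. | subset), for a measure given by its point masses."""
    total = sum(mu[x] for x in subset)
    return {x: (mu[x] / total if x in subset else Fraction(0)) for x in mu}

# A probability measure on X = {"x", "y"} that is not a Dirac measure.
mu = {"x": Fraction(1, 4), "y": Fraction(3, 4)}
C = {"x"}

t = sum(mu[x] for x in C)                 # t = mu(C), strictly between 0 and 1
mu1 = conditional(mu, C)                  # mu(. | C)
mu2 = conditional(mu, set(mu) - C)        # mu(. | X \ C)

recombined = {x: t * mu1[x] + (1 - t) * mu2[x] for x in mu}
```

Here μ<sub>1</sub> and μ<sub>2</sub> turn out to be the Dirac measures at the two points, which admit no further decomposition of this kind.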

In the unital/compact case, categorical Gelfand duality was first established in Negrepontis (1969, 1971), and was reproved in a different way by Johnstone (1982). Our proof is taken from Landsman (2004), with some improvements in the nonunital case due to Brandenburg (2015), but it should be considered "folklore".

In the smooth case, Corollary C.22 is often called *Milnor's exercise*. The result even holds without the second countability assumption on the manifold *X*, but with a completely different proof (Mrčun, 2005). See also Burtscher (2009).


#### §C.6. C\*-algebras without unit: commutative case

For proper maps see e.g. Bourbaki (1989), §I.10.

#### §C.10. Hilbert C\*-modules and multiplier algebras

The theory of Hilbert C\*-modules goes back to Kaplansky, Paschke, and Rieffel. See Lance (1995) and Raeburn & Williams for textbook coverage, and Landsman (1998a) for applications to mathematical physics (e.g. constrained quantization).

Theorem C.76 is due to An Huef, Raeburn, & Williams (2010).

The *Cohen–Hewitt Factorization Theorem* à la Fell & Doran (1988), Theorem V.9.2, adapted to C\*-algebras, states that if *A* and *B* are C\*-algebras and α : *A* → *B* is a homomorphism, then {α(*a*)*b* | *a* ∈ *A*, *b* ∈ *B*} is a closed linear subspace of *B*. Consequently, if α is nondegenerate, then each element *c* ∈ *B* factors as *c* = α(*a*)*b*. In particular, taking *B* = *A* and α to be the identity, we see that Lemma C.47 may be sharpened to the claim that any *c* ∈ *A* takes the form *c* = *ab* for suitable *a*, *b* ∈ *A*.

#### §C.11. Gelfand topology as a frame

Our treatment of frames and locales has been borrowed from Mac Lane & Moerdijk (1992), where also the details of the proof of Theorem C.80 may be found. See also Picado & Pultr (2012). Hereditary subalgebras are discussed e.g. in Pedersen (1979) and Blackadar (2006).

The fact that *H*(*A*) forms a complete lattice was noted by Akemann & Bice (2014), who also pursued the analogy with open sets, though not in a frame-theoretic setting. The theory is still disappointing in various ways, most notably in the fact that *H*(*A*) fails to be a frame unless *A* is commutative. Also, Theorem C.86 has (so far) been proved by conventional means, i.e., via the Gelfand isomorphism; it would be preferable to prove it purely algebraically (and if possible constructively).

From a localic point of view, the Gelfand transform *â* : Σ(*A*) → ℂ of *a* ∈ *A* should primarily be described as the corresponding frame map *â*<sup>−1</sup> : O(ℂ) → O(Σ(*A*)), and hence, using Corollary C.84, as a frame map

$$\hat{a}^{-1}: \mathcal{O}(\mathbb{C}) \to H(A).\tag{C.646}$$

Denoting the hereditary subalgebra generated by *a* by *H*<sub>*a*</sub>, i.e., the closure of *a* · *A*, for *U* ∈ O(ℂ) we obtain a nice formula whose use remains to be established:

$$\hat{a}^{-1}(U) = \bigcap\_{z \in \mathbb{C} \setminus U} H\_{a-z}.\tag{C.647}$$

A direct proof of the last claim of Proposition C.82 uses the property *H*(*A*) = *I*(*A*) (in the commutative case), the identification of *I* ∧ *J* with (*IJ*)<sup>−</sup> (i.e., the closure of the linear span of all *ab*, *a* ∈ *I*, *b* ∈ *J*, which follows by taking an approximate unit in *I* or *J*), and the identification of ⋁*S* with the closure of the linear span of ⋃*S*.

#### §C.13. Tensor products of Hilbert spaces and C\*-algebras

For the proof of (C.248) see Reed & Simon (1972), Theorem II.10.

For tensor products of C\*-algebras we mainly relied on Lance (1982), Li (1992), Wegge-Olsen (1993), and Takesaki (2002), by one of the founders of the theory.

Tensor products of Banach spaces and Hilbert spaces were first studied by Schatten (1946) and Schatten & von Neumann (1946, 1948). The subject was subsequently taken up by Grothendieck (1955) for locally convex spaces, and hence involves two of the greatest mathematicians of the twentieth century. Nuclearity of C\*-algebras is a vast and important field, to which Takesaki (2003) is a good introduction.

Yet another expression for the maximal C\*-norm on *A* ⊗ *B* arises if we say that two representations π<sub>*A*</sub> : *A* → *B*(*H*) and π<sub>*B*</sub> : *B* → *B*(*H*) on the same Hilbert space *H* *commute* if π<sub>*A*</sub>(*a*)π<sub>*B*</sub>(*b*) = π<sub>*B*</sub>(*b*)π<sub>*A*</sub>(*a*) for all *a* ∈ *A* and *b* ∈ *B*. Such a pair defines a representation π<sub>*A*</sub> ⊗ π<sub>*B*</sub> of *A* ⊗ *B* on *H* by

$$
\pi\_{A} \otimes \pi\_{B}(c) = \sum\_{i} \pi\_{A}(a\_i) \pi\_{B}(b\_i) \qquad \Big(c = \sum\_{i} a\_i \otimes b\_i\Big), \tag{C.648}
$$

which makes sense because (*a*, *b*) ↦ π<sub>*A*</sub>(*a*)π<sub>*B*</sub>(*b*) is bilinear and hence (by universality of ⊗) factors through *A* ⊗ *B*. This gives a third formula for ‖ · ‖<sub>max</sub>, namely

$$\|c\|\_{\max} = \sup \{ \|\pi\_{A} \otimes \pi\_{B}(c)\|\_{B(H)} \},\tag{C.649}$$

where π<sub>*A*</sub> and π<sub>*B*</sub> run through all *commuting* representations of *A* and *B* on a common Hilbert space *H*. Indeed, the restrictions of any representation of *A* ⊗ *B* to *A* and *B* define commuting representations, so that although at first sight the expression (C.649) appears to majorize (C.265), it must be equal to it in view of the equality of (C.265) and (C.263).
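The simplest commuting pair arises on a tensor product Hilbert space, where *A* acts on the first factor and *B* on the second. The sketch below is a finite-dimensional illustration of our own (with *A* = *B* = *M*<sub>2</sub>(ℂ), integer test matrices, and a hand-written Kronecker product `kron`). It checks that π<sub>*A*</sub>(*a*) = *a* ⊗ 1 and π<sub>*B*</sub>(*b*) = 1 ⊗ *b* commute and that their product is *a* ⊗ *b*.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def mul(a, b):
    """Matrix product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

one = [[1, 0], [0, 1]]
a = [[1, 2], [3, 4]]      # arbitrary element of A = M_2
b = [[0, 1], [5, -1]]     # arbitrary element of B = M_2

pi_A = kron(a, one)       # pi_A(a) = a (x) 1 on C^2 (x) C^2
pi_B = kron(one, b)       # pi_B(b) = 1 (x) b on C^2 (x) C^2

commute = mul(pi_A, pi_B) == mul(pi_B, pi_A)
product_is_tensor = mul(pi_A, pi_B) == kron(a, b)
```

For this particular pair, the representation of *A* ⊗ *B* built from products of commuting operators therefore coincides with the usual tensor product of the two defining representations.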

The name *projective tensor product* for *A* ⊗̂<sub>max</sub> *B*, where *A* and *B* are C\*-algebras, is actually confusing, since if *A* and *B* are regarded as Banach algebras, their projective tensor product is usually defined as the completion of *A* ⊗ *B* in the norm

$$||c||\_{\text{proj}} = \inf \left\{ \sum\_{l} ||a\_{l}|| ||b\_{l}||, c = \sum\_{l} a\_{l} \otimes b\_{l} \right\},\tag{C.650}$$

cf. (C.259), which is defined for any two Banach algebras *A* and *B*. This may not be a C\*-norm, and hence *A* ⊗̂<sub>proj</sub> *B* may not be a C\*-algebra. However, for any Banach algebra *C* with involution, one may canonically construct a C\*-algebra *C*<sup>∗</sup> (sic) and a homomorphism ϕ : *C* → *C*<sup>∗</sup> of involutive Banach algebras, with the universal property that for any morphism β : *C* → *D*, where *D* is a C\*-algebra, there is a unique homomorphism β′ : *C*<sup>∗</sup> → *D* of C\*-algebras such that β = β′ ◦ ϕ. This C\*-algebra *C*<sup>∗</sup>, which by the usual argument is unique up to isomorphism, is called the *C\*-envelope* of *C*. An explicit construction is obtained by completing *C* in the norm

$$\|\|c\|\| = \sup\{\|\pi(c)\|\},\tag{C.651}$$

where the supremum runs over all representations of *C* on Hilbert spaces; it is finite since ‖π(*c*)‖ ≤ ‖*c*‖ for each *c* ∈ *C*, see Dixmier (1977), §1.3.7 and §2.7. It is easy to see that ‖ · ‖<sub>proj</sub> is a cross-norm on *A* ⊗ *B*, and that one has a bijective correspondence between representations of *A* ⊗ *B* that satisfy ‖π(*a* ⊗ *b*)‖ ≤ ‖*a*‖‖*b*‖ and representations of *A* ⊗̂<sub>proj</sub> *B*. The point, then, is that one has *A* ⊗̂<sub>max</sub> *B* = (*A* ⊗̂<sub>proj</sub> *B*)<sup>∗</sup>.

C\*-algebras (with homomorphisms) and ⊗̂<sub>max</sub> form a *monoidal category* (also called a *tensor category*), with commutative C\*-algebras as a full subcategory CCA. The map *X* ↦ *C*<sub>0</sub>(*X*) then defines a duality *as monoidal categories* between the category LCHp of locally compact Hausdorff spaces and proper continuous maps (with cartesian product as a tensor product) and the category CCAn of commutative C\*-algebras and nondegenerate homomorphisms (with its unique C\*-algebraic tensor product, for example realized as ⊗̂<sub>max</sub>). Cf. Theorem C.45. See Hofmann (1970).

#### §C.14. Inductive limits and infinite tensor products of C\*-algebras

For inductive limits of C\*-algebras see in particular Sakai (1971); they were originally a Japanese invention (Takeda). Infinite tensor products of operator algebras (which partly motivated inductive limits) go back to von Neumann (1938). Bounded monotone nets converge under very general conditions; see McArthur (1970).

#### §C.15. Gelfand isomorphism and Fourier theory

For details on the Haar measure and for the proof of local compactness of *Ĝ* see Weil (1965), §27. Our approach to the Fourier transform is largely taken from Deitmar & Echterhoff (2009), where complete proofs may be found (though we sometimes followed a slightly different approach). In particular, these authors introduced the Banach spaces *C*<sub>0</sub><sup>∗</sup>(*G*) and *C*<sub>0</sub><sup>∗</sup>(*Ĝ*), whose use forms a marked improvement over older and less elegant treatments, as in e.g. Rudin (1962) or Folland (1995).

Eq. (C.379) is often called *Plancherel's Theorem*.

We may add a third entry to the 'symmetric' isomorphisms (C.379) - (C.380). The *Bruhat space* 𝒮(*G*) of rapidly decreasing functions on *G* is defined by

$$\begin{aligned} A(G) &= \{ f \in L^\infty(G) \mid \exists K \in \mathcal{K}(G) \, \forall n > 0 \, \exists C\_n > 0 \, \forall k > 0 : \| f\_{|G \setminus K^k} \|\_{\infty} \le C\_n k^{-n} \}; \\ \mathcal{S}(G) &= \{ f \in L^\infty(G) \mid f \in A(G), \hat{f} \in A(\hat{G}) \}. \end{aligned}$$

For *G* = ℝ this recovers the usual test functions 𝒮(ℝ) (cf. Definition 5.64), where the condition *f* ∈ *A*(ℝ) gives rapid decrease whereas *f̂* ∈ *A*(ℝ) gives smoothness. Pontryagin duality then yields an isomorphism 𝒮(*G*) ≅ 𝒮(*Ĝ*) (Osborne, 1975).

The author originally learnt the SNAG-Theorem from Barut & Rączka (1977), whose proof (due to K. Maurin) is quite different; the argument given above was inspired by the treatment of projection-valued spectral measures in Conway (2007, Ch. 9, §1), who calls them *spectral measures*. Conway also proves our Theorem C.113 as his Theorem 1.14, albeit for the case where *X* is compact; passage to the locally compact case may be done through unitization, as in §C.6. The need for π to be non-degenerate may then be traced back to (our) Lemma C.43.

#### §C.16. Intermezzo: Lie groupoids

For introductions to Lie groupoids see Moerdijk & Mrčun (2003) or Mackenzie (2005), who also described the link with symplectic geometry. For their use in noncommutative geometry and mathematical physics cf. Connes (1994) and Landsman (1998a, 2006b), respectively. The tangent groupoid was invented by Connes, with further contributions by Hilsum & Skandalis (1987), Weinstein (1989) and Landsman (1998a). See also Connes (1994), Landsman (2003), Higson (2010), and van Erp (2010) for applications of the tangent groupoid to index theory.

#### §C.17. C\*-algebras associated to Lie groupoids

C\*-algebras associated to locally compact groupoids (with Haar system) were first studied in detail by Renault (1980). Originally in the setting of foliation theory, the Lie (i.e. smooth) case was pioneered by Connes (1994), who noted in particular that Lie groupoids carry an intrinsic Haar system, and gave many interesting examples. The uniqueness of *C*<sup>∗</sup>(*G*) for Lie groupoids *G*, i.e., the independence of the underlying left Haar system (up to isomorphism), is proved in Paterson (1999).

#### §C.18. Group C\*-algebras and crossed product algebras

The *locus classicus* is Pedersen (1979), but Williams (2007) may even be better.

#### §C.19. Continuous bundles of C\*-algebras

The bundles studied in this section were originally introduced by Fell (1961) and their theory was further developed by Dixmier & Douady (1963); see also Dixmier (1977), Fell & Doran (1988), and, for a modern treatment, Raeburn & Williams (1998). Lemma C.125 was part of Dixmier's definition of a continuous field of C\*-algebras, before it was recast into the rather more appealing Definition C.121 by Kirchberg & Wassermann (1995) and Blanchard (1996). Theorem C.123 is due to Landsman & Ramazan (2001); see also Landsman (1998a) for a detailed discussion. Aastrup, Nest, & Schrohe (2006) discuss applications to manifolds with boundary.

#### §C.20. von Neumann algebras and the σ-weak topology

There are many other topologies on von Neumann algebras, see e.g. Takesaki (2002), Chapter II. In any case, we only scratch the surface of the subject.

#### §C.21. Projections in von Neumann algebras

The first part of the proof of Theorem C.141 is taken from Rédei (1998), Prop. 4.16. The remainder is adapted from Heunen, Landsman, & Spitters (2012). The details of the proof of Theorem C.140 may be found in Takesaki (2002), Thm. III.1.18; see also Dixmier (1981), Ch. 7 and Lurie (2011), lectures 13–17.

#### §C.23. Classification of hyperfinite factors

This material, which is a high point in modern mathematics, is explained in great detail in Takesaki (2003ab). See also Wright (1989) for the uniqueness of the hyperfinite III<sub>1</sub> factor. In his review MR1030046 (91a:46059) of the latter book for *Mathematical Reviews* in 1991, E. Størmer wrote:

'At the time of writing this review, by far the deepest and most difficult proof in von Neumann algebra theory is the one of Connes and Haagerup on the uniqueness of the injective factor of type III<sub>1</sub> with separable predual.'

The applications of C\*-algebras and von Neumann algebras to quantum field theory are reviewed in Haag (1992), where the identification of the unique hyperfinite III<sub>1</sub> factor with local algebras of observables may be found in §V.6. This book also explains the relationship between Tomita–Takesaki theory and quantum statistical mechanics, as do Bratteli & Robinson (1981). It should be mentioned that the Tomita–Takesaki theory, including the modular group (i.e. of time translations), has a classical analogue in Poisson geometry (Weinstein, 1997), which somewhat softens the spectacular claim by Connes & Rovelli (1994) that time has a quantum-mechanical (or non-commutative) origin related to thermodynamics.

#### §C.24. Other special classes of C\*-algebras

The classic reference on AW\*-algebras and Rickart C\*-algebras is Berberian (1972). For monotone complete C\*-algebras see the monograph by Saitô & Maitland Wright (2015b). Real rank zero was introduced by Brown & Pedersen (1991), who also proved that the definition of real rank zero in the main text may be replaced by an equivalent property that is often taken as the definition:

Proposition C.180. *Let A be a unital C\*-algebra. Then* rr(*A*) = 0 *iff the set of selfadjoint elements with finite spectrum is dense in A*sa*.*

See Davidson (1996), Theorem V.7.3, for a streamlined proof.

Scattered C\*-algebras were independently introduced by Jensen (1977) and Huruya (1978). The results in the main text are due to Kusuda (2011).

Theorem C.167.1 should be obvious. No. 2 is due to Kusuda (2011), no. 3 may be found in Takesaki (2002), §III.1, no. 4 is (a restatement of) Theorem 2.3.7 in Saitô & Maitland Wright (2015b), no. 5 is Theorem 1.7.1 in Berberian (1972), no. 6 is Theorem 1.8.1 in the same reference, no. 7 is from Saitô & Maitland Wright (2015a), and finally no. 8 may be found in Blackadar (1994), §6.1.3.

Theorem C.169.1 is Exercise 4.6.12 in Kadison & Ringrose (1983); it should be hidden from students that the AMS published two volumes with the answers to all their exercises! No. 2 is in Kusuda (2011), no. 3 is in Pedersen (1972), no. 4 is (a restatement of) Theorem 8.2.5 in Saitô & Maitland Wright (2015b), and no. 5 easily follows from Corollary 2.7 in Saitô & Maitland Wright (2015a). See also Lindenhovius (2016), where results of this kind are used to study the invariant 𝒞(*A*).

#### §C.25. Jordan algebras and (pure) state spaces of C\*-algebras

Theorem C.172 is Corollary 4.20 in Alfsen & Shultz (2001), based on Kadison (1951). See also Roberts & Roepstorff (1969). Theorem C.174 is due to Hamhalter (2015); the second step in the proof had been given earlier by Heunen & Reyes (2014). A complete proof of Lemma C.173 may be found in Bratteli & Robinson (1997), Theorem 3.2.3. In particular, Kadison's inequality is Proposition 3.2.4 in the same book. Theorem C.175 is the culmination of a long chain of argument, starting with Jacobson & Rickart (1950) and ending with Thomsen (1982). See also Bratteli & Robinson (1987), Theorem 3.2.3.

The formula (C.629) was proposed by Mielnik (1968, 1969). Otherwise, case 1 of Proposition C.177 is due to Roberts & Roepstorff (1969), who state case 2 without proof, referring to Glimm & Kadison (1960). Theorem C.179 is due to Shultz (1982). A completely different proof of the last claim, based on a reconstruction of *A* from *P*(*A*), appears in Landsman (1998a), §I.3. Both authors add further structure to *P*(*A*) to make it an invariant for *A* as a C\*-algebra, viz. an orientation and a Poisson structure, respectively. The notion of an orientation was originally introduced by Alfsen & Shultz in order to make *S*(*A*) a complete invariant for *A*; see their final work Alfsen & Shultz (2001, 2003).

## Appendix D Lattices and logic

In this appendix we collect some basic material from the theory of lattices, including Stone's representation theorem for Boolean lattices and the connection between Boolean (Heyting) lattices and classical (intuitionistic) propositional logic. In preparation for Appendix E, we also provide an introduction to first-order logic.

#### D.1 Order theory and lattices

One hopes that the reader has seen some of the following concepts before!

Definition D.1. *1. A* preorder *on a set X is a subset R* ⊂ *X* × *X (i.e., a* relation *on X), where we write x* ≤ *y or y* ≥ *x iff* (*x*, *y*) ∈ *R, such that x* ≤ *x, and x* ≤ *y and y* ≤ *z imply x* ≤ *z. A preorder is a* partial order *if in addition x* ≤ *y and y* ≤ *x imply x* = *y. A set with a partial order is called a* poset *(for* p*artially* o*rdered* set*). A poset (or preorder) is* directed *if every pair* {*x*, *y*} *has an upper bound, i.e., some z for which x* ≤ *z and y* ≤ *z. A poset may have a largest element (also called a* top element*) denoted by* 1 *or* ⊤ *that satisfies x* ≤ ⊤ *for each x* ∈ *X, and/or a smallest element (also called a* bottom element*)* 0 *or* ⊥ *that satisfies* ⊥ ≤ *x for each x* ∈ *X. For x*, *z* ∈ *X, the* order interval [*x*, *z*] *is defined by*

$$[x, z] = \{ y \mid x \le y \le z \}.\tag{D.1}$$

*An* atom *in a poset with* 0 *is an element x* ≠ 0 *for which* [0, *x*] = {0, *x*}*.* In other words, *x* is an atom if *x* ≠ 0, and 0 ≤ *y* ≤ *x* implies *y* = 0 or *y* = *x*. Thus *x* is an atom iff *x covers* 0, where we say that *x* covers *y* if *x* ≠ *y* and [*y*, *x*] = {*y*, *x*}.

*A* homomorphism *between posets is a map that preserves* ≤*. As usual, an* isomorphism *is an invertible (i.e. bijective) homomorphism, such that the inverse also preserves the given structure (which, in this case, is* ≤*).*

Thus a bijection ϕ : *X* → *Y* between posets *X* and *Y* is an isomorphism when ϕ(*x*) ≤ ϕ(*y*) iff *x* ≤ *y*. In some cases, the inverse of a bijective homomorphism automatically preserves the relevant structure.
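The notions in part 1 of Definition D.1 can be checked mechanically on finite examples. The following is a minimal sketch, assuming a finite set and an order given as a Python predicate; the names `is_preorder`, `is_partial_order`, `interval`, and `atoms`, as well as the two example orders, are hypothetical choices made for illustration only.

```python
from itertools import product

def is_preorder(X, leq):
    """Reflexivity and transitivity of the relation leq on the finite set X."""
    reflexive = all(leq(x, x) for x in X)
    transitive = all(leq(x, z) for x, y, z in product(X, repeat=3)
                     if leq(x, y) and leq(y, z))
    return reflexive and transitive

def is_partial_order(X, leq):
    """A preorder that is in addition antisymmetric."""
    antisymmetric = all(x == y for x, y in product(X, repeat=2)
                        if leq(x, y) and leq(y, x))
    return is_preorder(X, leq) and antisymmetric

def interval(X, leq, x, z):
    """The order interval [x, z] of eq. (D.1)."""
    return {y for y in X if leq(x, y) and leq(y, z)}

def atoms(X, leq, bottom):
    """Elements x != 0 with [0, x] = {0, x}, i.e. elements covering the bottom."""
    return {x for x in X if x != bottom
            and interval(X, leq, bottom, x) == {bottom, x}}

# Example 1: divisors of 12 ordered by divisibility (a poset with bottom 1).
DIVISORS = [1, 2, 3, 4, 6, 12]
divides = lambda a, b: b % a == 0

# Example 2: a preorder that is not a partial order (compare absolute values).
INTS = [-2, -1, 1, 2]
abs_leq = lambda a, b: abs(a) <= abs(b)

print(is_partial_order(DIVISORS, divides))   # divisibility is a partial order
print(atoms(DIVISORS, divides, 1))           # the primes dividing 12
print(is_preorder(INTS, abs_leq), is_partial_order(INTS, abs_leq))
```

In the second example antisymmetry fails because −1 ≤ 1 and 1 ≤ −1 both hold although −1 ≠ 1.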

*2. A* lattice *is a poset X in which each pair of elements x*, *y* ∈ *X has:*

• *an element x*∨*y, called the* supremum *(*sup*) of x and y, such that*

$$x \le x \lor y;\tag{D.2}$$

$$y \le x \lor y,\tag{D.3}$$

*and if x* ≤ *z and y* ≤ *z for some z, then x*∨*y* ≤ *z;*

• *an element x*∧*y, called the* infimum *(*inf*) of x and y, such that*

$$x \ge x \land y;\tag{D.4}$$

$$y \ge x \land y,\tag{D.5}$$

*and if x* ≥ *z and y* ≥ *z for some z, then x*∧*y* ≥ *z.*

*Suprema and infima are unique (if they exist).* Equivalently, a lattice may be defined algebraically (rather than order-theoretically) as a set equipped with two idempotent, commutative, and associative binary operations ∨,∧ that satisfy

$$x \vee (y \wedge x) = x;\tag{D.6}$$

$$x \wedge (y \vee x) = x.\tag{D.7}$$

The corresponding partial ordering is then defined by *x* ≤ *y* if *x*∧*y* = *x*.
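As a concrete check of the algebraic description, here is a small sketch, assuming the divisors of 12 with greatest common divisor as ∧ and least common multiple as ∨ (an example chosen for illustration, not taken from the text). It verifies idempotency, commutativity, associativity, the absorption laws (D.6) and (D.7), and that the order recovered from *x*∧*y* = *x* is divisibility.

```python
from itertools import product
from math import gcd

X = [1, 2, 3, 4, 6, 12]            # divisors of 12

def meet(x, y):
    """Infimum in the divisor lattice: greatest common divisor."""
    return gcd(x, y)

def join(x, y):
    """Supremum in the divisor lattice: least common multiple."""
    return x * y // gcd(x, y)

def lattice_axioms_hold():
    """Idempotency, commutativity, associativity for both operations."""
    for op in (meet, join):
        if not all(op(x, x) == x for x in X):
            return False
        if not all(op(x, y) == op(y, x) for x, y in product(X, repeat=2)):
            return False
        if not all(op(x, op(y, z)) == op(op(x, y), z)
                   for x, y, z in product(X, repeat=3)):
            return False
    return True

def absorption_holds():
    """Eqs. (D.6) and (D.7)."""
    return all(join(x, meet(y, x)) == x and meet(x, join(y, x)) == x
               for x, y in product(X, repeat=2))

def order_is_divisibility():
    """x <= y defined by x ∧ y = x coincides with 'x divides y'."""
    return all((meet(x, y) == x) == (y % x == 0)
               for x, y in product(X, repeat=2))

print(lattice_axioms_hold(), absorption_holds(), order_is_divisibility())
```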


*5. A lattice is called* distributive *if the following (equivalent) properties hold:*

$$x \lor (y \land z) = (x \lor y) \land (x \lor z);\tag{D.8}$$

$$x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z).\tag{D.9}$$

*6. A* frame *is a complete lattice X which is "infinitely distributive" in that*

$$x \wedge \bigvee S = \bigvee \{ x \wedge y, y \in S \},\tag{D.10}$$

*for arbitrary subsets S* ⊂ *X.* A frame is clearly distributive. Frame homomorphisms by definition preserve finite infima and arbitrary suprema.

*7. A* Heyting algebra *is a lattice X with top* ⊤ *and bottom* ⊥*, equipped with a map* → : *X* × *X* → *X, called* (material) implication*, that satisfies*

$$x \le (y \to z) \text{ iff } (x \wedge y) \le z.\tag{D.11}$$

A Heyting algebra is automatically distributive. Negation is *defined* by

$$\neg x \equiv (x \to \bot).\tag{D.12}$$

A Heyting algebra is *complete* when it is complete as a lattice, in that arbitrary suprema (and hence also infima) exist. *In that case,* (D.10) *is satisfied, so that a complete Heyting algebra is a frame. Conversely, a frame becomes a complete Heyting algebra if we define the implication arrow by*

$$y \to z = \bigvee \{ x \in X \mid x \wedge y \le z \}.\tag{D.13}$$

However, frames and complete Heyting algebras drift apart as soon as morphisms are concerned, for although in both cases one requires maps to preserve the partial order, maps between Heyting algebras must preserve → rather than ⋁.
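To see eq. (D.13) at work, the sketch below takes the open sets of a small topology on {0, 1, 2} as a finite frame (the topology is a hypothetical example, as are the names `OPENS`, `implies`, and `adjunction_holds`), defines the implication arrow by (D.13), and verifies the defining property (D.11) by brute force.

```python
from itertools import product

# Open sets of an assumed topology on {0, 1, 2}; closed under unions and
# intersections, hence a finite frame ordered by inclusion.
OPENS = [frozenset(s) for s in ([], [0], [1], [0, 1], [0, 1, 2])]

def implies(y, z):
    """Eq. (D.13): y -> z is the union of all opens x with x ∧ y <= z."""
    result = frozenset()
    for x in OPENS:
        if x & y <= z:
            result |= x
    return result

def adjunction_holds():
    """Eq. (D.11): x <= (y -> z) iff x ∧ y <= z, for all triples of opens."""
    return all((x <= implies(y, z)) == (x & y <= z)
               for x, y, z in product(OPENS, repeat=3))

def neg(x):
    """Eq. (D.12): negation is implication into the bottom element."""
    return implies(x, frozenset())

print(adjunction_holds())
print(sorted(neg(frozenset([0]))))       # the largest open disjoint from {0}
print(sorted(neg(frozenset([0, 1]))))    # no nonempty open avoids {0, 1}
```

Since the union of opens is open, `implies` always returns an element of `OPENS`, so the arrow is indeed a map *X* × *X* → *X* on this frame.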

*8. An* orthocomplementation *on a lattice (poset) X with* 0 *and* 1 *is a map*

$$\perp : X \to X, \quad x \mapsto x^{\perp},\tag{D.14}$$

*that satisfies:*

$$x^{\perp \perp} = x;\tag{\text{D.15}}$$

$$\mathbf{x} \le \mathbf{y} \text{ iff } \mathbf{y}^\perp \le \mathbf{x}^\perp;\tag{\mathbf{D.16}}$$

$$\mathbf{x} \wedge \mathbf{x}^{\perp} = \mathbf{0} \text{ (}\mathbf{x} \wedge \mathbf{x}^{\perp} \text{ exists and equals } \mathbf{0}\text{);}\tag{\mathbf{D}.17}$$

$$x \lor x^{\perp} = 1 \text{ (}x \lor x^{\perp} \text{ exists and equals } 1\text{)}.\tag{D.18}$$

*A lattice (poset) with an orthocomplementation is called* orthocomplemented*. A* homomorphism *of orthocomplemented lattices (posets) is a lattice (order) morphism that also preserves the orthocomplementation, as well as 0 or 1.*

*9. A lattice is called* modular *if x* ≤ *z implies x*∨(*y*∧*z*)=(*x*∨*y*)∧*z for each y (i.e., if distributivity holds merely if x* ≤ *z).*

Hence modularity is a weakening of the following property:


$$\mathbf{x} \vee (\mathbf{x}^\perp \wedge \mathbf{z}) = \mathbf{z}.\tag{\text{D.19}}$$

*That is, the modularity axiom holds for y* = *x*<sup>⊥</sup> (note that *x*∨(*x*<sup>⊥</sup> ∧*z*) = *z* exists because *x* ≤ *x*∨*z*⊥). For lattices this axiom is equivalent to each of:


Every Boolean algebra is a Heyting algebra, but not *vice versa*; a Heyting algebra is Boolean iff one and hence both of the following equivalent conditions hold:

$$\neg \neg x = x \qquad\qquad\qquad\qquad\quad (x \in X);\tag{D.20}$$

$$(\neg x) \lor x = \top \qquad\qquad\qquad\qquad (x \in X),\tag{D.21}$$

which state the *law of the excluded middle* (famously denied by Brouwer).
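The difference between Boolean and Heyting algebras can be illustrated on two tiny lattices of sets. The sketch below assumes lattices given as families of finite sets closed under union and intersection, with ∅ as bottom, so that negation is the largest member disjoint from *x*; the names `neg` and `boolean_conditions` and both example lattices are hypothetical.

```python
from itertools import combinations

def neg(lattice, x):
    """Eqs. (D.12)-(D.13): ¬x is the union of all members y with y ∩ x = ∅."""
    out = frozenset()
    for y in lattice:
        if not (y & x):
            out |= y
    return out

def boolean_conditions(lattice):
    """Return the truth of (D.20) and of (D.21) on the given lattice of sets."""
    top = frozenset().union(*lattice)
    double_negation = all(neg(lattice, neg(lattice, x)) == x for x in lattice)
    excluded_middle = all(neg(lattice, x) | x == top for x in lattice)
    return double_negation, excluded_middle

# The power set of {0, 1}: a Boolean algebra.
POWERSET = [frozenset(c) for r in range(3) for c in combinations([0, 1], r)]

# The three-element chain ∅ ⊂ {0} ⊂ {0, 1}: a Heyting algebra that is not Boolean.
CHAIN = [frozenset(), frozenset([0]), frozenset([0, 1])]

print(boolean_conditions(POWERSET))   # both conditions hold
print(boolean_conditions(CHAIN))      # both conditions fail
```

On the chain, the middle element {0} has ¬{0} = ∅, so that ¬¬{0} is the top element and ¬{0} ∨ {0} = {0} is not; in line with the text, the two conditions hold or fail together.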

The following result will be used implicitly throughout the main text.

Proposition D.2. *An order isomorphism of a lattice preserves all suprema and infima that exist. Hence in a complete lattice all suprema and infima are preserved.*

An important source of orthocomplemented lattices is provided by (possibly infinite-dimensional) complex vector spaces *V* with inner product, cf. Definition A.1: the elements of *X* are the *orthoclosed* subspaces *L* ⊂ *V*, i.e., those subspaces for which *L*⊥⊥ = *L*, where *L*⊥⊥ = (*L*⊥)⊥, and orthocomplementation is defined by

$$L^\perp = \{ v \in V \mid \forall w \in L : \langle v, w \rangle = 0 \},\tag{D.22}$$

and the partial ordering is given by inclusion. This yields

$$L \wedge M = L \cap M; \tag{\text{D.23}}$$

$$L \vee M = (L + M)^{\perp \perp} = (L^{\perp} \cap M^{\perp})^{\perp},\tag{D.24}$$

where *L*+*M* is the linear span of *L* and *M*. We have the *Amemiya–Araki Theorem:*

Theorem D.3. *The lattice of orthoclosed subspaces of an inner product space V is orthomodular iff V is* complete *in the norm* (A.2) *associated to the inner product.*

A space *X* is called *totally disconnected* if it has no other connected subspaces than its points (so any larger subspace of *X* is the union of two proper clopen sets).

#### Definition D.4. *A* Stone space *is a totally disconnected compact Hausdorff space.*

Any finite set (with the discrete topology) is a Stone space. The best-known example of an infinite Stone space is the Cantor set {0,1}<sup>N</sup> with product topology, which in addition is metrizable and has no isolated points (these properties even characterize the Cantor set up to homeomorphism). *Stone's Representation Theorem* reads:

Theorem D.5. *A lattice L is Boolean iff it is isomorphic to the lattice* Clopen(*X*) *of all clopen subsets of some Stone space X (partially ordered by set-theoretic inclusion), where X is uniquely determined by L up to homeomorphism.*

Thus the lattice operations in Clopen(*X*) are simply given set-theoretically by

$$U \lor W = U \cup W;\tag{\text{D.25}}$$

$$U \wedge W = U \cap W,\tag{D.26}$$

with orthocomplementation given by set-theoretic complementation (the theorem is obviously predicated on the fact that such a lattice is Boolean). The space *X* is called the *Stone spectrum* of *L*, generically denoted by 𝒮(*L*). Just like Gelfand duality, Theorem D.5 extends to a categorical duality theorem in an obvious way.

The Stone spectrum 𝒮(*L*) of *L* has the following canonical realizations:

1. Consider the space Pt(*L*) = Hom(*L*, 2), where 2 = {0,1} is seen as a Boolean lattice ordered by 0 ≤ 1 (and 0 ≠ 1), with topology inherited from the product topology on 2<sup>*L*</sup>. That is, the basic opens in Pt(*L*) are the sets

$$U\_{x} = \{ \varphi \in \operatorname{Pt}(L) \mid \varphi(x) = 1 \},\tag{D.27}$$

where *x* ∈ *L*, and similarly with 1 replaced by 0. This is a Stone space, with isomorphism

$$L \stackrel{\cong}{\rightarrow} \text{Clopen}(\text{Pt}(L));\tag{\text{D.28}}$$

$$x \mapsto U\_{x}.\tag{D.29}$$

2. Generalizing the case of a power set (cf. Definition B.49), a *filter* in a (Boolean) lattice *L* is a nonempty subset *F* ⊂ *L* such that *x*, *y* ∈ *F* implies *x* ∧ *y* ∈ *F*, and *y* ≥ *x* ∈ *F* implies *y* ∈ *F* (whence 1 ∈ *F*). A filter *F* is *proper* if *F* ≠ *L*, which is the case iff 0 ∉ *F*. An *ultrafilter* is a filter that is maximal in the set of all proper filters, ordered by inclusion. Ultrafilters (i.e. maximal filters) in a Boolean lattice are the same as *prime filters*, which are filters for which *x*∨*y* ∈ *F* implies *x* ∈ *F* or *y* ∈ *F*. More generally, in a distributive lattice with 0 any maximal filter is prime, and the presence of an orthocomplementation also gives the converse inclusion. Moreover, a filter *F* in a Boolean lattice is maximal (and hence prime) iff for any *x* ∈ *L* either *x* ∈ *F* or *x*<sup>⊥</sup> ∈ *F* (but not both). For *x* ∈ *L*, let

$$U'\_{x} = \{ F \in \mathcal{U}(L) \mid x \in F \},\tag{D.30}$$

where 𝒰(*L*) is the set of all ultrafilters on *L*. One has *U*′<sub>*x*</sub> ∩ *U*′<sub>*y*</sub> = *U*′<sub>*x*∧*y*</sub>, as well as *U*′<sub>*x*</sub> ∪ *U*′<sub>*y*</sub> = *U*′<sub>*x*∨*y*</sub>, *U*′<sub>*x*</sub> ⊆ *U*′<sub>*y*</sub> if *x* ≤ *y*, and the subsets *U*′<sub>*x*</sub> ⊂ 𝒰(*L*) form the basis of a topology on 𝒰(*L*) whose open sets are sets *U*′ ⊆ 𝒰(*L*) with the property that for each *F* ∈ *U*′ there is *x* ∈ *L* with *F* ∈ *U*′<sub>*x*</sub> ⊆ *U*′. This topology makes 𝒰(*L*) a Stone space, whose basis of clopen sets is given by the *U*′<sub>*x*</sub>, *x* ∈ *L*, with isomorphism

$$L \stackrel{\cong}{\rightarrow} \text{Clopen}(\mathcal{U}(L));\tag{D.31}$$

$$x \mapsto U'\_{x}.\tag{D.32}$$

3. Instead of *filters*, one may consider the dual notion of *ideals*, obtained by reversing the order (and hence swapping ∧ and ∨). Thus an *ideal* in *L* is a subset *I* ⊆ *L* such that *x*, *y* ∈ *I* implies *x*∨*y* ∈ *I*, and *y* ≤ *x* ∈ *I* implies *y* ∈ *I* (whence 0 ∈ *I*). An ideal *I* is *proper* if *I* ≠ *L*, which is the case iff 1 ∉ *I*. A *maximal ideal* is an ideal that is maximal in the set of all proper ideals, ordered by inclusion. In a Boolean lattice, maximal ideals coincide with *prime ideals*, which are ideals *I* that do not contain 1, and where *x*∧*y* ∈ *I* implies *x* ∈ *I* or *y* ∈ *I*. In a distributive lattice with 0 any maximal ideal is prime. The (set-theoretic) complement of a maximal ideal is a maximal filter (i.e. an ultrafilter), so that an ideal *I* in a Boolean lattice is maximal (and hence prime) iff for any *x* ∈ *L* either *x* ∈ *I* or *x*<sup>⊥</sup> ∈ *I* (but not both). The space ℐ(*L*) of all maximal (i.e. prime) ideals in *L* is topologized by basic opens *U*″<sub>*x*</sub> = {*I* ∈ ℐ(*L*) | *x* ∉ *I*}, and so this time the desired isomorphism is

$$L \stackrel{\cong}{\to} \text{Clopen}(\mathcal{I}(L));\tag{D.33}$$

$$x \mapsto U''\_{x}.\tag{D.34}$$

4. Finally, the set Idl(*L*) of *all* ideals in a (Boolean) lattice *L* is a frame if it is partially ordered by inclusion (cf. §C.11). One may realize the points of the frame Idl(*L*) as its prime elements (cf. Lemma C.85), which are simply the prime (and hence maximal) ideals in *L* considered above. Hence Pt(Idl(*L*)) forms a model of the Stone spectrum *X* of *L*, too. The advantage of this realization is that it gives a direct description of the topology of *X* (seen as a frame), namely as

$$\mathcal{O}(X) \cong \text{Idl}(L). \tag{D.35}$$

The relationship between the first three approaches is that for any ϕ ∈ Pt(*L*), the set ϕ<sup>−1</sup>({1}) is a maximal filter in *L*, whose complement ϕ<sup>−1</sup>({0}) is a maximal ideal. This can be shown to give homeomorphisms Pt(*L*) ≅ 𝒰(*L*) ≅ ℐ(*L*), under which the opens *U*<sub>*x*</sub>, *U*′<sub>*x*</sub>, and *U*″<sub>*x*</sub> are mapped to each other. The (contravariant) functorial nature of the Stone spectrum comes out particularly clearly in the first description: given a homomorphism *h* : *L* → *L*′, we immediately obtain a map *h*<sup>∗</sup> : Pt(*L*′) → Pt(*L*) by pullback (i.e., *h*<sup>∗</sup>ϕ = ϕ ◦ *h*). In this description the isomorphism *X* → Pt(Clopen(*X*)) is given by *x* ↦ ϕ<sub>*x*</sub>, where ϕ<sub>*x*</sub>(*U*) = 1<sub>*U*</sub>(*x*), with *U* ∈ Clopen(*X*). In the second description, the isomorphism *X* → 𝒰(Clopen(*X*)) is given by *x* ↦ {*U* ∈ Clopen(*X*) | *x* ∈ *U*}, which also gives the isomorphism *X* → ℐ(Clopen(*X*)) of the third description as *x* ↦ {*U* ∈ Clopen(*X*) | *x* ∉ *U*}.
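For a finite Boolean lattice the ultrafilter realization can be computed by brute force. The following sketch assumes *L* is the power set of a three-point set (so that the Stone spectrum is discrete and every subset of it is clopen); the names `powerset`, `is_proper_filter`, and `U_prime` are hypothetical. It enumerates all proper filters, selects the maximal ones, and checks that *x* ↦ *U*′<sub>*x*</sub> of (D.30) is a lattice isomorphism onto the clopen sets.

```python
from itertools import combinations

def powerset(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

POINTS = frozenset("abc")
L = powerset(POINTS)                 # Boolean lattice, ordered by inclusion

def is_proper_filter(F):
    """Nonempty, closed under meets, upward closed, and not containing 0."""
    if not F or frozenset() in F:
        return False
    closed_under_meets = all((x & y) in F for x in F for y in F)
    upward_closed = all(y in F for x in F for y in L if x <= y)
    return closed_under_meets and upward_closed

filters = [F for F in powerset(L) if is_proper_filter(F)]
ultrafilters = [F for F in filters if not any(F < G for G in filters)]

def U_prime(x):
    """Eq. (D.30): the set of ultrafilters containing x."""
    return frozenset(F for F in ultrafilters if x in F)

# Each ultrafilter contains exactly one of x and its complement.
prime = all((x in F) != ((POINTS - x) in F) for F in ultrafilters for x in L)

# x -> U'_x is injective, preserves meets and joins, and hits every subset
# of the (discrete) spectrum.
image = {U_prime(x) for x in L}
print(len(ultrafilters), prime, image == set(powerset(ultrafilters)))
```

In this finite case every ultrafilter is principal, generated by a single point, which is the finite shadow of the homeomorphism *X* ≅ 𝒰(Clopen(*X*)) described above.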

Eq. (D.35) follows from Theorem D.5, which implies an isomorphism of frames

$$\mathcal{O}(X) \stackrel{\cong}{\to} \text{Idl}(\text{Clopen}(X));\tag{\text{D.36}}$$

$$U \mapsto \{ V \in \text{Clopen}(X) \mid V \subseteq U \},\tag{D.37}$$

with inverse *I* ↦ ⋃<sub>*U*∈*I*</sub> *U*. However, by itself, eq. (D.35) may also be taken as a constructive version of Stone's Representation Theorem; the next, non-constructive step (relying on Zorn's Lemma) then gives the points of *X* from Idl(*L*), cf. §C.11.

To close this brief introduction to lattice theory, we present a general construction of free distributive lattices, possibly with relations, which will be needed for the theory of the constructive Gelfand spectrum in §12.2. The main advantage of this construction is that it can be performed in any topos, as will indeed be done in §12.4.

Definition D.6. *The* free distributive lattice ℒ<sub>*S*</sub> *on a set S is the set of* irredundant *finite subsets* {*A*<sub>1</sub>,...,*A*<sub>*n*</sub>} *of the finite power set* 𝒫<sub>*f*</sub> *of S, i.e., A*<sub>*i*</sub> ⊂ *S,* |*A*<sub>*i*</sub>| < ∞*, n* ∈ ℕ*, and no A*<sub>*i*</sub> *is a proper subset of any A*<sub>*j*</sub>*, with lattice operations inductively generated (using distributivity) from the following singleton cases:*

$$\{\{s\}\} \vee \{\{t\}\} = \{\{s\}, \{t\}\};\tag{D.38}$$

$$\{\{s\}\} \wedge \{\{t\}\} = \{\{s, t\}\}.\tag{D.39}$$

For {*A*<sub>1</sub>,...,*A*<sub>*n*</sub>} ∈ ℒ<sub>*S*</sub> as above, and similarly {*B*<sub>1</sub>,...,*B*<sub>*m*</sub>} ∈ ℒ<sub>*S*</sub>, these rules imply

$$\{A\_1, \ldots, A\_n\} \vee \{B\_1, \ldots, B\_m\} = \{A\_1, \ldots, A\_n, B\_1, \ldots, B\_m\}\_{\text{ir}};\tag{D.40}$$

$$\{A\_1, \dots, A\_n\} \wedge \{B\_1, \dots, B\_m\} = \{A\_i \cup B\_j \mid i = 1, \dots, n, j = 1, \dots, m\}\_{\text{ir}},\tag{D.41}$$

where the subscript ir means that redundancies in the above sense have been removed by deleting any set on the list that properly contains some other set on the list. The motivation for this rule is that, using distributivity, any element *x* of a distributive lattice can be brought into the ("normal") form *x* = *x*<sub>1</sub> ∨ ··· ∨ *x*<sub>*n*</sub>, where each *x*<sub>*i*</sub> = *y*<sub>*i*</sub><sup>(1)</sup> ∧ ··· ∧ *y*<sub>*i*</sub><sup>(*m*<sub>*i*</sub>)</sup> is a finite meet. We then identify *A*<sub>*i*</sub> with {*y*<sub>*i*</sub><sup>(1)</sup>, ..., *y*<sub>*i*</sub><sup>(*m*<sub>*i*</sub>)</sup>}, so that *x*<sub>*i*</sub> = ⋀*A*<sub>*i*</sub>, and identify {*A*<sub>1</sub>,...,*A*<sub>*n*</sub>} with *x*<sub>1</sub> ∨ ··· ∨ *x*<sub>*n*</sub>. If we allow empty sets (as we do), then ℒ<sub>*S*</sub> has both a bottom element ⊥ = ∅ and a top element ⊤ = ⋀∅.
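The rules (D.40) and (D.41) translate directly into code. The sketch below is one possible encoding, assuming elements of the free distributive lattice are represented as frozensets of frozensets of generators; the names `irredundant`, `gen`, `join`, `meet`, and `evaluate` are hypothetical, and the target lattice used to illustrate the map *g* (divisors of 12 with lcm and gcd) is an assumption made for the example.

```python
from functools import reduce
from math import gcd

def irredundant(sets):
    """The subscript 'ir': delete every set properly containing another one."""
    sets = set(sets)
    return frozenset(A for A in sets if not any(B < A for B in sets))

def gen(s):
    """The generator {{s}} attached to s in S."""
    return frozenset([frozenset([s])])

def join(x, y):
    """Eq. (D.40): union of the two lists, made irredundant."""
    return irredundant(x | y)

def meet(x, y):
    """Eq. (D.41): all pairwise unions A_i ∪ B_j, made irredundant."""
    return irredundant(A | B for A in x for B in y)

BOTTOM = frozenset()                  # the empty join
TOP = frozenset([frozenset()])        # {∅}, i.e. the empty meet

def lcm(a, b):
    return a * b // gcd(a, b)

def evaluate(x, f):
    """Join over A in x of the meet of f(s), s in A, in the divisors of 12."""
    return reduce(lcm, (reduce(gcd, (f[s] for s in A), 12) for A in x), 1)

s, t, u = gen("s"), gen("t"), gen("u")
print(join(s, meet(s, t)) == s)                                # absorption
print(meet(s, join(t, u)) == join(meet(s, t), meet(s, u)))     # distributivity

f = {"s": 4, "t": 6, "u": 3}          # an arbitrary map into the divisors of 12
x = join(s, t)
print(evaluate(meet(x, u), f) == gcd(evaluate(x, f), evaluate(u, f)))
```

The function `evaluate` plays the role of the homomorphism induced by a map *f* on the generators: it reads a normal form as a join of meets in the target lattice, with the empty meet sent to the top 12 and the empty join to the bottom 1.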

Consequently, an equivalent description of ℒ<sub>*S*</sub> is to first define the set Σ of all formal expressions inductively defined by the rules: (i) *S* ⊂ Σ, ⊥ ∈ Σ, and ⊤ ∈ Σ; (ii) if *x* ∈ Σ and *y* ∈ Σ, then *x*∨*y* ∈ Σ and *x*∧*y* ∈ Σ. Secondly, we quotient Σ by the equivalence relation generated by all of the basic identities in a distributive lattice, i.e., the commutativity, associativity, idempotency, and distributivity laws for ∨ and ∧, the rules *x* ∨ ⊥ = *x* and *x* ∧ ⊤ = *x*, and the absorption law *x* ∨ (*x* ∧ *y*) = *x*. The lattice operations on the quotient are the ones inherited from concatenation on Σ.

As in most free constructions, the map *S* ↦ ℒ<sub>*S*</sub> is left adjoint to the forgetful functor from the category of distributive lattices into Sets. One has a canonical map *i* : *S* → ℒ<sub>*S*</sub>, given by *i*(*s*) = {{*s*}}, with the universal property that any function *f* : *S* → *L* from *S* to some distributive lattice *L* factors through ℒ<sub>*S*</sub>, i.e., there is a unique lattice homomorphism *g* : ℒ<sub>*S*</sub> → *L* such that *f* = *g* ◦ *i*. Indeed, *g* may be inductively generated from the special case *g*({{*s*}}) = *f*(*s*) using the rules (D.38) - (D.39).

One may enrich this construction by introducing a congruence ∼ on ℒ<sub>*S*</sub>, e.g., one generated by relations *x*<sub>*i*</sub> = *y*<sub>*i*</sub>, *i* ∈ *I*. In that case, the ensuing quotient ℒ<sub>*S*</sub>/∼ exists, and is universal for homomorphisms *f* : ℒ<sub>*S*</sub> → *M* of distributive lattices that satisfy *f*(*x*<sub>*i*</sub>) = *f*(*y*<sub>*i*</sub>), i.e., if *p* : ℒ<sub>*S*</sub> → ℒ<sub>*S*</sub>/∼ is the canonical projection, there is a unique homomorphism of distributive lattices *g* : (ℒ<sub>*S*</sub>/∼) → *M* such that *f* = *g* ◦ *p*.

#### D.2 Propositional logic

The topos-theoretical approach to quantum logic discussed in Chapter 12 uses an advanced version of an elementary construction in algebraic logic that relates classical propositional logic to Boolean algebras (or lattices), and similarly relates intuitionistic propositional logic to Heyting algebras. Though easy to state, these relationships are conceptually quite deep, based as they are on a separation between syntax and semantics that is decidedly "modern", reflecting a view on the nature of mathematics that would have been completely foreign to e.g. Newton and Euler or even Gauss, not to speak of Euclid and Archimedes, notwithstanding their use of the axiomatic-deductive method that has been a defining property of (real) mathematics since its birth in Plato's Academy. As expressed by Boole himself, this modern view is:

'They who are acquainted with the present state of the theory of Symbolic Algebra, are aware that the validity of the processes of analysis does not depend upon the interpretation of the symbols which are employed, but solely upon the laws of their combination.' (Boole, 1847, Preface)

The formalization of mathematics starts with *propositional logic*, whose notation consists of the following groups of symbols in terms of which a *theory* is defined:


As in arithmetic, there is some ambiguity to be dispelled. This may be done either by introducing *brackets* (, ), subject to obvious rules we omit, or by conventions to the effect that ¬ "binds" symbols more strongly than ∨ and ∧, which in turn "bind" more strongly than →. For example, ¬α ∨ δ → β ∧ γ is the same as ((¬α) ∨ δ) → (β ∧ γ).

In propositional logic (unlike in first-order logic), *well-formed formulae* and *propositions* coincide; typically denoted by Greek letters α,β,..., both are defined as expressions in the above symbols that (iteratively) arise in the following way:


Also here one may use brackets in the obvious way, e.g., if α is *p*<sub>1</sub> → *p*<sub>2</sub>, and β is *p*<sub>1</sub> ∧ *p*<sub>3</sub>, then (*p*<sub>1</sub> → *p*<sub>2</sub>) → (*p*<sub>1</sub> ∧ *p*<sub>3</sub>) is the same as α → β.

For example, one may check that the following expression is a valid proposition:

$$(p\_1 \to (p\_2 \to p\_3)) \to ((p\_1 \to p\_2) \to (p\_1 \to p\_3)).\tag{D.42}$$

A final informal symbol we use is ≡, as in α ≡ β, which has no logical meaning, but states that α is the same as β (e.g., for α ≡ (*p*<sub>1</sub> → (*p*<sub>2</sub> → *p*<sub>3</sub>)), consider ¬α).

The notion of a (propositional) *theory* will be picked up later, but we now interrupt the construction of the syntax of propositional logic and discuss its *semantics*. In its most elementary form, this means that there is a *valuation* on Σ, i.e.,

$$V: \Sigma \to \{0, 1\}, \tag{D.43}$$

also called a *truth function*, where 0 = false and 1 = true; one often writes α = 1 for *V*(α) = 1 (i.e., α is true), and α = 0 if α is false (this formally introduces a new symbol "=", which however is foreign to propositional logic). Let *B*<sub>Σ</sub> be the set of all propositions (i.e., well-formed formulae) on the given signature Σ. With abuse of notation (justified by the property Σ ⊂ *B*<sub>Σ</sub>), *V* uniquely extends to a function

$$V: B\_{\Sigma} \to \{0, 1\},\tag{D.44}$$

as follows. First, each *V*(*p*<sub>*i*</sub>) is fixed by the given function (D.43). Second, the value of *V* on compound expressions is (iteratively) determined through the use of *truth tables*, which formalize the everyday meaning of the symbols ¬, ∧, ∨, →:


The first table should be read as follows: if α is false, then ¬α is true, and if α is true, then ¬α is false. Similarly, the second table means that if α and β are both false, then so is α ∧ β, etc. For example, to see if γ ≡ *p*<sub>1</sub> ∧ (¬*p*<sub>2</sub>) is true or false given the valuation *p*<sub>1</sub> = *p*<sub>2</sub> = 0, we first look at the truth table for ¬ with α ≡ *p*<sub>2</sub>, inferring from the first row that ¬*p*<sub>2</sub> = 1 since *p*<sub>2</sub> = 0. We subsequently inspect the table for ∧ with α ≡ *p*<sub>1</sub> and β ≡ ¬*p*<sub>2</sub>. Since *p*<sub>1</sub> = *p*<sub>2</sub> = 0 is the same as *p*<sub>1</sub> = 0 and ¬*p*<sub>2</sub> = 1, we look at the second row, obtaining γ = 0. Another example, just involving the implication symbol →, is (D.42), given e.g. *p*<sub>1</sub> = 1, *p*<sub>2</sub> = 0, and *p*<sub>3</sub> = 1. This is settled through the following steps, each of which involves the table for →:

$$
\begin{aligned}
p\_2 \to p\_3 &= 1;\\
p\_1 \to (p\_2 \to p\_3) &= 1;\\
p\_1 \to p\_2 &= 0;\\
p\_1 \to p\_3 &= 1;\\
(p\_1 \to p\_2) \to (p\_1 \to p\_3) &= 1;
\end{aligned}
$$


$$(p\_1 \to (p\_2 \to p\_3)) \to ((p\_1 \to p\_2) \to (p\_1 \to p\_3)) = 1. \tag{D.45}$$

The proposition in (D.42) is actually rather special, in that *all* truth values for the atomic propositions (*p*<sub>1</sub>, *p*<sub>2</sub>, *p*<sub>3</sub>) it contains make it true (as is easily checked).
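The truth-table recipe above can be sketched in a few lines of Python. This is only an illustration: the tuple encoding of propositions and the names `evaluate`, `atoms`, and `is_tautology` are assumptions made for this sketch, not notation from the text. It evaluates (D.42) at the valuation used for (D.45) and then checks all eight valuations.

```python
from itertools import product

# Propositions are nested tuples (an illustrative encoding):
# ("var", "p1"), ("not", a), ("and", a, b), ("or", a, b), ("imp", a, b).

def var(name):
    return ("var", name)

def imp(a, b):
    return ("imp", a, b)

def evaluate(formula, valuation):
    """Extend a valuation on atoms to compound propositions via the truth tables."""
    tag = formula[0]
    if tag == "var":
        return valuation[formula[1]]
    if tag == "not":
        return 1 - evaluate(formula[1], valuation)
    a = evaluate(formula[1], valuation)
    b = evaluate(formula[2], valuation)
    if tag == "and":
        return min(a, b)
    if tag == "or":
        return max(a, b)
    if tag == "imp":
        return max(1 - a, b)
    raise ValueError(f"unknown connective: {tag}")

def atoms(formula):
    """Names of the atomic propositions occurring in a formula."""
    if formula[0] == "var":
        return {formula[1]}
    return set().union(*(atoms(sub) for sub in formula[1:]))

def is_tautology(formula):
    """True iff the formula evaluates to 1 under every valuation of its atoms."""
    names = sorted(atoms(formula))
    return all(
        evaluate(formula, dict(zip(names, values))) == 1
        for values in product((0, 1), repeat=len(names))
    )

p1, p2, p3 = var("p1"), var("p2"), var("p3")
d42 = imp(imp(p1, imp(p2, p3)), imp(imp(p1, p2), imp(p1, p3)))

print(evaluate(d42, {"p1": 1, "p2": 0, "p3": 1}))  # the valuation of (D.45): 1
print(is_tautology(d42))                           # True
print(is_tautology(imp(p1, p2)))                   # False
```

Brute-force enumeration is exponential in the number of atoms, which is harmless for examples of this size.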

Definition D.7. *A proposition* ϕ *that is true whatever the (un)truth of the atomic propositions it contains, is called a* tautology*, denoted by* ⊨ ϕ*.*

For example, α → α is a tautology for any proposition α; this follows from the truth table for → by replacing β by α, in which case only the first and the fourth rows are consistent (both yielding 1). Introducing a new logical symbol ↔ by stipulating that α ↔ β is the same as (α → β) ∧ (β → α), one easily proves:

Theorem D.8. *The proposition* α ↔ β *is a tautology iff* α *and* β *are either both true or both false for each joint truth value of the atomic propositions they contain.*

Here α and β need not contain the same atomic propositions, but if they do, this theorem says that α ↔ β is a tautology iff α and β have the same truth table.

Here and in what follows, one should distinguish theorems *about* logic from theorems *within* logic. The former are themselves derived from logical rules that can be formalized, as first done by Hilbert and his school in "meta-mathematics". The latter is what we now turn to, motivated by the above semantic intermezzo. The syntax of any logical system, such as propositional logic, is completed by stating axioms and deduction rules that enable one to prove *theorems*. In the case of propositional logic, these are propositions (i.e., expressions correctly formed from rules i) and ii) above) that can be derived from the axioms and deduction rules in a finite number of steps, starting with (some of) the axioms and applying (some of) the deduction rules to the previous step of the proof. The axioms are considered to be theorems, too. Theorems are often denoted by ϕ, and to show that a proposition ϕ is indeed a theorem we write ⊢ ϕ. Thus the question whether ⊢ ϕ holds is purely syntactic, and hence is independent of the truth-value of the atomic propositions *p*<sub>*i*</sub> in ϕ.

This is a baby version of the fundamental idea of Boole mentioned above, that the possible meaning of mathematical symbols should not affect the validity of mathematical reasoning about them. Nonetheless, there is a consistency requirement (on the axioms and deduction rules) that one should not be able to derive ϕ if ϕ is semantically false under some truth assignment to the atomic propositions it contains. In other words, *a theorem must be true for any truth assignment to the pertinent atomic propositions*, or, then again, *a theorem within propositional logic must be a tautology*, symbolically: ⊢ ϕ implies ⊨ ϕ (meta-mathematically). This is the *soundness* condition on any logical system. Conversely, one would like to prove as many true propositions as possible. Optimally, this is expressed by the *completeness* condition that ⊨ ϕ imply ⊢ ϕ. If both hold, i.e., if a system is sound as well as complete, one has ⊢ ϕ iff ⊨ ϕ: in (other) words, *a proposition is a theorem iff it is a tautology*.

Achieving this should be the goal of our axioms and deduction rules. This can indeed be done in propositional logic (and also in first-order logic, on a suitable interpretation of ⊨, see §D.4). Even this requirement does not fix the axioms and the deduction rules, although it clearly makes any two such systems equivalent, in the sense that each leads to the same theorems (namely the tautologies). In particular, one can switch between axioms and deduction rules (matters like this were first systematically sorted out by Hilbert and his school, notably Bernays and Ackermann, partly motivated by the *Principia Mathematica* of Russell and Whitehead).

One particularly convenient choice has just a single *deduction rule*, namely:

• *Modus ponens*: if ⊢ α and ⊢ α → β, then ⊢ β.

Even so, the *axioms* of propositional logic may be stated in many different ways. Although it is even possible to use a single logical symbol (namely the *Sheffer stroke* |, called NAND in computer science, where α|β means ¬(α ∧β)), we proceed less radically and initially use two symbols. To this end, it is easy to show that

$$
\alpha \land \beta \leftrightarrow \neg(\alpha \to \neg \beta) \tag{D.46}
$$

$$
\alpha \lor \beta \leftrightarrow \neg \alpha \to \beta \tag{D.47}
$$

are tautologies, so that in principle the symbols ∨ and ∧ are superfluous, in that α ∧ β may be regarded as an abbreviation of ¬(α → ¬β), and likewise, α ∨ β stands for ¬α → β. A possible choice for the axioms that regulate ¬ and → is:

$$
\vdash \beta \to (\alpha \to \beta);
\tag{D.48}
$$

$$\vdash (\beta \to (\gamma \to \delta)) \to ((\beta \to \gamma) \to (\beta \to \delta));\tag{D.49}$$

$$\vdash (\neg \alpha \to \neg \beta) \to ((\neg \alpha \to \beta) \to \alpha). \tag{D.50}$$
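To see how axiom schemata and *modus ponens* interact, the following sketch checks a five-step formal proof of α → α. It assumes the three schemata have the shapes β → (α → β), (β → (γ → δ)) → ((β → γ) → (β → δ)), and (¬α → ¬β) → ((¬α → β) → α); the encoding of formulas as tuples and the names `match`, `is_axiom`, and `check_proof` are illustrative choices, not part of the text.

```python
# Atoms are strings; compound formulas are ("imp", a, b) or ("not", a).
def imp(a, b):
    return ("imp", a, b)

def neg(a):
    return ("not", a)

# In a schema, every string is a schematic letter standing for any formula.
AXIOM_SCHEMATA = [
    imp("b", imp("a", "b")),
    imp(imp("b", imp("c", "d")), imp(imp("b", "c"), imp("b", "d"))),
    imp(imp(neg("a"), neg("b")), imp(imp(neg("a"), "b"), "a")),
]

def match(pattern, formula, binding):
    """Try to instantiate the schematic letters of `pattern` so it equals `formula`."""
    if isinstance(pattern, str):
        if pattern in binding:
            return binding[pattern] == formula
        binding[pattern] = formula
        return True
    return (
        isinstance(formula, tuple)
        and formula[0] == pattern[0]
        and len(formula) == len(pattern)
        and all(match(p, f, binding) for p, f in zip(pattern[1:], formula[1:]))
    )

def is_axiom(formula):
    return any(match(schema, formula, {}) for schema in AXIOM_SCHEMATA)

def check_proof(steps):
    """Each step is (formula, "axiom") or (formula, ("mp", i, j)),
    where step i is some alpha and step j is alpha -> formula."""
    derived = []
    for formula, reason in steps:
        if reason == "axiom":
            ok = is_axiom(formula)
        else:
            _, i, j = reason
            ok = (
                i < len(derived)
                and j < len(derived)
                and derived[j] == imp(derived[i], formula)
            )
        if not ok:
            return False
        derived.append(formula)
    return True

a = "alpha"
aa = imp(a, a)
steps = [
    (imp(imp(a, imp(aa, a)), imp(imp(a, aa), aa)), "axiom"),  # second schema
    (imp(a, imp(aa, a)), "axiom"),                            # first schema
    (imp(imp(a, aa), aa), ("mp", 1, 0)),
    (imp(a, aa), "axiom"),                                    # first schema
    (aa, ("mp", 3, 2)),
]
print(check_proof(steps))  # True
```

The checker is purely syntactic: it never consults a truth table, in line with the distinction between derivability and truth drawn in the text.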

The third axiom settles the use of ¬ and, jointly with *modus ponens*, justifies *proof by contradiction* or *reductio ad absurdum*: suppose one has established

$$
\vdash \neg \alpha \to \beta;\tag{D.51}
$$

$$
\vdash \neg \alpha \to \neg \beta,\tag{D.52}
$$

then (D.50), (D.52), and *modus ponens* yield ⊢ (¬α → β) → α. Then (D.51) and *modus ponens* yield ⊢ α. Furthermore, as another proof technique (i.e. a theorem *about* propositional calculus) one can prove the *deduction theorem*:

Theorem D.9. *If* α *and* (γ<sub>1</sub>,..., γ<sub>*n*</sub>) *imply* β*, then* (γ<sub>1</sub>,..., γ<sub>*n*</sub>) *imply* α → β*.*

Introducing an external implication symbol ⇒, such statements are often written:

$$(\alpha, \gamma\_1, \ldots, \gamma\_n) \vdash \beta \Rightarrow (\gamma\_1, \ldots, \gamma\_n) \vdash \alpha \rightarrow \beta. \tag{D.53}$$

Writing the external "and" as a comma, one can similarly prove the rules

$$
\beta \to \gamma, \gamma \to \delta \Rightarrow \beta \to \delta; \tag{D.54}
$$

$$
\beta \to (\gamma \to \delta), \gamma \Rightarrow \beta \to \delta. \tag{D.55}
$$

As already mentioned, the central result about propositional logic is:

Theorem D.10. *For any proposition* ϕ*, one has* ⊢ ϕ *iff* ⊨ ϕ*.*

*Proof.* We only prove the easy direction. Axioms are tautologies, and *modus ponens* preserves truth, in that ⊨ α and ⊨ α → β imply ⊨ β, as follows from the fourth row of the truth table for α → β. Hence each step in a proof preserves tautologies. □
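The two ingredients of this soundness argument can be checked mechanically. The sketch below assumes the axiom schemata read β → (α → β), (β → (γ → δ)) → ((β → γ) → (β → δ)), and (¬α → ¬β) → ((¬α → β) → α); since schematic letters stand for propositions that take values in {0, 1}, it suffices to run over all 0/1 assignments to the letters. The helper names are hypothetical.

```python
from itertools import product

# Truth tables as arithmetic on {0, 1}.
def neg(a):
    return 1 - a

def imp(a, b):
    return max(1 - a, b)

def conj(a, b):
    return min(a, b)

def disj(a, b):
    return max(a, b)

def iff(a, b):
    return conj(imp(a, b), imp(b, a))

def always_true(schema, arity):
    """True iff the schema evaluates to 1 for all truth values of its letters."""
    return all(schema(*values) == 1 for values in product((0, 1), repeat=arity))

schemata = {
    "D.46": (lambda a, b: iff(conj(a, b), neg(imp(a, neg(b)))), 2),
    "D.47": (lambda a, b: iff(disj(a, b), imp(neg(a), b)), 2),
    "D.48": (lambda a, b: imp(b, imp(a, b)), 2),
    "D.49": (lambda b, c, d: imp(imp(b, imp(c, d)), imp(imp(b, c), imp(b, d))), 3),
    "D.50": (lambda a, b: imp(imp(neg(a), neg(b)), imp(imp(neg(a), b), a)), 2),
}

for label, (schema, arity) in schemata.items():
    print(label, always_true(schema, arity))

# Modus ponens preserves truth: if a = 1 and (a -> b) = 1, then b = 1.
mp_preserves_truth = all(
    b == 1
    for a, b in product((0, 1), repeat=2)
    if a == 1 and imp(a, b) == 1
)
print(mp_preserves_truth)
```

This verifies only the easy direction (soundness); completeness is not something a finite check of this kind establishes.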

Nonetheless, the notions of *theorem* and *tautology* are quite different conceptually: the first is defined syntactically, whereas the latter is defined semantically.

At the other end of the spectrum, we mention an axiom system that involves all four logical connectives (whilst keeping *modus ponens* as the only deduction rule):

$$
\vdash (\beta \land \gamma) \to \beta;\tag{D.56}
$$

$$
\vdash (\beta \land \gamma) \to \gamma;\tag{D.57}
$$

$$
\vdash \beta \to (\gamma \to (\beta \land \gamma));
\tag{D.58}
$$

$$
\vdash \beta \to (\beta \lor \gamma);
\tag{D.59}
$$

$$
\vdash \gamma \to (\beta \lor \gamma);
\tag{D.60}
$$

$$\vdash (\beta \to \delta) \to ((\gamma \to \delta) \to ((\beta \lor \gamma) \to \delta));\tag{D.61}$$

$$
\vdash \beta \to (\gamma \to \beta);
\tag{D.62}
$$

$$\vdash (\beta \to (\gamma \to \delta)) \to ((\beta \to \gamma) \to (\beta \to \delta));\tag{D.63}$$

$$
\vdash \neg \beta \to (\beta \to \gamma);
\tag{D.64}
$$

$$\vdash (\beta \to \gamma) \to ((\beta \to \neg \gamma) \to \neg \beta);\tag{D.65}$$

$$
\vdash \neg \neg \beta \to \beta. \tag{D.66}
$$

We now describe the relationship between propositional logic and Boolean algebras. Define an equivalence relation ∼ on the set *B*<sub>Σ</sub> of propositions by

$$
\varphi \sim \psi \text{ iff } \varphi \vdash \psi \text{ and } \psi \vdash \varphi, \tag{D.67}
$$

where, as in (D.53), the notation ψ ⊢ ϕ means that ϕ can be derived from ψ (which is the case iff ⊢ ψ → ϕ), so that ϕ ∼ ψ iff ⊢ ψ ↔ ϕ. The ensuing set of equivalence classes

$$L\_{\Sigma} = B\_{\Sigma} / \sim \tag{D.68}$$

is called the (classical) *Lindenbaum (–Tarski) algebra* for the given signature Σ.

Theorem D.11. *The set L*<sub>Σ</sub> *defined by* (D.68) *is partially ordered by*

$$[\psi] \le [\varphi] \text{ iff } \psi \vdash \varphi. \tag{D.69}$$

*In this ordering, the ensuing poset is a Boolean algebra, with operations*

$$[\psi] \vee [\varphi] = [\psi \vee \varphi];\tag{D.70}$$

$$[\psi] \wedge [\varphi] = [\psi \wedge \varphi];\tag{D.71}$$

$$[\psi]^\perp = [\neg \psi]. \tag{D.72}$$

*Furthermore, the bottom and top elements of L*<sub>Σ</sub> *are the equivalence classes of any contradiction and any tautology, respectively. The Boolean algebra L*<sub>Σ</sub> *thus obtained is the free Boolean algebra* ℬ<sub>Σ</sub> *on the set* Σ*, and hence any valuation* (D.43) *-* (D.44) *induces a homomorphism of Boolean algebras*

$$V: \mathcal{B}\_{\Sigma} \to \{0, 1\}. \tag{D.73}$$

Here the *free Boolean algebra* ℬ<sub>Σ</sub> on a set Σ is defined as usual, namely as "the" Boolean algebra (unique up to isomorphism), along with an injection ι : Σ → ℬ<sub>Σ</sub>, such that any map *g* : Σ → *A*, where *A* is some Boolean algebra, factors through ι (i.e., there is a unique homomorphism *f* : ℬ<sub>Σ</sub> → *A* such that *g* = *f* ◦ ι).
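For a finite signature the Lindenbaum algebra can be made concrete. The sketch below relies on the identification of theorems with tautologies (Theorem D.10), so that the class of a proposition is determined by its truth table; for two atoms this gives 2^(2^2) = 16 classes. The representation by tuples of 0s and 1s and all function names are assumptions of this illustration.

```python
from itertools import product

N_ATOMS = 2
ROWS = list(product((0, 1), repeat=N_ATOMS))        # all valuations of (p1, p2)
CLASSES = list(product((0, 1), repeat=len(ROWS)))   # one truth table per class

def leq(s, t):
    """[psi] <= [phi]: phi is true in every row in which psi is true."""
    return all(a <= b for a, b in zip(s, t))

def join(s, t):
    return tuple(max(a, b) for a, b in zip(s, t))

def meet(s, t):
    return tuple(min(a, b) for a, b in zip(s, t))

def complement(s):
    return tuple(1 - a for a in s)

BOTTOM = tuple(0 for _ in ROWS)   # class of any contradiction
TOP = tuple(1 for _ in ROWS)      # class of any tautology

def atom(i):
    """Class of the i-th atomic proposition."""
    return tuple(row[i] for row in ROWS)

def induced_homomorphism(row):
    """The map [phi] -> V(phi) for the valuation V given by `row`."""
    index = ROWS.index(row)
    return lambda table: table[index]

p1, p2 = atom(0), atom(1)
print(len(CLASSES))                       # 16
print(leq(meet(p1, p2), p1))              # True
print(join(p1, complement(p1)) == TOP)    # True
```

Each valuation picks out one row, and evaluation at that row preserves joins, meets, and complements, which is the homomorphism property claimed for (D.73).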

Constructions like this become more interesting for propositional *theories*, in which (beyond specifying the signature Σ) further axioms are added to whatever system for which Theorem D.10 holds. Let us call the list of such axioms 𝒯, where we assume that the theory is *consistent*, in that no contradiction can be derived from 𝒯 (in propositional logic, as opposed to predicate logic, this question is decidable). We also assume that 𝒯 contains no tautologies (which would add no new theorems). We now write 𝒯 ⊢ ϕ if ϕ can be derived (in a finite number of steps) from 𝒯 and the basic axioms and deduction rule(s). Unless 𝒯 is empty, the set of theorems will be larger now (e.g., any member of 𝒯 itself, say ⊢ *p*<sub>1</sub>, is trivially a theorem of 𝒯). In order to preserve Theorem D.10, now in the form

$$
\mathcal{T} \vdash \varphi \text{ iff } \mathcal{T} \models \varphi, \tag{D.74}
$$

we should define the right-hand side appropriately. Call a valuation (D.43), or, equivalently, the corresponding homomorphism (D.73), a *binary model* of 𝒯 if

$$V(\alpha) = 1,\tag{D.75}$$

for each α ∈ 𝒯 ⊂ *B*<sub>Σ</sub> (by soundness this is already the case for the axioms of propositional logic *per se*). We then say that 𝒯 ⊨ ϕ iff *V*(ϕ) = 1 (i.e., ϕ is true) in any binary model of 𝒯. On this definition of 𝒯 ⊨ ϕ, eq. (D.74), and hence Theorem D.10 (with 𝒯 added to the axioms), holds. Moreover, for α,β ∈ *B*<sub>Σ</sub>, define

$$
\alpha \sim\_{\mathcal{T}} \beta \text{ iff } \mathcal{T} \vdash (\alpha \leftrightarrow \beta), \tag{D.76}
$$

where the right-hand side stands for (𝒯, α) ⊢ β and (𝒯, β) ⊢ α. Then define

$$L\_{(\Sigma,\mathcal{T})} = B\_{\Sigma} / \sim\_{\mathcal{T}},\tag{D.77}$$

and (partially) order *L*<sub>(Σ,𝒯)</sub> by [ψ] ≤ [ϕ] iff (𝒯, ψ) ⊢ ϕ; as before, this is equivalent with 𝒯 ⊢ (ψ → ϕ). This construction obviously generalizes (D.67), etc. Then Theorem D.11 holds (*mutatis mutandis*) for *L*<sub>(Σ,𝒯)</sub>. In particular, *L*<sub>(Σ,𝒯)</sub> is a Boolean algebra, which can also be shown to have the following universal property.

A *model* of 𝒯 in some Boolean algebra *B* is a map *V* : Σ → *B* whose unique extension *V* : *B*<sub>Σ</sub> → *B* makes the axioms of 𝒯 true, i.e. *V*(ϕ) = ⊤ for each ϕ ∈ 𝒯 (where ⊤ is the top element of *B*). Note that α ↦ [α] is a model of 𝒯 in *L*<sub>(Σ,𝒯)</sub>.

Theorem D.12. *For each model V* : Σ → *B of* 𝒯*, there is a unique homomorphism V* : *L*<sub>(Σ,𝒯)</sub> → *B of Boolean algebras such that V*(α) = *V*([α]) *for each* α ∈ *B*<sub>Σ</sub>*.*
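Semantic entailment from a theory, as in the right-hand side of (D.74), amounts to a finite search when the signature is finite. The sketch below uses a hypothetical theory with axioms *p*<sub>1</sub> and *p*<sub>1</sub> → *p*<sub>2</sub> over three atoms; propositions are represented as Python functions of a valuation, which is a convenience of this illustration rather than anything in the text.

```python
from itertools import product

ATOMS = ("p1", "p2", "p3")

def valuations():
    """All maps from the atoms to {0, 1}."""
    for values in product((0, 1), repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))

def imp(a, b):
    return max(1 - a, b)

# Hypothetical theory: the axioms p1 and p1 -> p2.
theory = [
    lambda v: v["p1"],
    lambda v: imp(v["p1"], v["p2"]),
]

def binary_models(axioms):
    """Valuations that make every axiom of the theory true."""
    return [v for v in valuations() if all(axiom(v) == 1 for axiom in axioms)]

def entails(axioms, phi):
    """phi is true in every binary model of the theory."""
    return all(phi(v) == 1 for v in binary_models(axioms))

def equivalent(axioms, alpha, beta):
    """alpha and beta agree in every binary model (same class modulo the theory)."""
    return all(alpha(v) == beta(v) for v in binary_models(axioms))

print(len(binary_models(theory)))                            # 2
print(entails(theory, lambda v: v["p2"]))                    # True
print(entails(theory, lambda v: v["p3"]))                    # False
print(equivalent(theory, lambda v: v["p1"], lambda v: v["p2"]))  # True
```

With two binary models left, propositions are distinguished modulo this theory only by their values on those two valuations, so the quotient has 2^2 = 4 classes.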

#### D.3 Intuitionistic propositional logic

In view of its importance for quantum mechanics and topos theory, we now briefly discuss the intuitionistic version of the preceding material on (classical) propositional logic. Intuitionism in mathematics originated with the Dutch mathematician L.E.J. Brouwer (1881–1966), who was also one of the most important early contributors to the field of (algebraic) topology. Brouwer held a rather subjective view of mathematics (sometimes even tending towards solipsism), in which mathematics primarily resided within the mind of the "creative subject" (perhaps the right translation of Brouwer's "scheppend" is: "creating" rather than "creative"). Any means of communication supposedly weakened this effort, so that Brouwer saw the formalization of mathematics (including logic) as secondary and even potentially dangerous; he openly (and polemically) opposed his views to the "formalism" he attributed to Hilbert, with whom he also fell out personally. A more technical consequence of Brouwer's intuitionism was an emphasis on explicit *constructions*, rejecting not only proofs by contradiction, but even the abstract existence of mathematical objects in general (as claimed by the so-called *Platonic* philosophy of mathematics).

Brouwer's lasting influence on logic is partly due to his student Arend Heyting (1898–1980), who was less radical than his teacher and formalized (!) intuitionistic logic analogously to its classical counterpart. In fact, the system (D.56) - (D.65), with *modus ponens*, gives axioms for *intuitionistic propositional logic*, which therefore differs from classical propositional logic exclusively by the absence of the law of the excluded middle (D.66). It is customary in intuitionistic logic to use the purely logical symbols ∧,∨,→ and ⊥, in terms of which negation is defined by

$$
\neg \alpha \equiv \alpha \to \bot \tag{D.78}
$$

In that case, axiom (D.65) is simply replaced by

$$
\vdash \bot \to \alpha,
\tag{D.79}
$$

and in the presence of (D.56) - (D.64) with (D.79), the axiom that makes the system classical may now be formulated as the validity of *reductio ad absurdum*, i.e.,

$$\vdash ((\alpha \to \bot) \to \bot) \to \alpha, \tag{D.80}$$

which is therefore denied in intuitionistic logic. Similarly, classical rules like:

$$
\alpha \lor \neg \alpha;\tag{D.81}
$$

$$
\neg \neg \alpha \lor \neg \alpha;\tag{D.82}
$$

$$(\neg \alpha \to \neg \beta) \to (\beta \to \alpha);\tag{D.83}$$

$$(\alpha \to \beta) \lor (\beta \to \alpha);\tag{D.84}$$

$$\neg(\neg \alpha \land \neg \beta) \to (\alpha \lor \beta);\tag{D.85}$$

$$\neg(\neg \alpha \lor \neg \beta) \to (\alpha \land \beta),\tag{D.86}$$

are invalid in intuitionistic logic, as is, of course, (D.66). Fortunately, as theorems of intuitionistic propositional logic one does have:

$$
\vdash \alpha \rightarrow \neg \neg \alpha;\tag{D.87}
$$

$$\vdash \neg \neg \neg \alpha \leftrightarrow \neg \alpha;\tag{D.88}$$

$$\vdash (\alpha \rightarrow \beta) \rightarrow (\neg \beta \rightarrow \neg \alpha); \tag{D.89}$$

$$
\vdash \neg \alpha \lor \neg \beta \to \neg (\alpha \land \beta);
\tag{D.90}
$$

$$\vdash \neg(\alpha \lor \beta) \to (\neg \alpha \land \neg \beta); \tag{D.91}$$

$$\vdash (\alpha \rightarrow \beta) \rightarrow (\neg \beta \rightarrow \neg \alpha). \tag{D.92}$$

More generally, Gödel's *negative translation* of classical (propositional) logic into intuitionistic (propositional) logic establishes the fact that if one puts ¬¬ in front of atomic propositions and recursively replaces α ∨ β by ¬(¬α ∧ ¬β), which changes nothing classically, the ensuing proposition is intuitionistically valid. In this sense, intuitionistic logic is *stronger* than classical logic, although at first sight it looks *weaker* (as it has fewer axioms). Also more generally, one often sees that classical results whose proofs apparently rely on intuitionistically invalid reasoning are classically equivalent to intuitionistically valid results (e.g. Gelfand duality).
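The translation just described is a simple recursion on the structure of a proposition. The sketch below implements it for the propositional connectives and confirms, by truth tables, that it changes nothing classically; it does not attempt to verify intuitionistic validity. The tuple encoding and the names `negative_translation`, `evaluate`, and `show` are assumptions of this illustration.

```python
from itertools import product

# Propositions: ("var", name), ("not", a), ("and", a, b), ("or", a, b), ("imp", a, b).

def negative_translation(f):
    """Put a double negation on atoms and replace a or b by not(not a and not b)."""
    tag = f[0]
    if tag == "var":
        return ("not", ("not", f))
    if tag == "not":
        return ("not", negative_translation(f[1]))
    left, right = negative_translation(f[1]), negative_translation(f[2])
    if tag == "or":
        return ("not", ("and", ("not", left), ("not", right)))
    return (tag, left, right)   # "and" and "imp" are translated componentwise

def evaluate(f, v):
    """Classical truth value of f under the valuation v."""
    tag = f[0]
    if tag == "var":
        return v[f[1]]
    if tag == "not":
        return 1 - evaluate(f[1], v)
    a, b = evaluate(f[1], v), evaluate(f[2], v)
    return {"and": min(a, b), "or": max(a, b), "imp": max(1 - a, b)}[tag]

SYMBOL = {"and": "∧", "or": "∨", "imp": "→"}

def show(f):
    tag = f[0]
    if tag == "var":
        return f[1]
    if tag == "not":
        return "¬" + show(f[1])
    return "(" + show(f[1]) + " " + SYMBOL[tag] + " " + show(f[2]) + ")"

p = ("var", "p")
excluded_middle = ("or", p, ("not", p))
translated = negative_translation(excluded_middle)
print(show(translated))   # ¬(¬¬¬p ∧ ¬¬¬¬p)

same_classically = all(
    evaluate(excluded_middle, {"p": value}) == evaluate(translated, {"p": value})
    for value in (0, 1)
)
print(same_classically)   # True
```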

A natural (and complete) semantics for intuitionistic propositional logic is given by Heyting algebras (replacing the Boolean algebras of the classical case). Let *I*<sub>Σ</sub> denote the set of all propositions (i.e., well-formed formulae) on some signature Σ built from the letters *p* ∈ Σ and the symbols ∧,∨,→ and ⊥, where in formation rule i) preceding (D.42) we also declare ⊥ to be a proposition, and we omit ¬α at the end of rule ii), as it is a special case of the preceding part with (D.78). If *H* is a Heyting algebra, we may then extend any function *V* : Σ → *H* to a function

$$V: I\_{\Sigma} \to H \tag{D.93}$$

by recursively using the following rules, where • is ∧, ∨, or → in *I*<sub>Σ</sub> and the corresponding operation in *H*:

$$V(\bot) = \bot;\tag{D.94}$$

$$V(\alpha \bullet \beta) = V(\alpha) \bullet V(\beta).\tag{D.95}$$

Then each axiom ϕ of intuitionistic propositional logic is valid, in that

$$V(\varphi) = \top. \tag{D.96}$$

Moreover, if Γ is some finite set of propositions, then

$$
\Gamma \vdash \varphi \text{ implies } V\left(\bigwedge \Gamma\right) \leq V(\varphi).\tag{D.97}
$$

In particular, suppose we have a theory 𝒯. As in the classical case, we call a valuation (D.93) a *model* of 𝒯 if (D.96) holds for each ϕ ∈ 𝒯. It then follows from (D.97) that each model *V* of 𝒯 is *sound* in that for all propositions ϕ one has the rule:


$$\mathcal{T} \vdash \varphi \text{ implies } V(\varphi) = \top. \tag{D.98}$$

That is, ϕ is *true* in the given model. As in Theorem D.10, soundness and completeness of Heyting algebra semantics of intuitionistic propositional logic are then jointly expressed by the following result (where ⊢ denotes derivability using only the intuitionistically valid axioms (D.56) - (D.64) with (D.79), and *modus ponens*):

Theorem D.13. *For any theory* 𝒯 *in intuitionistic propositional logic,* 𝒯 ⊢ ϕ *holds iff* 𝒯 ⊨ ϕ*, i.e., V*(ϕ) = ⊤ *for all Heyting algebra models V* : *I*<sub>Σ</sub> → *H.*

The classical construction of the Lindenbaum algebra may also be copied by defining *L*<sub>Σ</sub> and *L*<sub>(Σ,𝒯)</sub> as in (D.67) - (D.68), where this time the symbol ⊢ defining ∼ through (D.67) or (D.76) is the one using the intuitionistically valid axioms only. It follows that any Heyting algebra model *V* : *I*<sub>Σ</sub> → *H* factors through a homomorphism *L*<sub>Σ</sub> → *H* of Heyting algebras, just as in the classical case (cf. Theorem D.11).

*Kripke models* are special Heyting algebra models, which already form a complete semantics for intuitionistic propositional logic. For any poset *X*, the set

$$\text{Upper}(X) = \mathcal{O}(X) \tag{D.99}$$

of all upper subsets *U* of *X* (i.e. *y* ≥ *x* ∈ *U* implies *y* ∈ *U*), which by definition coincides with the set O(*X*) of open sets in the Alexandrov topology on *X*, is a Heyting algebra in the partial order defined by inclusion, with ∨ = ∪, ∧ = ∩, and

$$U \dashrightarrow V = \{ x \in X \mid (\uparrow x) \cap U \subseteq V \}. \tag{D.100}$$

Given a valuation *V* : Σ → Upper(*X*) with associated Heyting algebra homomorphism *V* : *I*<sub>Σ</sub> → Upper(*X*), for any *x* ∈ *X* and ϕ ∈ *I*<sub>Σ</sub> we write *x* ⊩ ϕ iff *x* ∈ *V*(ϕ), and say that *x forces* ϕ. Then *V*(ϕ) = ⊤ iff *x* ⊩ ϕ for all *x* ∈ *X*, and we have:

$$x \Vdash \varphi \text{ and } y \ge x \text{ imply } y \Vdash \varphi; \tag{D.101}$$

$$x \Vdash \bot \text{ for no } x \in X; \tag{D.102}$$

$$x \Vdash \varphi \land \psi \text{ iff } x \Vdash \varphi \text{ and } x \Vdash \psi; \tag{D.103}$$

$$x \Vdash \varphi \lor \psi \text{ iff } x \Vdash \varphi \text{ or } x \Vdash \psi; \tag{D.104}$$

$$x \Vdash \varphi \to \psi \text{ iff for all } y \ge x: y \Vdash \varphi \text{ implies } y \Vdash \psi; \tag{D.105}$$

$$x \Vdash \neg \varphi \text{ iff for all } y \ge x,\ y \Vdash \varphi \text{ is false.} \tag{D.106}$$

Hence these are *properties* of any homomorphism *V* : *I*<sub>Σ</sub> → Upper(*X*); originally, (D.101) - (D.105), which imply (D.106), were taken to be *axioms* extending a binary "forcing" relation *x* ⊩ *p* on *X* × Σ to *X* × *I*<sub>Σ</sub>. In topos theory, generalizations of the rules (D.101) - (D.106), once again theorems rather than axioms, will provide the Kripke–Joyal semantics of the (intuitionistic) internal logic of toposes (cf. §E.5).
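A very small Kripke model already separates intuitionistic from classical validity. The sketch below takes the two-point chain 0 ≤ 1 as the poset (an assumed example), builds its upper sets, implements the implication (D.100), and evaluates propositions with negation defined through ⊥ as in (D.78). The names `up`, `heyting_imp`, `evaluate`, and `valid_on_poset` are hypothetical.

```python
from itertools import combinations, product

X = (0, 1)   # two-point chain 0 <= 1 (an assumed example poset)

def up(x):
    """The upper set generated by x."""
    return frozenset(y for y in X if x <= y)

def is_upper(U):
    return all(up(x) <= U for x in U)

UPPER = [
    frozenset(c)
    for r in range(len(X) + 1)
    for c in combinations(X, r)
    if is_upper(frozenset(c))
]
TOP, BOTTOM = frozenset(X), frozenset()

def heyting_imp(U, V):
    """Implication in Upper(X), following (D.100)."""
    return frozenset(x for x in X if (up(x) & U) <= V)

def evaluate(formula, valuation):
    """Formulas: ("var", name), ("bot",), ("and", a, b), ("or", a, b), ("imp", a, b)."""
    tag = formula[0]
    if tag == "var":
        return valuation[formula[1]]
    if tag == "bot":
        return BOTTOM
    a = evaluate(formula[1], valuation)
    b = evaluate(formula[2], valuation)
    if tag == "and":
        return a & b
    if tag == "or":
        return a | b
    if tag == "imp":
        return heyting_imp(a, b)
    raise ValueError(f"unknown connective: {tag}")

def neg(a):
    return ("imp", a, ("bot",))

def valid_on_poset(formula, names):
    """True iff the formula evaluates to the top element for every valuation into Upper(X)."""
    return all(
        evaluate(formula, dict(zip(names, sets))) == TOP
        for sets in product(UPPER, repeat=len(names))
    )

p = ("var", "p")
print(len(UPPER))                                              # 3
print(valid_on_poset(("or", p, neg(p)), ["p"]))                # False
print(valid_on_poset(("imp", neg(neg(p)), p), ["p"]))          # False
print(valid_on_poset(("imp", p, neg(neg(p))), ["p"]))          # True
```

The failing valuation sends *p* to {1}: the point 0 forces neither *p* nor ¬*p*, because *p* becomes true at the later point 1.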

#### D.4 First-order (predicate) logic

Propositional logic lacks the structure to describe arithmetic (not to speak of set theory), because it has neither variables (as we shall see, the symbols *p*<sub>*i*</sub> are not variables but *predicate symbols*) nor quantification symbols like 'there exists' (∃) and 'for all' (∀). This defect is remedied by the formalism of *predicate logic*, also called *first-order logic*, which was essentially introduced by Frege and was adopted by Hilbert's school as a universal language for mathematics (as they knew it), in which for example the *Zermelo–Fraenkel* (ZF) axioms for set theory may be formulated as a foundation of mathematics (against competitors like the *Principia Mathematica* system of Russell and Whitehead, and others). A simple mathematical theory that can be formalized using classical first-order logic is *Peano Arithmetic* (PA).

	- 1. The *purely logical symbols* are the familiar symbols ¬,∧,∨,→ from propositional logic (or some logically independent subset thereof, such as ¬ and →), supplemented by the equality sign = and the quantification symbols ∀ and ∃ (the latter is in fact superfluous in the classical system discussed here, since the combination ∃<sub>*x*</sub> defined below is the same as ¬∀<sub>*x*</sub>¬).
	- 2. Unlike the ones above, the *non-logical symbols* (comprising the *signature* of the theory) depend on the field of mathematics to be formalized (such as set theory or arithmetic), but the general format is as follows. One has:
		- a. *Variables* *a*, *b*, *c*,..., *x*, *y*, *z*, *x*<sub>1</sub>, *x*<sub>2</sub>,..., assumed countably many at most. For example, in PA these variables may be thought of as denoting natural numbers, whereas in ZF they will be sets, but of course such *interpretations* do not form part of the syntax! This warning also applies to the next items. In *many-sorted theories* the variables are *sorted*, in that there is a set {*A*,*B*,...} of *sorts*, and each variable *x* ≡ *x*<sub>*A*</sub> belongs to one of these sorts.
		- b. *Constants*, arbitrarily formatted. For example, PA has just one constant, called 0, to be interpreted as the number zero. Also ZF has just a single (even superfluous) constant ∅, to be interpreted as the empty set.
		- c. *Function symbols f*,*g*,.... Each such symbol has an *arity a*(*f*), which is a natural number indicating the number of variables it has (as formalized below). Formally, one allows *a*(*f*) = 0, in which case *f* is also a constant. PA has three function symbols, viz. *S*, +, and ×, with arities *a*(*S*) = 1, *a*(+) = 2, and *a*(×) = 2 (these will be interpreted as the successor function *n* ↦ *n*+1, addition, and multiplication, respectively). Perhaps surprisingly (especially in the light of category theory), ZF *has no function symbols*: in set theory, functions *f* : *X* → *Y* are defined as special subsets of *X* × *Y*.
		- d. *Predicate symbols P*,..., coming with an arity *a*(*P*) ∈ ℕ, too. These will play a role in the construction of formulae, see below (some authors count = as a predicate symbol with arity 2, instead of as a purely logical symbol). PA has no predicate symbols. ZF has one predicate symbol ∈, with arity 2.
	- 1. *Term formation* is done by iterating the steps:
		- a. Any variable *x*<sub>*i*</sub> is a term.
		- b. Any constant is a term.
		- c. Any function symbol *f* and any set of *k* = *a*(*f*) terms (*t*<sub>1</sub>,...,*t*<sub>*k*</sub>) jointly yield a term *f*(*t*<sub>1</sub>,...,*t*<sub>*k*</sub>); if *a*(*f*) = 0 this reduces to the previous case.

In PA, this means that *S*(*t*) is a term, and that *t*<sub>1</sub> + *t*<sub>2</sub> ≡ +(*t*<sub>1</sub>,*t*<sub>2</sub>) and *t*<sub>1</sub> × *t*<sub>2</sub> ≡ ×(*t*<sub>1</sub>,*t*<sub>2</sub>) are terms (provided *t*, *t*<sub>1</sub>, and *t*<sub>2</sub> are terms). For example, the constant 0 is a term, and hence *S*(0) is a term, which one calls 1. Similarly, *S*<sup>*n*</sup>(0) is a term called n (where e.g. *S*<sup>2</sup>(0) ≡ *S*(*S*(0)), etc.). From these, we can make terms n + m, or n × *x*<sub>*i*</sub>, and subsequently (n + m) × (n × *x*<sub>*i*</sub>), etc.

In ZF, the only terms are ∅ and the variables (as ZF lacks function symbols).

	- 2. *Formula formation* is done by iterating the steps:
		- a. If *t*<sub>1</sub> and *t*<sub>2</sub> are terms, then *t*<sub>1</sub> = *t*<sub>2</sub> is a formula.
		- b. Any predicate symbol *P* and any set of *k* = *a*(*P*) terms (*t*<sub>1</sub>,...,*t*<sub>*k*</sub>) jointly yield a formula *P*(*t*<sub>1</sub>,...,*t*<sub>*k*</sub>); if *a*(*P*) = 0, then *P* is a formula by itself.
		- c. As in propositional logic: if ϕ and ψ are formulae, then so are ¬ϕ, ϕ ∨ ψ, ϕ ∧ ψ, and ϕ → ψ. What is new to first-order logic is that also ∃<sub>*x*</sub>ϕ and ∀<sub>*x*</sub>ϕ are formulae, for any variable *x* (which may or may not occur in ϕ).

In PA, the expression *t*<sub>1</sub> = *t*<sub>2</sub> is a formula (provided *t*<sub>1</sub> and *t*<sub>2</sub> are terms). In ZF, the expressions *t*<sub>1</sub> ∈ *t*<sub>2</sub> and *t*<sub>1</sub> = *t*<sub>2</sub> are formulae (if *t*<sub>1</sub> and *t*<sub>2</sub> are terms).
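The term-formation rules for PA translate directly into a recursive well-formedness check. In the sketch below variables are strings and applications of function symbols are tuples; the constant 0 is treated as a function symbol of arity zero, and `*` stands for ×. This encoding and the names `is_term`, `numeral`, and `show` are illustrative assumptions.

```python
# Arities of the non-logical symbols of PA ("*" stands for the symbol ×).
ARITY = {"0": 0, "S": 1, "+": 2, "*": 2}

def is_term(t):
    """Variables (strings) are terms; f(t1,...,tk) is a term if k = a(f) and each ti is."""
    if isinstance(t, str):
        return True
    if not (isinstance(t, tuple) and t):
        return False
    symbol, *args = t
    return (
        symbol in ARITY
        and len(args) == ARITY[symbol]
        and all(is_term(arg) for arg in args)
    )

def numeral(n):
    """The term S^n(0), i.e. S applied n times to the constant 0."""
    t = ("0",)
    for _ in range(n):
        t = ("S", t)
    return t

def show(t):
    """Render a term in the usual infix notation."""
    if isinstance(t, str):
        return t
    symbol, *args = t
    if symbol == "0":
        return "0"
    if symbol == "S":
        return f"S({show(args[0])})"
    return f"({show(args[0])} {symbol} {show(args[1])})"

two, one = numeral(2), numeral(1)
example = ("*", ("+", two, one), ("*", two, "x1"))

print(show(two))               # S(S(0))
print(is_term(example))        # True
print(show(example))
print(is_term(("S", "x", "y")))  # False: S has arity 1
```

Nothing here refers to natural numbers as such: `numeral(2)` is just a string of symbols until an interpretation is supplied.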


As in propositional logic, the axioms for predicate logic come in two groups: *purely logical axioms* and *domain specific axioms*. We will state the latter for the theories PA and ZF in §D.5 below, and now discuss the former (common to both). From propositional logic, we adopt (D.48) - (D.50), where α,β, γ,δ are arbitrary formulae. These are also *Axioms 1–3* of predicate logic, to which one adds:

*Axiom 4* : ⊢ (∀<sub>*x*</sub>ϕ(*x*)) → ϕ(*t*) for any term *t* (unless *x* occurs freely in ϕ through a subformula ∀<sub>*y*</sub>ψ where *y* occurs in *t*); some authors write ϕ(*x*/*t*) for ϕ(*t*).

*Axiom 5* : ⊢ (∀<sub>*x*</sub>(ϕ → ψ)) → (∀<sub>*x*</sub>ϕ → ∀<sub>*x*</sub>ψ).

*Axiom 6* : ⊢ ∀<sub>*x*</sub>(*x* = *x*).

*Axiom 7* : ⊢ ∀<sub>*x*</sub>∀<sub>*y*</sub>((*x* = *y*) → (ϕ(*x*) → ϕ(*y*))) for each formula ϕ that contains the variable *x freely* and contains *y* either *freely* or not at all.

The deduction rules are:

	- 1. *Modus ponens*: ⊢ (ϕ → ψ) and ⊢ ϕ imply ⊢ ψ.
	- 2. *Universal generalization*: ⊢ ϕ(*x*) implies ⊢ ∀<sub>*x*</sub>ϕ(*x*).

These rules also apply to theories, provided that in the second, 𝒯 ⊢ ϕ(*x*) implies 𝒯 ⊢ ∀<sub>*x*</sub>ϕ(*x*) *provided no formula in* 𝒯 *used in the proof of* ϕ *freely contains x*.


*Gödel's Completeness Theorem* (to be contrasted with his *in*completeness theorem, which roughly states that any first-order theory that incorporates PA contains undecidable sentences) generalizes Theorem D.10 and eq. (D.74) to first-order logic:

Theorem D.14. *A first-order theory* 𝒯 *is consistent iff it has a model. In that case, a sentence* ϕ *of* 𝒯 *is a theorem iff it is true in all models of* 𝒯*.*

Propositional logic is a special case of predicate logic, namely by assuming no variables, constants, and function symbols, and taking the atomic propositions (*p*<sub>1</sub>,...) to be predicate symbols with arity zero (or else {0,1}-valued variables). The rules of term formation in predicate logic then show that propositional logic has no terms, so that step 2.a above is empty, and step 2.b only yields the *p*<sub>*i*</sub>. These may be turned into compound expressions by the original rules of propositional logic, which in this case coincide with the rules of predicate logic (since there are no variables, ∃<sub>*x*</sub>ϕ and ∀<sub>*x*</sub>ϕ are both equivalent to ϕ). Finally, formulae coincide with sentences, since in the absence of variables, all formulae are closed.

As a transition to the next appendix, we continue our discussion on intuitionistic logic started in §D.3. The propositional fragment of first-order intuitionistic logic is still given by (D.56) - (D.65), in which the connectives ∧,∨,→, and ¬ (or ⊥) are independent. The equality sign = is treated with suspicion in intuitionism, and hence is omitted, whilst ∃ can no longer be defined in terms of ∀ through the classical identification of ∃<sub>*x*</sub> with ¬∀<sub>*x*</sub>¬. Instead, it is regulated by the two axioms

$$\vdash (\forall\_x (\varphi \to \psi)) \rightarrow (\exists\_x \varphi \to \exists\_x \psi);\tag{D.107}$$

$$\vdash \varphi(t) \rightarrow \exists\_x \varphi(x),\tag{D.108}$$

subject to the same proviso as Axiom 4 of the classical case, plus a deduction rule:

• ∃*-elimination*: ⊢ ∃<sub>*x*</sub>ϕ implies ⊢ ϕ (provided *x* is not free in ϕ).

This will be the logic on which the topos theory of the next chapter is based. Scary examples of intuitionistically *invalid* rules involving ∀ and ∃ include:

$$
\neg \forall\_x \neg \varphi(x) \leftrightarrow \exists\_x \varphi(x);\tag{D.109}
$$

$$
\forall\_x \neg \neg \varphi(x) \leftrightarrow \forall\_x \varphi(x);\tag{D.110}
$$

$$
\neg \neg \exists\_x \varphi(x) \leftrightarrow \exists\_x \neg \neg \varphi(x);\tag{D.111}
$$

$$(\varphi \to \exists\_x \psi(x)) \to \exists\_x (\varphi \to \psi(x)),\tag{D.112}$$

whereas useful intuitionistically *valid* theorems containing ∀ and ∃ are, e.g.,

$$
\neg \exists\_x \varphi(x) \leftrightarrow \forall\_x \neg \varphi(x);\tag{D.113}
$$

$$
\neg \neg \forall\_x \varphi(x) \leftrightarrow \forall\_x \neg \neg \varphi(x);\tag{D.114}
$$

$$
\neg \neg \exists\_x \varphi(x) \leftrightarrow \neg \forall\_x \neg \varphi(x).\tag{D.115}
$$

Godel's ¨ *negative translation* of classical logic to intuitionistic logic extends to firstorder logic: if, further to the manipulations mentioned after (D.92), one also replaces ∃*x*ϕ(*x*) by ¬∀*x*¬ϕ(*x*), then theorems ϕ of classical first-order logic are turned into theorem of intuitionstic first-order logic. Although we will not use it, we mention that the notion of a *Kripke model* also extends from propositional to predicate intuitionistic logic: compared to a classical model carried by a set *M*, as described above, we now have a *family* of (classical!) sets (*Mp*) indexed by some poset *P*, in which constants, functions, and predicate symbols are similarly interpreted as families [[*c*]]*Mx* <sup>∈</sup> *Mx*, ([[ *<sup>f</sup>* ]]*Mx* : *<sup>M</sup>*(*a*(*f*) *<sup>x</sup>* <sup>→</sup> *Mx*), and ([[*P*]]*Mx* <sup>⊂</sup> *<sup>M</sup>*(*a*(*P*) *<sup>x</sup>* ), such that if *x* ≤ *y*, then *Mx* ⊆ *My*, [[*c*]]*Mx* = [[*c*]]*My* , *G*([[ *f* ]]*Mx* ) ⊆ *G*([[ *f* ]]*My* ) (where *G*(*f*) is the graph of *f*), and [[*P*]]*Mx* ⊂ [[*P*]]*My* . Further to the forcing rules (D.101) - (D.106) for intuitionistic propositional logic, there are additional ones for ∃ and ∀, viz.

$$x \Vdash \exists\_x \varphi(x) \text{ if there exists } m \in M\_x \text{ such that } x \Vdash [[\varphi]]\_{M\_x}(m); \tag{D.116}$$

$$x \Vdash \forall\_x \varphi(x) \text{ if for all } y \ge x \text{ and all } m \in M\_y \text{ one has } y \Vdash [[\varphi]]\_{M\_y}(m). \tag{D.117}$$

We will revisit these rules in topos theory, see §E.5; indeed, Kripke models for intuitionistic predicate logic emerge much more naturally in categorical language.
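To see the forcing rules for ∃ and ∀ at work, the following Python sketch evaluates them in a two-node Kripke model with growing domains. Everything in it is an illustrative assumption rather than part of the text: the model (nodes 0 ≤ 1, domains {a} ⊆ {a, b}, one unary predicate *P*), the encoding of formulas as nested tuples, and the standard propositional forcing clauses for ∧, →, and ¬ that we take (D.101) - (D.106) to express. In this model the equivalence (D.111) is not forced at the root, whereas (D.113) and (D.115) are.

```python
# Two-node Kripke model 0 <= 1 with growing domains (all names are illustrative).
NODES = [0, 1]
DOMAIN = {0: {"a"}, 1: {"a", "b"}}   # M_0 is a subset of M_1
P = {0: set(), 1: {"b"}}             # [[P]] at node 0 is a subset of [[P]] at node 1

def above(x):
    """All nodes y with x <= y."""
    return [y for y in NODES if x <= y]

def forces(x, phi, env=None):
    """Decide whether node x forces phi (formulas are nested tuples)."""
    env = env or {}
    op = phi[0]
    if op == "P":        # atomic formula P(var)
        return env[phi[1]] in P[x]
    if op == "and":
        return forces(x, phi[1], env) and forces(x, phi[2], env)
    if op == "imp":      # assumed propositional rule: checked at every later node
        return all(not forces(y, phi[1], env) or forces(y, phi[2], env)
                   for y in above(x))
    if op == "not":      # assumed propositional rule: no later node forces phi
        return all(not forces(y, phi[1], env) for y in above(x))
    if op == "exists":   # witness must live in the domain at x itself
        return any(forces(x, phi[2], {**env, phi[1]: m}) for m in DOMAIN[x])
    if op == "forall":   # all later nodes y and all m in the domain at y
        return all(forces(y, phi[2], {**env, phi[1]: m})
                   for y in above(x) for m in DOMAIN[y])
    raise ValueError(f"unknown connective: {op}")

def iff(a, b):
    return ("and", ("imp", a, b), ("imp", b, a))

Px = ("P", "x")
nn_exists = ("not", ("not", ("exists", "x", Px)))
exists_nn = ("exists", "x", ("not", ("not", Px)))
d111 = iff(nn_exists, exists_nn)
d113 = iff(("not", ("exists", "x", Px)), ("forall", "x", ("not", Px)))
d115 = iff(nn_exists, ("not", ("forall", "x", ("not", Px))))

print(forces(0, d111), forces(0, d113), forces(0, d115))
```

The failure of (D.111) at the root comes from the fact that the only witness for *P* appears at the later node, so that no element of the root domain can serve in the existential clause.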

#### D.5 Arithmetic and set theory

Completing our running examples (for classical first-order logic), we now give the theories PA and ZF, starting with the axioms of *Peano Arithmetic*:

PA1 ∀<sub>*x*</sub>(¬(*S*(*x*) = 0));

PA2 ∀<sub>*x*</sub>∀<sub>*y*</sub>(*S*(*x*) = *S*(*y*) → *x* = *y*);

PA3 ∀<sub>*x*</sub>(*x* + 0 = *x*);

PA4 ∀<sub>*x*</sub>∀<sub>*y*</sub>(*x* + *S*(*y*) = *S*(*x* + *y*));

PA5 ∀<sub>*x*</sub>(*x* × 0 = 0);

PA6 ∀<sub>*x*</sub>∀<sub>*y*</sub>(*x* × *S*(*y*) = (*x* × *y*) + *x*);

PA7 (ϕ(0) ∧ (∀<sub>*x*</sub>(ϕ(*x*) → ϕ(*S*(*x*))))) → ∀<sub>*x*</sub>ϕ(*x*), for any formula ϕ(*x*).

Thinking of the variables in question as natural numbers (which is what Peano himself still did), these axioms obviously capture their properties pretty well and may require no further explanation (except perhaps the last one, which enables the proof technique of induction). The point, however, is that the axioms only form a *syntax*; the natural numbers N (as a set) themselves form a *model* of PA in the general sense discussed in the previous section (though by no means the only possible model, and hence N is called the *standard model* of PA). In particular, this means that the set N is assumed to be known (e.g. via ZF, see below), upon which the interpretation [[ϕ]]<sup>N</sup> of some formula ϕ in PA is determined by the rules given earlier. In particular:


According to the general definition, a sentence ϕ of PA is then called *true* in the given model (i.e., in the natural numbers) if [[ϕ]]<sup>N</sup> is true, in which case we write N ⊨ ϕ. For example, [[∀<sub>*x*</sub>∀<sub>*y*</sub>(*x*+*y* = *y*+*x*)]]<sup>N</sup> means that for all natural numbers *x*, *y* ∈ N, one has *x*+*y* = *y*+*x* (which is true, isn't it). Another example is 1+1 = 2, which abbreviates *S*(0) + *S*(0) = *S*(*S*(0)). The interpretation [[1+1 = 2]]<sup>N</sup> is given by 1+1 = 2 (which once again is true!). In particular, the above axioms of PA are true in this interpretation. The key conceptual point here is that (following Hilbert) one interprets a theory in a domain that is supposed to be known and consistent, so that it has its own methods of proof (for otherwise the semantic entailment symbol ⊨ would be undefined). In this particular case, the domain is ZF set theory (or at least its lower echelons); see the comments on axiom ZF7 below.
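As a small illustration of how the recursive axioms PA3 - PA6 determine addition and multiplication in the standard model, here is a minimal Python sketch. It assumes that natural numbers are represented by Python integers and that the successor is `n + 1`; the function names `S`, `add`, and `mul` are ours, and the use of `y - 1` to peel off one successor is merely an implementation convenience.

```python
def S(n):
    """Interpretation of the successor symbol in N (Python ints as naturals)."""
    return n + 1

def add(x, y):
    # PA3: x + 0 = x;  PA4: x + S(y) = S(x + y)
    return x if y == 0 else S(add(x, y - 1))

def mul(x, y):
    # PA5: x * 0 = 0;  PA6: x * S(y) = (x * y) + x
    return 0 if y == 0 else add(mul(x, y - 1), x)

one = S(0)
two = S(S(0))

# The sentence 1 + 1 = 2, i.e. S(0) + S(0) = S(S(0)), evaluated in N:
print(add(one, one) == two)

# Commutativity of addition, checked on a finite sample only:
print(all(add(x, y) == add(y, x) for x in range(6) for y in range(6)))
```

The second check is of course no proof of ∀<sub>*x*</sub>∀<sub>*y*</sub>(*x*+*y* = *y*+*x*); establishing that sentence in PA requires the induction axiom PA7.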

It is quite instructive to see the crucial role of the seemingly technical axiom PA7. Suppose we try to define a model of PA in the set Q<sup>+</sup> of nonnegative rational numbers, so that ∀<sub>*x*</sub> means "for all *x* ∈ Q<sup>+</sup>", and ∃<sub>*x*</sub> stands for "there exists *x* ∈ Q<sup>+</sup>"; the number zero (as the interpretation of the constant 0) and the functions *S*, +, and × have their usual meaning, however. Then all of PA1–PA6 hold, but PA7 fails (take, for example, ϕ(*x*) to be *x* = 0 ∨ ∃<sub>*y*</sub>(*x* = *S*(*y*)), which holds at 0 and is preserved by *S*, but fails at *x* = 1/2), and hence the given interpretation of PA in Q<sup>+</sup> is not a model of PA.
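The failure of PA7 in Q<sup>+</sup> can be made concrete with a short Python sketch using exact rational arithmetic. The formula ϕ(*x*) chosen here, namely "*x* = 0 or *x* is the successor of some element of Q<sup>+</sup>", is our own illustrative instance of the induction scheme, and the finite list `SAMPLES` is an assumption made for testing purposes; checking PA1 - PA6 on samples is a sanity check, not a proof.

```python
from fractions import Fraction

def S(x):
    """Successor in Q+ with its usual meaning."""
    return x + 1

# A finite sample of nonnegative rationals (illustrative choice).
SAMPLES = [Fraction(p, q) for p in range(7) for q in range(1, 4)]

def holds_on_samples():
    pairs = [(x, y) for x in SAMPLES for y in SAMPLES]
    return {
        "PA1": all(S(x) != 0 for x in SAMPLES),
        "PA2": all(S(x) != S(y) or x == y for x, y in pairs),
        "PA3": all(x + 0 == x for x in SAMPLES),
        "PA4": all(x + S(y) == S(x + y) for x, y in pairs),
        "PA5": all(x * 0 == 0 for x in SAMPLES),
        "PA6": all(x * S(y) == x * y + x for x, y in pairs),
    }

def phi(x):
    """x = 0, or x = S(y) for some y in Q+ (equivalently x - 1 >= 0)."""
    return x == 0 or x - 1 >= 0

base = phi(Fraction(0))                               # phi(0)
step = all(not phi(x) or phi(S(x)) for x in SAMPLES)  # phi(x) -> phi(S(x))
conclusion = all(phi(x) for x in SAMPLES)             # forall x phi(x)
counterexamples = sorted({x for x in SAMPLES if not phi(x)})

print(holds_on_samples(), base, step, conclusion, counterexamples)
```

Both hypotheses of the induction instance hold, yet the conclusion fails at the sampled rationals strictly between 0 and 1, which are neither zero nor successors.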

The axioms of ZF are a trifle more complex than those of PA, but then they are supposed to describe all of mathematics! We use the following abbreviations:

$$
\forall\_{\mathbf{x},\mathbf{y}} \equiv \forall\_{\mathbf{x}} \forall\_{\mathbf{y}};\tag{\mathbf{D}.118}
$$

$$
\alpha \leftrightarrow \beta \equiv (\alpha \to \beta) \land (\beta \to \alpha); \tag{D.119}
$$

$$
\mathbf{x} \neq \mathbf{y} \equiv \neg(\mathbf{x} = \mathbf{y});\tag{\text{D.120}}
$$

$$
\mathbf{x} \notin \mathbf{y} \equiv \neg(\mathbf{x} \in \mathbf{y}).\tag{\mathbf{D}.121}
$$

Other notation of ZF will be explained in the text following the axioms, which are:

ZF1 ∀<sub>*x*,*y*</sub>((∀<sub>*z*</sub>(*z* ∈ *x* ↔ *z* ∈ *y*)) ↔ *x* = *y*) (*Extensionality*)

ZF2 ∀<sub>*x*</sub>∃<sub>*y*</sub>∀<sub>*z*</sub>(((*z* ∈ *x*) ∧ ϕ(*z*)) ↔ *z* ∈ *y*) (*Separation*)

ZF3 ¬∃<sub>*x*</sub> *x* ∈ ∅ (*Empty set*)

ZF4 ∀<sub>*v*,*w*</sub>∃<sub>*y*</sub>∀<sub>*z*</sub>(*z* ∈ *y* ↔ (*z* = *v*) ∨ (*z* = *w*)) (*Pairing*)

ZF5 ∀<sub>*x*</sub>∃<sub>*y*</sub>∀<sub>*z*</sub>(*z* ∈ *y* ↔ ∃<sub>*w*∈*x*</sub> *z* ∈ *w*) (*Union*)

ZF6 ∀<sub>*x*</sub>∃<sub>*y*</sub>∀<sub>*z*</sub>(*z* ∈ *y* ↔ *z* ⊂ *x*) (*Power set*)

ZF7 ∃<sub>*x*</sub>(∅ ∈ *x* ∧ ∀<sub>*y*</sub>(*y* ∈ *x* → *y*<sup>+</sup> ∈ *x*)) (*Infinity*)

ZF8 ∀<sub>*u*</sub>((∀<sub>*x*∈*u*</sub>∃!<sub>*z*</sub>ϕ(*x*,*z*)) → ∃<sub>*y*</sub>∀<sub>*z*</sub>(*z* ∈ *y* ↔ ∃<sub>*x*∈*u*</sub>ϕ(*x*,*z*))) (*Replacement*)

ZF9 ∀<sub>*v*≠∅</sub>∃<sub>*x*∈*v*</sub>∀<sub>*y*</sub>(*y* ∈ *x* → *y* ∉ *v*) (*Regularity*)

AC ∀<sub>*u*</sub>∃<sub>*w*</sub>((*w* ⊂ P(*u*)×*u*) ∧ (∀<sub>*x*∈P(*u*)</sub>(*x* ≠ ∅ → ∃!<sub>*y*∈*x*</sub> < *x*, *y* > ∈ *w*))) (*Choice*)

In ZF2 and ZF8, ϕ(·) is an arbitrary formula with at least the specified free variables, so that these axioms are more properly thought of as *axiom schemes*.

These axioms have been the subject of entire monographs, but we will be brief here. All intuition about the axioms comes from "naive" sets, although the whole point should be that the axioms stand on their own, and circumvent the problem of defining sets conceptually (as Frege and Cantor desperately tried to do, much as Euclid tried to define in vain what a point is, before he was liberated by Hilbert). The axioms may be put into two groups: Axioms ZF1, ZF3, ZF9, and AC are concerned with *given* sets, whereas nos. ZF2, ZF4, ZF5, ZF6, ZF7, and ZF8 regulate the way *new* sets may be constructed from old ones. Here are some comments on the axioms one by one (which should, however, be seen as a whole).

ZF1 states that a set is determined by its members (which themselves are sets!).

ZF2 is a correct version of the naive idea of Cantor, Dedekind, and Frege that every property (or predicate) defines a set. If we look at a predicate as a formula ϕ(*z*) stating that *z* has a certain property, the naive idea of these gentlemen was that *y* = {*z* | ϕ(*z*)} is a set. This idea would be secured by the axiom

$$
\exists\_y \forall\_z (\varphi(z) \leftrightarrow z \in y), \tag{D.122}
$$

which however leads to Russell's Paradox (in which ϕ(*z*) ≡ *z* ∉ *z*).

The crucial difference between ZF2 and this naive version is that in ZF one restricts set formation to those *z* that satisfy ϕ(*z*) *and are a member of some set x that is already given*. By ZF1, the set *y* defined by ZF2 is unique; it is written as

$$y \equiv \{ z \in x \mid \varphi(z) \}. \tag{D.123}$$

This notation introduces the familiar brackets {··· } from naive set theory, which are therefore derived concepts not belonging to the notation of ZF. This is also true for most of the other symbols from naive set theory (except ∈, which is a predicate symbol in ZF). For example, for arbitrary "sets" *x* and *v* (which so far are really just variables in ZF), we introduce *x*∩*v* as a name (i.e., an abbreviation) for the set *y* defined by taking ϕ(*z*) in ZF2 to be *z* ∈ *v*. Using the notation (D.123), this *defines* the symbol ∩ (for "intersection") by

$$x \cap v \equiv \{ z \in x \mid z \in v \}. \tag{D.124}$$
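For hereditarily finite sets, the notation (D.123) and the derived symbol ∩ of (D.124) can be mimicked in a few lines of Python. This is only a sketch under the assumption that sets are modelled by `frozenset` objects and formulas ϕ(*z*) by Python predicates; the names `separation`, `intersection`, and `russell` are ours. The last lines illustrate why restricting Russell's predicate *z* ∉ *z* to a given set *x* is harmless: the resulting set is simply not a member of *x*.

```python
def separation(x, phi):
    """The set {z in x | phi(z)} of (D.123), for a given set x."""
    return frozenset(z for z in x if phi(z))

def intersection(x, v):
    """x ∩ v as in (D.124): take phi(z) to be 'z in v'."""
    return separation(x, lambda z: z in v)

empty = frozenset()
a = frozenset({empty})                 # {∅}
x = frozenset({empty, a})              # {∅, {∅}}
v = frozenset({a, frozenset({a})})     # {{∅}, {{∅}}}

print(intersection(x, v) == frozenset({a}))

# Russell's predicate, restricted to the given set x:
russell = separation(x, lambda z: z not in z)
print(russell == x, russell in x)
```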

ZF3 states that ∅, which is the only constant in ZF, has no elements. According to ZF1 this set is unique, so that ∅ may be thought of as *the* empty set. In particular, ZF3 implies that there are sets in the first place (instead of defining it as a constant, one could alternatively introduce the symbol ∅ at this stage). An equivalent form of ZF3 is: ∀<sub>*x*</sub>¬(*x* ∈ ∅), also written as ∀<sub>*x*</sub> *x* ∉ ∅.

ZF4 states that for given sets *v* and *w*, there exists a set *y* with exactly those two members. We write this *y* as *y* = {*v*,*w*}, which uses brackets {··· } consistently: in ZF2 take ϕ(*z*) to be (*z* = *v*)∨(*z* = *w*) and take *x* to be the *y* just considered. This may be iterated, so that we may write {*x*1,..., *xn*} for the set *y* that satisfies

$$\forall \mathbf{x}\_1, \dots, \mathbf{x}\_n \exists \mathbf{y} \forall z (z \in \mathbf{y} \leftrightarrow (z = \mathbf{x}\_1) \lor \dots \lor (z = \mathbf{x}\_n));\tag{\mathbf{D.125}}$$

this set is unique by ZF1. Using the notation from ZF2 we may then write

$$\{\mathbf{x}\_1, \dots, \mathbf{x}\_n\} \equiv \{z \in \mathbf{y} \mid (z = \mathbf{x}\_1) \vee \dots \vee (z = \mathbf{x}\_n)\}.\tag{\mathbf{D}.126}$$

ZF5 postulates the existence of a set *y* whose elements are the elements of the elements of *x*. In this axiom, the generic notation

$$\exists\_{w \in x} \psi \equiv \exists\_w ((w \in x) \wedge \psi),\tag{D.127}$$

is used, where ψ is some formula, which in ZF5 is *z* ∈ *w*. We write *y* = ∪*x*, which *defines* the symbol ∪, i.e.,

$$\cup x \equiv \{ z \in y \mid \exists\_{w \in x} z \in w \},\tag{D.128}$$

where *y* = ∪*x* is the set whose existence is guaranteed by ZF5. In the special case *x* = {*x*1,..., *xn*}, we write

$$x\_1 \cup \dots \cup x\_n \equiv \cup \{x\_1, \dots, x\_n\}.\tag{D.129}$$

ZF6 calls for each *x* to have a power set *y*. The notation

$$z \subset x \equiv \forall\_{\mathbf{y}} (\mathbf{y} \in z \to \mathbf{y} \in x), \tag{\mathbf{D}.130}$$

*defines* the symbol ⊂; note that *z* = *x* is allowed, so that ⊂ ≡ ⊆. As usual, the set *y* is unique due to ZF1, and is denoted by P(*x*), whose elements are therefore the subsets *z* of *x*. We may write this à la (D.123) as (*y* being the set from ZF6):

$$\mathcal{P}(x) \equiv \{ z \in y \mid z \subset x \}. \tag{D.131}$$

ZF7 postulates the existence of a set *x* whose elements include

$$\emptyset, \quad \emptyset^{+} = \{\emptyset\}, \quad \{\emptyset\}^{+} = \{\emptyset, \{\emptyset\}\}, \quad \{\emptyset, \{\emptyset\}\}^{+} = \{\emptyset, \{\emptyset\}, \{\emptyset, \{\emptyset\}\}\}, \quad \dots \tag{D.132}$$

in which the notation

$$\mathbf{y}^+ \equiv \bigcup \{ \mathbf{y}, \{ \mathbf{y} \} \} = \mathbf{y} \cup \{ \mathbf{y} \},\tag{\text{D.133}}$$

is underwritten by ZF5. Hence the elements of *y*<sup>+</sup> are the elements of *y*, supplemented with the single element *y*. Following von Neumann, the sets in (D.132) are called 0̇, 1̇, 2̇, 3̇, ..., respectively, where 0̇ is identified with the empty set, and *ṅ* for *n* > 0 is realized in a very specific way. Thus ZF7 states the existence of a set containing 0̇, 1̇, 2̇, 3̇, .... The intersection of all sets with this property is the smallest set containing 0̇, 1̇, 2̇, 3̇, ...; this is the smallest infinite set, called ω. In the standard model of ZF (see below), ω is (a copy of) the set N of natural numbers.
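The successor operation (D.133) and the first few von Neumann numerals are easy to reproduce for finite sets. The sketch below again assumes that sets are modelled by Python `frozenset` objects; the helper names `successor` and `numeral` are ours.

```python
def successor(y):
    """y+ = y ∪ {y}, cf. (D.133)."""
    return y | frozenset({y})

def numeral(n):
    """The n-fold successor of the empty set."""
    s = frozenset()
    for _ in range(n):
        s = successor(s)
    return s

three = numeral(3)

# The numeral for 3 has exactly the numerals for 0, 1, 2 as its elements.
print(three == frozenset({numeral(0), numeral(1), numeral(2)}))

# Each smaller numeral is both an element and a proper subset of a larger one.
print(all(numeral(k) in three and numeral(k) < three for k in range(3)))
```

In this realization the numeral for *n* has exactly *n* elements, and the order *k* < *n* on numerals coincides with both membership and proper inclusion.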

ZF8, in which ϕ should not contain *y*, states that if some formula ϕ(*x*,*z*) assigns exactly one *z* to any given *x*, then these *z* form a set, provided the variables *x* form a set (i.e., *u*). Such a formula ϕ is really a function *f* so that *f*(*x*) = *z*, and hence this axiom states that the image of any set under some function is again a set. Using the notation (D.123), we then have

$$f(u) = \{ z \in y \mid \exists\_{x \in u} \varphi(x, z) \}. \tag{D.134}$$

ZF9 is the most contrived axiom in ZF, stating that every nonempty set *v* contains some element *x* disjoint from *v*. Its formulation uses the generic abbreviation

$$\forall\_{v \neq \emptyset} \psi \equiv \forall\_v ((\exists\_z z \in v) \to \psi) \tag{D.135}$$

Using the symbol ∩ from (D.124), one easily checks that ∀<sub>*y*</sub>(*y* ∈ *x* → *y* ∉ *v*) is the same as *x* ∩ *v* = ∅, in terms of which ZF9 reads

$$\vdash \forall\_{v \neq \emptyset} \exists\_{x \in v} \left( x \cap v = \emptyset \right). \tag{D.136}$$

This implies *x* ∉ *x*, which avoids all kinds of paradoxes (though not Russell's, which was taken care of by ZF2). Moreover, ZF9 enables transfinite induction.

AC warrants the choice of an element of each nonempty subset of any set. Indeed, rewriting the expression ∃!<sub>*y*∈*x*</sub> (*x*, *y*) ∈ *w* as ∃!<sub>*y*∈*u*</sub>((*x*, *y*) ∈ *w* ∧ *y* ∈ *x*), AC reads

$$\vdash \forall\_u \exists\_w ((w \subset \mathcal{P}(u) \times u) \land (\forall\_{x \in \mathcal{P}(u)} (x \neq \emptyset \to \exists!\_{y \in u} ((x, y) \in w \land y \in x)))).\tag{D.137}$$

As we shall see shortly, this shows that there exists a function

$$f: \mathcal{P}(\mathfrak{u}) \longrightarrow \mathfrak{u} \tag{\text{D.138}}$$

that maps *x* ∈ P(*u*) to *f*(*x*) ∈ *u*, such that ∀<sub>*x*∈P(*u*)</sub>(*x* ≠ ∅ → *f*(*x*) ∈ *x*). Although ∃<sub>*f*</sub> is undefined in (first-order) ZF, one may therefore *informally* rewrite AC as

$$\forall\_u \exists\_{f \colon \mathcal{P}(u) \longrightarrow u} \forall\_{x \in \mathcal{P}(u), x \neq \emptyset} f(x) \in x.\tag{D.139}$$
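For a *finite* set *u* a choice function can be written down explicitly, so that AC is not needed there; the axiom has content only for infinite families. Still, a small Python sketch may clarify what the set *w* in AC looks like. We assume that *u* consists of integers, so that "take the minimum" is an available rule, and the names `power_set` and `choice_graph` are ours. Since AC only constrains nonempty *x*, the graph below simply omits the empty set.

```python
from itertools import combinations

def power_set(u):
    """P(u) for a finite set u, as a frozenset of frozensets."""
    items = list(u)
    return frozenset(frozenset(c)
                     for r in range(len(items) + 1)
                     for c in combinations(items, r))

def choice_graph(u):
    """A set w of pairs (x, y) with y in x, one pair per nonempty subset x of u."""
    return {(x, min(x)) for x in power_set(u) if x}

u = frozenset({1, 2, 3})
w = choice_graph(u)
f = dict(w)  # the choice function encoded by the graph w

# f(x) lies in x for every nonempty subset x of u
print(all(f[x] in x for x in power_set(u) if x))
```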

We now formally define *functions*, which, as already noted, are curiously absent in ZF (which lacks function symbols). This relies on the following theorem of ZF:

$$\vdash \forall\_{u,v} \forall\_{x,y} \left( (x \in u) \land (y \in v) \right) \to \{ \{x\}, \{x,y\} \} \in \mathcal{P}(\mathcal{P}(u \cup v)).\tag{D.140}$$

We now introduce the abbreviation

$$< x, y > \equiv \{ \{x\}, \{x,y\} \},\tag{D.141}$$

which by (D.140) is an element of the double power set P(P(*u*∪*v*)) (assuming that *x* ∈ *u* and *y* ∈ *v*); this notation makes < *x*, *y* > an *ordered* pair, as opposed to {*x*, *y*} = {*y*, *x*}. The *(cartesian) product* of two sets *u* and *v* is now defined as the set

$$u \times v \equiv \{ z \in \mathcal{P}(\mathcal{P}(u \cup v)) \mid \exists\_{x \in u} \exists\_{y \in v} z = < x, y > \}, \tag{D.142}$$

i.e., in ZF2 we substitute *x* ⇝ P(P(*u*∪*v*)) as well as

$$\varphi(z) \leadsto \exists\_{x \in u} \exists\_{y \in v} z = < x, y >, \tag{D.143}$$

and denote the (unique) set *y* thus defined by *u*×*v*. Informally, one often writes

$$
u \times v = \{< x, y > \mid x \in u, y \in v\}.\tag{D.144}
$$
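The ordered pair (D.141) and the product (D.142) can be checked on small finite sets. The following Python sketch assumes that sets are modelled by `frozenset` objects; the names `pair`, `power_set`, and `product_by_separation` are ours. The last function follows (D.142) literally, separating the pairs out of the double power set, which is feasible only for very small *u* and *v*.

```python
from itertools import combinations

def pair(x, y):
    """Ordered pair < x, y > = {{x}, {x, y}}, cf. (D.141)."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def power_set(s):
    items = list(s)
    return frozenset(frozenset(c)
                     for r in range(len(items) + 1)
                     for c in combinations(items, r))

def product_by_separation(u, v):
    """u × v as in (D.142): separate the pairs out of P(P(u ∪ v))."""
    candidates = power_set(power_set(u | v))
    return frozenset(z for z in candidates
                     if any(z == pair(x, y) for x in u for y in v))

u = frozenset({0, 1})
v = frozenset({1, 2})
uxv = product_by_separation(u, v)

print(pair(1, 2) != pair(2, 1))                     # pairs are ordered
print(frozenset({1, 2}) == frozenset({2, 1}))       # plain sets are not
print(uxv == frozenset(pair(x, y) for x in u for y in v))
```

Note that < *x*, *x* > collapses to {{*x*}}, which is still distinguishable from every pair with two different entries.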

We are now in a position to define functions in ZF set theory:

Definition D.15. *A* function *f* : *u* → *v is a subset G<sub>f</sub>* ⊂ *u*×*v for which*

$$\forall\_{x \in u} \exists!\_{y \in v} < x, y > \in G\_f. \tag{D.145}$$

Here ∃!<sub>*y*∈*v*</sub>ψ(*y*) abbreviates

$$\exists\_y ((y \in v) \land (\forall\_z (\psi(z) \leftrightarrow z = y))),\tag{D.146}$$

(cf. (D.127)), which yields (D.145) upon the substitution ψ(*y*) ⇝ < *x*, *y* > ∈ *G<sub>f</sub>*. More generally, one has

$$
\exists!\_y \psi(y) \equiv \exists\_y \forall\_z (\psi(z) \leftrightarrow z = y).\tag{D.147}
$$

Hence in ZF set theory a function *f* is defined by (or even identified with) its graph *G<sub>f</sub>*, which closes the historical circle: Newton clearly looked at what we now call functions through their graphs, upon which Euler began to assign some value *f*(*x*) ∈ *v* to *x* ∈ *u* (though always through some concrete prescription). The 19th century brought the abstract idea of a function as a map between sets, which, as we just saw, ZF set theory replaced by the view that a function is defined by its graph.
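Definition D.15 translates directly into a test on finite data. In the sketch below, Python tuples `(x, y)` stand in for the ordered pairs < *x*, *y* > (an assumption made for readability), and the names `is_function` and `apply` are ours.

```python
def is_function(graph, u, v):
    """Check (D.145): graph ⊂ u × v, and each x in u has exactly one y in v."""
    if not all(x in u and y in v for (x, y) in graph):
        return False
    return all(sum(1 for y in v if (x, y) in graph) == 1 for x in u)

def apply(graph, x):
    """Recover the value f(x) from the graph of a function f."""
    (y,) = [b for (a, b) in graph if a == x]
    return y

u = {0, 1, 2}
v = {0, 1, 2, 3, 4}
square = {(x, x * x) for x in u}

print(is_function(square, u, v))               # a genuine function
print(is_function(square | {(1, 3)}, u, v))    # two values at x = 1
print(is_function(square - {(2, 4)}, u, v))    # no value at x = 2
print(apply(square, 2))
```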

Compared to the standard interpretation of PA in the natural numbers, which was a special case of the general notion of a model described in §D.4, the *standard model of* ZF is unusual, in that its carrier is not a set (but a so-called *class*), called the *set-theoretic universe* (or *cumulative hierarchy*) V, whose construction was first given by none other than von Neumann, whose name already pervaded this book. We will not go into the details of this construction except by noting that, much as the natural numbers may be built from zero by repeated use of the successor function *S*, the universe V is constructed from the empty set ∅ by "repeated" use of:


However, what is really meant here by "repeated" defies imagination (and may drive one crazy); fortunately, most of mathematics only uses the lower echelons of V.

Furthermore, interpreting the constant ∅ by the usual empty set (with the same name), the interpretation ε of ∈ in V needs to be defined. This is done as follows:


Here *V*, *Y*, and *Z* are sets in V. Applying these rules "iteratively" (see, however, the above comment on "repeated"), for all sets *X* and *Y* in V, it can in principle be established whether or not *X*ε*Y*, so that the symbol ε is defined within V. Having access to the universe V, ε, and the empty set ∅, one may then define the interpretation [[ϕ]]<sup>V</sup> of some formula ϕ of ZF in V by the following rules (cf. PA and N):


A sentence ϕ in ZF is then *true*, denoted by V ⊨ ϕ, if [[ϕ]]<sup>V</sup> is true. For example, all axioms of ZF are true in this interpretation (which is by no means trivial!).

In particular, in this model we interpret *ṅ* (see the explication of ZF7 above) as the *n*-fold iteration of the successor operation to ∅, i.e., *ṅ* = ∅<sup>+···+</sup> (with *n* pluses), seen as an element of V, and recover the standard model of the natural numbers (and hence the carrier of the standard interpretation of PA) as N = ∪<sub>*n*</sub>*ṅ*, which is the intersection of all sets in V that contain all sets *ṅ* (for any finite *n*).

#### Notes

The "modernist" transformation of mathematics led by Hilbert, including its complete prehistory and aftermath, is delightfully described in Gray (2008). The revolutionary nature of Hilbert's views, which started with his influential book *Grundlagen der Geometrie* from 1899, is nowhere clearer than from his correspondence with Frege (cf. Gabriel et al, 1980), who, though one of the fathers of the formalisation of mathematics (specifically through first-order logic), infuriated Hilbert by stating that the latter did not bother to define the notions of "point" or "line" because Hilbert assumed these to be familiar to his readers. But no, quite to the contrary:

'Hier liegt wohl der Cardinalpunkt des Misverständnisses (. . . ) Ich will nichts als bekannt voraussetzen (. . . ) Wenn ich unter meinen Punkten irgendwelche Systeme von Dingen, z.B. das System: Liebe, Gesetz, Schornsteinfeger ..., denke und dann meine sämtlichen Axiome als Beziehungen zwischen diesen Dingen annehme, so gelten meine Sätze, z.B. der Pythagoras, auch von diesen Dingen.' (Hilbert to Frege, 29-12-1899).<sup>1</sup>

This may be an exaggeration, however. Einstein probably came closer to the truth:

'An dieser Stelle nun taucht ein Rätsel auf, das Forscher aller Zeiten so viel beunruhigt hat. Wie ist es möglich, daß die Mathematik, die doch ein von aller Erfahrung unabhängiges Produkt des menschlichen Denkens ist, auf die Gegenstände der Wirklichkeit so vortrefflich paßt? Kann denn die menschliche Vernunft ohne Erfahrung durch bloßes Denken Eigenschaften der wirklichen Dinge ergründen?

Hierauf ist nach meiner Ansicht kurz zu antworten: Insofern sich die Sätze der Mathematik auf die Wirklichkeit beziehen, sind sie nicht sicher, und insofern sie sicher sind, beziehen sie sich nicht auf die Wirklichkeit.' (Einstein, 1921).<sup>2</sup>

The great irony is that Hilbert's call for abstraction, which at first sight decoupled mathematics from its origins in physics and other applications, in fact very rapidly led to the deepest applications of mathematics to physics so far, such as the use of (pseudo) Riemannian geometry in general relativity, and the use of Hilbert (!) spaces and operator algebras in quantum mechanics. In the present book, a high point of this paradox is the use of Grothendieck toposes (cf. Appendix E) in quantum mechanics (see Chapter 12), especially because Grothendieck himself almost made a sport of extreme abstraction, partly motivated by internal mathematical needs in algebraic geometry, but undoubtedly also by his indignation about the use of (mathematical) physics for military purposes (which put him diametrically against von Neumann).

<sup>1</sup> This is surely the central point of the misunderstanding (. . . ) I do not want to assume anything as known (. . . ) If I interpret my notions by arbitrary things, for example, by the system: love, law, chimney sweeper, and subsequently interpret my axioms as relations between these things, then my theorems, like the one of Pythagoras, hold about these things. (Translation by the author)

<sup>2</sup> At this point an enigma presents itself, which in all ages has agitated inquiring minds. How can it be that mathematics, being after all a product of human thought which is independent of experience, is so admirably appropriate to the objects of reality? Is human reason, then, without experience, merely by taking thought, able to fathom the properties of real things?

In my opinion the answer to this question is, briefly, this: as far as the propositions of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality. (Translation: Sonja Bargmann)

## §D.1. Order theory and lattices

For lattice theory in general and Stone's Theorem see Givant & Halmos (2009), Davey & Priestley (2002), and Johnstone (1982). For (D.36) - (D.37) see Theorem 33 in Chapter 35 of Givant & Halmos (2009).

## §D.2. Propositional logic

Halmos & Givant (1998) is an elementary exposition of the connection between Boolean lattices and logic. Other useful (propositional as well as first-order) logic texts include Bell & Machover (1977), Johnstone (1987), Kaye (2007), and Mendelson (2010).

## §D.3. Intuitionistic propositional logic

Key writings on intuitionism (at least from the Dutch school) include Brouwer (1907, 1918, 1975), Heyting (1956) and Troelstra & van Dalen (1988). See also Dummett (2000) for a view from abroad. Our treatment of Kripke models for intuitionistic propositional logic is taken from Goldblatt (1984) and Palmgren (2009).

## §D.4. First-order (predicate) logic

For the history of first-order logic see Grattan-Guinness (2000) and Mancosu, Zach, & Badesa (2004), plus innumerable books about Frege, Russell, Hilbert, etc. It is regrettable that the close companionship of mathematics and philosophy at the time, whose cross-fertilization has given us both the modern foundations of mathematics on the one hand and analytic philosophy on the other, has not lasted.

## §D.5. Arithmetic and set theory

For PA see e.g. Kaye (1991), which focuses on non-standard models. The bible of ZF set theory is Jech (2006).

## Appendix E Category theory and topos theory

This appendix gives a brief introduction to category theory, moving towards the particular categories that are of interest to quantum theory (viz. categories of presheaves and sheaves) as quickly as possible (but not more quickly). However, even the basic setup of category theory is already relevant for e.g. the conceptually most satisfactory formulation of Gelfand duality, as described below Theorem C.23 (see also Theorem C.45), and likewise of Stone duality, see Theorem D.5. Otherwise, this material will only be used in Chapter 12 on quantum logic. We omit most proofs.

*Categories* were originally introduced by Eilenberg & Mac Lane (1945) in order to define *natural transformations*, through which they formalized (and explained) the intuition that certain isomorphisms in mathematics are "natural" or "canonical" (like the one between the second dual *V*∗∗ of a finite-dimensional vector space *V* and *V* itself, as opposed to the isomorphism between *V*∗ and *V*). Natural transformations are predicated on *categories* and *functors*, i.e. maps between categories, which are analogous to continuous functions between topological spaces, and in turn give rise to new categories, similarly to functions giving rise to function spaces in functional analysis. Initially meant to organize certain fields of mathematics in a systematic way (such as algebraic topology and homological algebra), categories soon became objects of study in their own right. As such, the basic vocabulary of category theory is completed by defining *adjoint functors* (invented by Kan in 1958) and *(co)limits*.

*Toposes* are categories with enough structure to support the interpretation of first-order (and even higher-order) intuitionistic logic, similar to set theory providing semantics for classical predicate logic, which in turn generalizes the relationship between propositional logic and Boolean algebra, cf. §D. In this respect, the presence of a *truth object* (i.e. *subobject classifier*) partly explains their potential relevance to quantum mechanics. However, toposes were introduced in the 1960s by Grothendieck from a completely different motivation, namely algebraic geometry, and were originally seen by him as generalizations of topological spaces. This aspect plays an equally important role for quantum mechanics, and hence we quote:

'A startling aspect of topos theory is that it unifies two seemingly wholly distinct mathematical subjects: on the one hand, topology and algebraic geometry, and on the other hand, logic and set theory.' (Mac Lane & Moerdijk, 1992, p. 1).

#### E.1 Basic definitions

The definition of a category emphasizes the idea that one is at least as interested in the maps between objects as in the objects themselves. The only complication (which we ignore) is the use of *classes*; categories are often too big to be sets, and hence they require an axiomatization of mathematics different from standard ZF set theory (such as von Neumann–Bernays–Gödel set theory or algebraic set theory).

Definition E.1. *A* category C = (C1,C0,*i*,*s*,*t*,*m*) *consists of:*


$$\mathsf{C}\_1 \times\_{\mathsf{C}\_0} \mathsf{C}\_1 = \{(f, g) \in \mathsf{C}\_1 \times \mathsf{C}\_1 \mid s(f) = t(g)\},\tag{E.1}$$

*such that, writing f g* ≡ *m*(*f*,*g*) *and* id<sub>*x*</sub> ≡ *i*(*x*)*,*

$$\mathbf{s}(fg) = \mathbf{s}(\mathbf{g});\tag{\mathbf{E.2}}$$

$$\mathfrak{t}(f\mathfrak{g}) = \mathfrak{t}(f);\tag{\mathbb{E}.3}$$

$$(fg)h = f(gh);\tag{E.4}$$

$$s(\text{id}\_{x}) = t(\text{id}\_{x}) = x; \tag{E.5}$$

$$f\text{id}\_{s(f)} = \text{id}\_{t(f)}f = f. \tag{E.6}$$

Note that (E.4) is well defined by virtue of (E.2) - (E.3). We often write $x \xrightarrow{f} y$ or *f* : *x* → *y* or, even better in principle but cumbersome in practice (see below), $y \xleftarrow{f} x$, when *f* ∈ C<sub>1</sub> satisfies *s*(*f*) = *x* and *t*(*f*) = *y*, and interpret *f* as an arrow from *x* to *y*, so that id<sub>*x*</sub> is an arrow from *x* to *x*. Composition *f* ◦ *g* ≡ *f g* of arrows is defined whenever *s*(*f*) = *t*(*g*) (so that on paper the preferred direction of an arrow is from right to left!). Arrow composition is associative whenever defined, and each *i*(*x*) acts as an identity under this composition operation. The class of all arrows from *x* to *y* in a category C is sometimes written as Hom<sub>C</sub>(*x*, *y*), or simply as Hom(*x*, *y*), when C is unambiguous. A category is called *small* if both C<sub>0</sub> and C<sub>1</sub> are sets (otherwise, a category is called *large*), and *locally small* if for each *x*, *y* ∈ C<sub>0</sub> the class Hom<sub>C</sub>(*x*, *y*) is a set (although C<sub>1</sub> itself may be a proper class). All categories used in this book are locally small (though not necessarily small). Here are some examples.
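For a small category with finitely many arrows, the conditions (E.2) - (E.6) can be verified by brute force. The Python sketch below does this for an illustrative example of our own choosing: the three-element poset {0, 1, 2}, regarded as a category with exactly one arrow, encoded as the tuple `(x, y)`, whenever *x* ≤ *y*. The function names mirror the maps *s*, *t*, *i*, *m* of the definition.

```python
# Poset category on {0, 1, 2}: one arrow (x, y) whenever x <= y (illustrative choice).
objects = [0, 1, 2]
arrows = [(x, y) for x in objects for y in objects if x <= y]

def s(f):
    return f[0]          # source

def t(f):
    return f[1]          # target

def i(x):
    return (x, x)        # identity arrow id_x

def m(f, g):
    """Composite fg, defined only when s(f) = t(g)."""
    if s(f) != t(g):
        raise ValueError("arrows are not composable")
    return (s(g), t(f))

composable = [(f, g) for f in arrows for g in arrows if s(f) == t(g)]

def is_category():
    ok_st = all(s(m(f, g)) == s(g) and t(m(f, g)) == t(f)
                for f, g in composable)                          # (E.2), (E.3)
    ok_assoc = all(m(m(f, g), h) == m(f, m(g, h))
                   for f, g in composable
                   for h in arrows if s(g) == t(h))              # (E.4)
    ok_id = all(s(i(x)) == x and t(i(x)) == x for x in objects)  # (E.5)
    ok_unit = all(m(f, i(s(f))) == f and m(i(t(f)), f) == f
                  for f in arrows)                               # (E.6)
    return ok_st and ok_assoc and ok_id and ok_unit

print(is_category())
```

Here transitivity of ≤ guarantees that composites are again arrows, and reflexivity provides the identities.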


Categories come with an intrinsic notion of isomorphism: one calls two objects *x*, *y* ∈ C<sub>0</sub> *isomorphic*, written *x* ≅ *y*, when there exist arrows *f* : *x* → *y* and *g* : *y* → *x* such that *f g* = id<sub>*y*</sub> and *g f* = id<sub>*x*</sub>. For example, two sets are in bijective correspondence iff they are isomorphic objects in Sets, two topological spaces are homeomorphic iff they are isomorphic in the category of topological spaces and continuous maps, and two C\*-algebras are isomorphic in the sense of Definition C.2 iff they are isomorphic in CA, where we define the following useful categories of C\*-algebras:


Here the notion of a *subcategory* C ⊂ D is the obvious one, i.e. C<sub>0</sub> ⊂ D<sub>0</sub>, C<sub>1</sub> ⊂ D<sub>1</sub>, and C is a category by itself (in particular, C is closed under the maps *s*, *t*, *i*, *m*). We say that C is a *full subcategory* of D if Hom<sub>C</sub>(*x*, *y*) = Hom<sub>D</sub>(*x*, *y*) for all *x*, *y* ∈ C<sub>0</sub>.

We now define the "canonical" maps between categories (which, in the spirit of the subject, are often more important than the underlying categories themselves!).

Definition E.2. *Let* C *and* D *be categories. A* covariant functor *or simply* functor *F* : C → D *consists of a pair of maps F<sub>i</sub>* : C<sub>*i*</sub> → D<sub>*i*</sub>*, i* = 0,1*, such that:*

$$i\_{\mathsf{D}} \circ F\_0 = F\_1 \circ i\_{\mathsf{C}};\tag{E.7}$$

$$\mathbf{s\_D} \circ F\_1 = F\_0 \circ \mathbf{s\_C};\tag{\mathcal{E}.8}$$

$$t\_{\mathbb{D}} \circ F\_1 = F\_0 \circ t\_{\mathbb{C}};\tag{E.9}$$

$$F\_1(fg) = F\_1(f)F\_1(g) \ (f, g \in \mathbb{C}\_2),\tag{E.10}$$

*where i*<sub>D</sub> : D<sub>0</sub> → D<sub>1</sub> *is the inclusion map in* D*, etc.*

*A* contravariant functor *F* : C → D *is a pair F<sub>i</sub>* : C<sub>*i*</sub> → D<sub>*i*</sub>*, i* = 0,1*, such that:*

$$i\_{\mathsf{D}} \circ F\_0 = F\_1 \circ i\_{\mathsf{C}};\tag{E.11}$$

$$s\_{\mathsf{D}} \circ F\_1 = F\_0 \circ t\_{\mathsf{C}};\tag{E.12}$$

$$t\_{\mathsf{D}} \circ F\_1 = F\_0 \circ s\_{\mathsf{C}};\tag{E.13}$$

$$F\_1(fg) = F\_1(g)F\_1(f) \ (f, g \in \mathbb{C}\_2). \tag{E.14}$$

It follows that *F*<sub>0</sub> is determined by *F*<sub>1</sub>, since *i* is injective, but nonetheless it is useful to keep them apart. The use of contravariant functors may be avoided by introducing the *opposite category* C<sup>op</sup> of C, which has the same objects and arrows as C, but the latter going in the opposite direction (i.e. *s*<sub>C<sup>op</sup></sub> = *t*<sub>C</sub>, etc.). For example, if C = X is a preorder, in the category X<sup>op</sup> the order is reversed. A contravariant functor *F* : C → D is then obviously the same thing as a covariant functor *F* : C → D<sup>op</sup>, or, equivalently, *F* : C<sup>op</sup> → D. This is very important for us, because Gelfand duality is based on *contravariant* functors and hence on *opposite* categories; see below.
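The conditions of Definition E.2 can likewise be tested on finite data. The following Python sketch is an illustration under our own assumptions: both C and D are the poset {0, 1, 2} seen as a category (one arrow `(x, y)` whenever *x* ≤ *y*), a monotone map induces a covariant functor, and an order-reversing map induces a contravariant one. All function names are ours.

```python
# Poset category on {0, 1, 2}, used both as source C and as target D (illustrative).
objects = [0, 1, 2]
arrows = [(x, y) for x in objects for y in objects if x <= y]

def s(f):
    return f[0]

def t(f):
    return f[1]

def i(x):
    return (x, x)

def m(f, g):
    """Composite fg, defined only when s(f) = t(g)."""
    if s(f) != t(g):
        raise ValueError("arrows are not composable")
    return (s(g), t(f))

composable = [(f, g) for f in arrows for g in arrows if s(f) == t(g)]

def is_functor(F0, F1, contravariant=False):
    src, tgt = (t, s) if contravariant else (s, t)
    if not all(F1(i(x)) == i(F0(x)) for x in objects):          # (E.7) / (E.11)
        return False
    if not all(s(F1(f)) == F0(src(f)) and t(F1(f)) == F0(tgt(f))
               for f in arrows):                                # (E.8)-(E.9) / (E.12)-(E.13)
        return False
    if contravariant:                                           # (E.14)
        return all(F1(m(f, g)) == m(F1(g), F1(f)) for f, g in composable)
    return all(F1(m(f, g)) == m(F1(f), F1(g)) for f, g in composable)  # (E.10)

def shift(x):                 # monotone map on objects
    return min(x + 1, 2)

def shift_arrow(f):
    return (shift(f[0]), shift(f[1]))

def flip(x):                  # order-reversing map on objects
    return 2 - x

def flip_arrow(f):            # reverses the direction of each arrow
    return (flip(f[1]), flip(f[0]))

print(is_functor(shift, shift_arrow))
print(is_functor(flip, flip_arrow, contravariant=True))
```

Viewing `flip` as a covariant functor from the opposite poset amounts to swapping the roles of `s` and `t` on the source side, which is exactly what the `contravariant` flag does.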

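Definition E.3. *A* natural transformation *between two functors F* : C → D *and G* : C → D *(that are either both covariant or both contravariant) is a map* τ : C<sub>0</sub> → D<sub>1</sub>*, written x* ↦ τ<sub>*x*</sub>*, such that s*<sub>D</sub>(τ<sub>*x*</sub>) = *F*<sub>0</sub>(*x*) *and t*<sub>D</sub>(τ<sub>*x*</sub>) = *G*<sub>0</sub>(*x*) *(in other words,* τ *is a collection of maps* τ<sub>*x*</sub> : *F*<sub>0</sub>(*x*) → *G*<sub>0</sub>(*x*) *indexed by x* ∈ C<sub>0</sub>*), such that the following diagram commutes for all arrows f* : *x* → *y:*

$$\begin{array}{ccc} F\_0(x) & \xrightarrow{\ \tau\_x\ } & G\_0(x) \\ {\scriptstyle F\_1(f)} \downarrow & & \downarrow {\scriptstyle G\_1(f)} \\ F\_0(y) & \xrightarrow{\ \tau\_y\ } & G\_0(y) \end{array} \tag{E.15}$$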

*Two functors F and G as above are called* naturally isomorphic*, written F* ≅ *G, when there exists a natural transformation* τ *between them for which all arrows* τ<sub>*x*</sub> *are invertible (i.e., are isomorphisms).*

It follows that if *F* and *G* are naturally isomorphic, then *F*0(*x*) ∼= *G*0(*x*) for all *x* ∈ C0, but this condition is not sufficient by itself to render *F* and *G* naturally isomorphic, for the isomorphisms τ*<sup>x</sup>* between *F*(*x*) and *G*(*x*) must be compatible with the arrows, as expressed by the diagram in the above definition; this is even the whole point!

Definition E.3 clarifies the idea that the double dual *V*∗∗ of any finite-dimensional vector space *V* is isomorphic to *V* in a "natural" way: namely, the functor ∗∗ from the category of finite-dimensional vector spaces (over C) to itself (with linear maps as arrows) is naturally isomorphic to the identity functor through the natural transformation whose components τ*<sup>V</sup>* : *V* → *V*∗∗ are given by the "Gelfand transform" *v* → *v*ˆ, where ˆ*v*(θ) = θ(*v*) for θ ∈*V*∗. In contrast, the dual *V*<sup>∗</sup> is isomorphic to *V* in an "unnatural" way, in that any isomorphism depends on the choice of a basis.

Definition E.4. *Two categories* C, D *are called* equivalent*, written* C ≃ D*, when there exist (covariant) functors F* : C → D *and G* : D → C *such that F* ◦ *G* ≅ id<sub>D</sub> *and G* ◦ *F* ≅ id<sub>C</sub>*. Similarly,* C *and* D *are called* dual *when there exist* contravariant *functors with the same properties, i.e., if* C *and* D<sup>op</sup> *are equivalent.*

Here id<sub>C</sub> is the identity functor on C, etc. Spelling out what this means, using Definition E.3, yields the commutative diagrams

$$\begin{array}{ccc} G\_0 \circ F\_0(x) & \xrightarrow{\ \tau\_x\ } & x \\ {\scriptstyle G\_1 \circ F\_1(f)}\downarrow & & \downarrow{\scriptstyle f} \\ G\_0 \circ F\_0(y) & \xrightarrow{\ \tau\_y\ } & y \end{array} \tag{E.16}$$

for all *f* : *x* → *y* in C<sub>1</sub>, where each τ<sub>*x*</sub> is invertible, and also, for all *f*′ : *x*′ → *y*′ in D<sub>1</sub>,

$$\begin{array}{ccc} F\_0 \circ G\_0(x') & \xrightarrow{\ \tau'\_{x'}\ } & x' \\ {\scriptstyle F\_1 \circ G\_1(f')}\downarrow & & \downarrow{\scriptstyle f'} \\ F\_0 \circ G\_0(y') & \xrightarrow{\ \tau'\_{y'}\ } & y' \end{array} \tag{E.17}$$

We are now in a position to give a categorical (re)formulation of Gelfand duality. Further to the categories of commutative C\*-algebras CCA1, CCAn, and CCAm defined earlier in this section, this involves the following categories of spaces:


Theorem E.5. *There are categorical equivalences (i.e., dualities if 'op' is omitted):*

$$\mathsf{CCA}\_1 \simeq \mathsf{CH}^{\mathrm{op}};\tag{E.18}$$

$$\mathsf{CCAn} \simeq \mathsf{LCHp}^{\mathrm{op}};\tag{E.19}$$

$$\mathsf{CCAm} \simeq \mathsf{LCH}^{\mathrm{op}}.\tag{E.20}$$

*Proof.* In the proof of Theorem C.23, the maps ev<sub>*X*</sub> provide a natural isomorphism between the functors id<sub>CH</sub> and Σ ◦ *C* from CH to itself, whilst the maps *G*<sub>*A*</sub> perform the same job for the functors id<sub>CCA<sub>1</sub></sub> and *C* ◦ Σ from CCA<sub>1</sub> to itself; the naturality properties (C.40) and (C.41) precisely express commutativity of the above diagrams. Likewise for the other two cases, which restate Theorems C.45 and C.76. □

Similarly, Stone's Theorem D.5 is best seen categorically, stating that the category of Boolean lattices (with homomorphisms preserving ∨, ∧, and ⊥ as arrows) is dual to the category of Stone spaces (as a full subcategory of CH). With hindsight, Stone's Theorem (which predated category theory) was the first such duality result.

Definition E.4 may be *strengthened* by replacing the isomorphisms

$$F \circ G \cong \mathrm{id}\_{\mathsf{D}};\tag{E.21}$$

$$G \circ F \cong \mathrm{id}\_{\mathsf{C}},\tag{E.22}$$

by equalities, i.e.,

$$F \circ G = \mathrm{id}\_{\mathsf{D}};\tag{E.23}$$

$$G \circ F = \mathrm{id}\_{\mathsf{C}}.\tag{E.24}$$

In that case, the categories C and D are called *isomorphic*. However, this is less relevant than the following *weakening* of the first two conditions, called an *adjunction*:

Definition E.6. *Two functors F* : C → D *and G* : D → C *form an* adjoint pair *if there exist natural transformations* η *from* id<sub>C</sub> *to G* ◦ *F (called the* unit *of the adjunction), and* ε *from F* ◦ *G to* id<sub>D</sub> *(called the* counit *of the adjunction), such that the following diagrams of natural transformations (i.e. the* triangle identities*) commute:*

$$\begin{array}{ccc} F & \xrightarrow{\ F\_1 \circ \eta\ } & FGF \\ & {\scriptstyle \mathrm{id}}\searrow & \downarrow{\scriptstyle \varepsilon \circ F\_0} \\ & & F \end{array} \qquad\qquad \begin{array}{ccc} G & \xrightarrow{\ \eta \circ G\_0\ } & GFG \\ & {\scriptstyle \mathrm{id}}\searrow & \downarrow{\scriptstyle G\_1 \circ \varepsilon} \\ & & G \end{array} \tag{E.25}$$

*We write F* ⊣ *G, and say that F is* left-adjoint *to G, or that G is* right-adjoint *to F.*

It is easy to see that if they exist, left or right adjoints are unique up to isomorphism. If we assume that C and D are locally small (in that all classes Hom<sub>C</sub>(*x*, *y*) and Hom<sub>D</sub>(*x*′, *y*′) are sets), then the above definition states that the functors Hom<sub>D</sub>(*F*(−),−) and Hom<sub>C</sub>(−,*G*(−)), both defined from C<sup>op</sup> × D to Sets, are naturally isomorphic. In other words, for each *x* ∈ C<sub>0</sub> and *y*′ ∈ D<sub>0</sub>, we have a bijection:

$$\operatorname{Hom}\_{\mathsf{D}}(F(\mathsf{x}), \mathsf{y}') \cong \operatorname{Hom}\_{\mathsf{C}}(\mathsf{x}, G(\mathsf{y}')) \tag{\mathsf{E.26}}$$

that is natural in both variables *x* and *y*′ (i.e., for each *y*′ ∈ D<sub>0</sub>, the functors Hom<sub>D</sub>(*F*(−), *y*′) and Hom<sub>C</sub>(−,*G*(*y*′)) from C<sup>op</sup> to Sets are naturally isomorphic, and for each *x* ∈ C<sub>0</sub>, the functors Hom<sub>D</sub>(*F*(*x*),−) and Hom<sub>C</sub>(*x*,*G*(−)) from D to Sets are naturally isomorphic). Indeed, the natural bijection (E.26) is given by

$$\left(F(x) \stackrel{f'}{\longrightarrow} y'\right) \mapsto \left(x \stackrel{\eta\_{x}}{\longrightarrow} GF(x) \stackrel{G(f')}{\longrightarrow} G(y')\right);\tag{E.27}$$

$$\left(x \stackrel{f}{\longrightarrow} G(y')\right) \mapsto \left(F(x) \stackrel{F(f)}{\longrightarrow} FG(y') \stackrel{\varepsilon\_{y'}}{\longrightarrow} y'\right). \tag{E.28}$$

This may even be interesting if C = D, and hence *F* : C → C and *G* : C → C. For example, a Heyting algebra *H* (seen as a posetal category) is home to an adjunction

$$(-) \land y \ \dashv \ y \to (-), \tag{E.29}$$

for any fixed *y* ∈ *H*, where, writing (E.29) as *F* ⊣ *G* as usual, we put

$$F\_0(x) = x \land y;\tag{E.30}$$

$$G\_0(x) = (y \to x).\tag{E.31}$$
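For a posetal category, the bijection (E.26) reduces to an equivalence of inequalities, so that (E.29) says that *x* ∧ *y* ≤ *z* iff *x* ≤ (*y* → *z*). The following Python sketch checks this by brute force in one small Heyting algebra. The choice of algebra (the upper sets of a hypothetical three-point poset, ordered by inclusion) and all names (`POINTS`, `LEQ`, `implies`, and so on) are assumptions made for this illustration only.

```python
from itertools import chain, combinations

POINTS = (0, 1, 2)
# Hypothetical poset: 0 <= 1 and 0 <= 2, with 1 and 2 incomparable.
LEQ = {(0, 0), (1, 1), (2, 2), (0, 1), (0, 2)}

def is_upper(s):
    """True if s is an upper set: p in s and p <= q imply q in s."""
    return all(q in s for p in s for q in POINTS if (p, q) in LEQ)

def upper_sets():
    subsets = chain.from_iterable(
        combinations(POINTS, r) for r in range(len(POINTS) + 1)
    )
    return [frozenset(s) for s in subsets if is_upper(frozenset(s))]

H = upper_sets()  # the Heyting algebra, ordered by inclusion

def meet(x, y):
    return x & y

def implies(y, z):
    """Heyting implication: the largest w in H with meet(w, y) <= z."""
    result = frozenset()
    for w in H:
        if meet(w, y) <= z:
            result |= w
    return result

def adjunction_holds():
    """Check that meet(x, y) <= z iff x <= implies(y, z) for all x, y, z."""
    return all(
        (meet(x, y) <= z) == (x <= implies(y, z))
        for x in H for y in H for z in H
    )
```

In this example the algebra is not Boolean: with `neg = lambda x: implies(x, frozenset())`, the double negation of `{1, 2}` is the whole set `{0, 1, 2}`.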

Definition E.4 of an equivalence of categories involves an adjunction *F* ⊣ *G* whose unit and counit are both natural *isomorphisms*, as opposed to mere natural *transformations*, as in Definition E.6 of an adjunction. In that case, *G* is an inverse to *F* up to isomorphism of objects (which still falls short of an exact inverse, which as mentioned would lead to the less important notion of isomorphism of categories). But even for an adjunction, one may regard *G* as a weak kind of inverse to *F*, which allows one to move between categories in the direction opposite to *F*.

Other than equivalences of categories, the traditional examples of adjunctions yield left adjoints to so-called *forgetful functors*, which strip some class of mathematical objects of (some of) its structure. For example, if Grp is the category of groups and homomorphisms, the forgetful functor *G* : Grp → Sets sends a given group to its underlying set; this functor has a left adjoint *F* : Sets → Grp that assigns the free group on a set *X* to *X*. Similarly for vector spaces, Boolean algebras, etc.

We now move on to *limits* and *colimits*, whose general definition we precede by a few special cases. These abstract the corresponding constructions from Sets (and hence pave the way for topos theory, which resembles set theory in various ways), so that for the right "feeling" we switch to labeling objects in a category by capitals.

Definition E.7. *Let* C *be a category (for simplicity assumed to be locally small).*

• *A* product *of a pair X*, *Y* ∈ C<sub>0</sub> *is an object X* × *Y* ∈ C<sub>0</sub>*, with arrows p*<sub>1</sub> : *X* × *Y* → *X and p*<sub>2</sub> : *X* × *Y* → *Y, such that for all arrows q*<sub>1</sub> : *Z* → *X and q*<sub>2</sub> : *Z* → *Y, there is a unique arrow Z* → *X* × *Y making the following diagram commute:*

$$\begin{array}{ccccc} & & Z & & \\ & {\scriptstyle q\_1}\swarrow & \downarrow{\scriptstyle \exists!} & \searrow{\scriptstyle q\_2} & \\ X & \xleftarrow{\ p\_1\ } & X \times Y & \xrightarrow{\ p\_2\ } & Y \end{array} \tag{E.32}$$

*If each pair of objects in* C<sub>0</sub> *has a product,* C *is said to have* binary products*.* The next part of the definition relies on the following fact about products, which is easy to prove: given *f* : *X* → *X*′ and *g* : *Y* → *Y*′ in C<sub>1</sub>, there is a unique arrow

$$f \times g : X \times Y \to X' \times Y' \tag{E.33}$$

such that the following diagram commutes:

$$\begin{array}{ccccc} X & \xleftarrow{\ p\_1\ } & X \times Y & \xrightarrow{\ p\_2\ } & Y \\ {\scriptstyle f}\downarrow & & \downarrow{\scriptstyle \exists!\, f \times g} & & \downarrow{\scriptstyle g} \\ X' & \xleftarrow{\ p'\_1\ } & X' \times Y' & \xrightarrow{\ p'\_2\ } & Y' \end{array} \tag{E.34}$$

• *A* function space *or* exponential *of a pair Y*, *Z* ∈ C<sub>0</sub> *in a category* C *with binary products is an object Z*<sup>*Y*</sup> ∈ C<sub>0</sub> (which in Sets is the set of all functions *g* : *Y* → *Z*) *with an* evaluation map ev : *Z*<sup>*Y*</sup> × *Y* → *Z* (which in Sets is (*g*, *y*) ↦ *g*(*y*) ∈ *Z*)*, such that for each f* : *X* × *Y* → *Z there is a unique arrow f̃* : *X* → *Z*<sup>*Y*</sup> (which in Sets is *f̃*(*x*)(*y*) = *f*(*x*, *y*)) *making the following diagram commute:*

$$\begin{array}{ccc} X \times Y & \xrightarrow{\ f\ } & Z \\ {\scriptstyle \tilde{f} \times \mathrm{id}\_Y}\downarrow & \nearrow{\scriptstyle \mathrm{ev}} & \\ Z^Y \times Y & & \end{array} \tag{E.35}$$


The relationship between products and function spaces is just the adjunction

$$(-) \times Y \dashv (-)^Y,\tag{E.36}$$

for each *Y* ∈ C0, where the left-hand side denotes the following functor:

$$(-)\times Y: \mathbb{C} \to \mathbb{C};\tag{\mathbb{E}.37}$$

$$X \mapsto X \times Y;\tag{E.38}$$

$$(f:X\to X')\mapsto(f\times \text{id}\_Y:X\times Y\to X'\times Y).\tag{\text{E.39}}$$

Here *f* × id<sub>*Y*</sub> is a special case of (E.33), whilst the right-hand side of (E.36) is

$$(-)^Y: \mathbb{C} \to \mathbb{C};\tag{E.40}$$

$$\mathbf{Z} \mapsto \mathbf{Z}^{\mathbf{Y}};\tag{\mathbf{E}.41}$$

$$(g : Z \to Z') \mapsto (\widetilde{g \circ \mathrm{ev}} : Z^Y \to (Z')^Y),\tag{E.42}$$

where the tilde over *g* ◦ ev is defined as in the text above (E.35), in which we substitute *X* ⇝ *Z*<sup>*Y*</sup>, *Z* ⇝ *Z*′, and *f* ⇝ *g* ◦ ev; note that the latter is an arrow *Z*<sup>*Y*</sup> × *Y* → *Z*′.

As in (E.26), the adjunction (E.36) gives a bijection

$$\operatorname{Hom}\_{\mathbb{C}}\left(X\times Y, Z\right) \cong \operatorname{Hom}\_{\mathbb{C}}\left(X, Z^{Y}\right),\tag{\mathbb{E}.43}$$

which of course is precisely the correspondence *f* ↔ *f̃*; the counit of (E.36) is ε = ev (i.e., its component at *Z* is ev : *Z*<sup>*Y*</sup> × *Y* → *Z*), whereas the unit (at *Z*) is the map *f̃* : *X* → *Z*<sup>*Y*</sup> corresponding to *f* : *X* × *Y* → *Z* on the choices *X* ⇝ *Z*, *Z* ⇝ *Z* × *Y*, and *f* : *Z* × *Y* → *Z* × *Y* being the identity arrow, i.e., it is an arrow η<sub>*Z*</sub> : *Z* → (*Z* × *Y*)<sup>*Y*</sup>.
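In Sets, the bijection (E.43) is ordinary currying. The following Python sketch enumerates both hom-sets for three small finite sets and checks that *f* ↦ *f̃* is a bijection compatible with ev, as in (E.35). The sets `X`, `Y`, `Z` and the encoding of a function as a tuple of (argument, value) pairs are assumptions chosen for this illustration.

```python
from itertools import product

# Hypothetical finite sets standing in for the objects X, Y, Z of Sets.
X, Y, Z = (0, 1), ("a", "b"), (False, True)

def all_functions(domain, codomain):
    """All functions domain -> codomain, each encoded as a tuple of (argument, value) pairs."""
    domain = list(domain)
    return [tuple(zip(domain, values)) for values in product(codomain, repeat=len(domain))]

XY = list(product(X, Y))           # the product X × Y
Z_Y = all_functions(Y, Z)          # the exponential Z^Y
hom_XY_Z = all_functions(XY, Z)    # Hom(X × Y, Z)
hom_X_ZY = all_functions(X, Z_Y)   # Hom(X, Z^Y)

def ev(g, y):
    """Evaluation map Z^Y × Y -> Z, sending (g, y) to g(y)."""
    return dict(g)[y]

def curry(f):
    """Send f : X × Y -> Z to f~ : X -> Z^Y, where f~(x)(y) = f(x, y)."""
    table = dict(f)
    return tuple((x, tuple((y, table[(x, y)]) for y in Y)) for x in X)

def uncurry(f_tilde):
    """Inverse direction: recover f as ev composed with (f~ × id_Y)."""
    table = dict(f_tilde)
    return tuple(((x, y), ev(table[x], y)) for x, y in XY)
```

Here `uncurry(curry(f)) == f` for every `f` in `hom_XY_Z`, and `curry` hits every element of `hom_X_ZY`; both hom-sets have 16 elements.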

The following construction, generalizing binary products, is very important.

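Definition E.8. *The* pullback *of two arrows f* : *X* → *Z and g* : *Y* → *Z consists of two arrows p* : *P* → *X and q* : *P* → *Y such that the following square commutes, and has the universal property that for any arrows p*′ : *P*′ → *X and q*′ : *P*′ → *Y with f p*′ = *g q*′*, there is a unique arrow h* : *P*′ → *P such that the entire diagram commutes (i.e., p h* = *p*′ *and q h* = *q*′*):*

$$\begin{array}{ccc} P & \xrightarrow{\ q\ } & Y \\ {\scriptstyle p}\downarrow & & \downarrow{\scriptstyle g} \\ X & \xrightarrow{\ f\ } & Z \end{array} \tag{E.44}$$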

*One says that q is a pullback of f over g, whilst p is a pullback of g over f .*

In the category Sets, pullbacks coincide with fibered products, that is,

$$P = X \times\_Z Y \equiv \{ (x, y) \in X \times Y \mid f(x) = g(y) \},\tag{E.45}$$

where *p* and *q* are the projections on the first and the second coordinate, respectively. In particular, taking *Z* to be a singleton reproduces binary products as special cases of pullbacks. This can be done in all categories C with a terminal object.
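The fibered product (E.45) and its universal property can be sketched in Python for finite sets. The function names `pullback` and `mediating_arrow`, as well as the concrete sets and maps below, are hypothetical choices for this illustration.

```python
def pullback(X, Y, f, g):
    """Fibered product X ×_Z Y of f : X -> Z and g : Y -> Z, with its two projections."""
    P = [(x, y) for x in X for y in Y if f(x) == g(y)]

    def p(pair):
        return pair[0]  # projection onto X

    def q(pair):
        return pair[1]  # projection onto Y

    return P, p, q

def mediating_arrow(p_prime, q_prime):
    """The unique h : P' -> P with p h = p' and q h = q' (assuming f p' = g q')."""
    return lambda w: (p_prime(w), q_prime(w))

# Example data: both maps reduce modulo 3, so Z = {0, 1, 2}.
X, Y = range(6), range(4)
f = lambda x: x % 3
g = lambda y: y % 3
P, p, q = pullback(X, Y, f, g)

# A competing cone with vertex P' = {0, 1, 2}, satisfying f p' = g q'.
P_prime = range(3)
p_prime = lambda w: w + 3
q_prime = lambda w: w
h = mediating_arrow(p_prime, q_prime)

# Taking Z to be a singleton recovers the binary product X × Y.
product_XY, _, _ = pullback(X, Y, lambda x: "*", lambda y: "*")
```

The square commutes on `P` by construction, and `h` is forced: any arrow into `P` is determined by its two coordinates.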

At last, we turn to limits and colimits in a category. A (finite) *diagram* in a category C is a functor *D* : J → C, where J is some (finite) category. If J is empty, there is a unique functor *D* from J into C; even this case is interesting!

The diagram just consisting of two objects *X*, *Y* ∈ C<sub>0</sub> corresponds to J<sub>0</sub> = {0,1} with 0 ≠ 1 and only identity arrows. The next case is an arrow *f* : *X* → *Y*, obtained from J<sub>0</sub> = {0,1} as a poset, i.e., 0 ≤ 1. Finally, consider J<sub>0</sub> = {0,1,2} with nontrivial arrows 0 → 1 and 2 → 1; this defines a diagram

$$Y \xrightarrow{g} Z \xleftarrow{f} X. \tag{E.46}$$

For any *C* ∈ C<sub>0</sub>, let *D*<sub>*C*</sub> : J → C be the constant functor that sends all *j* ∈ J<sub>0</sub> to *C*, and all arrows in J to id<sub>*C*</sub>. A *cone* over a diagram *D* : J → C is an object *C* ∈ C<sub>0</sub> (called the *vertex* of the cone) with a natural transformation from *D*<sub>*C*</sub> to *D*, i.e., a collection of arrows *c*<sub>*j*</sub> : *C* → *D*<sub>*j*</sub> ≡ *D*<sub>0</sub>(*j*) indexed by *j* ∈ J<sub>0</sub>, such that for each arrow χ : *j* → *k* in J<sub>1</sub>, with induced arrow *D*<sub>1</sub>(χ) : *D*<sub>*j*</sub> → *D*<sub>*k*</sub>, the following triangle commutes:

$$\begin{array}{ccc} & C & \\ {\scriptstyle c\_j}\swarrow & & \searrow{\scriptstyle c\_k} \\ D\_j & \xrightarrow{\ D\_1(\chi)\ } & D\_k \end{array} \tag{E.47}$$

A cone over the empty diagram is just a loose object *C*. A cone over our two-object diagram without arrows is *X* ← *C* → *Y*, whereas a cone over (E.46) is a commuting square as in (E.44). A *limit* of a diagram *D* : J → C is a *universal cone* over *D*, i.e., a cone (*C*, {*c*<sub>*j*</sub> : *C* → *D*<sub>*j*</sub>}<sub>*j*∈J<sub>0</sub></sub>) such that for any other cone (*C*′, {*c*′<sub>*j*</sub> : *C*′ → *D*<sub>*j*</sub>}<sub>*j*</sub>) for the same diagram there is a unique arrow *h* : *C*′ → *C* such that

$$c\_j \circ h = c'\_j \ \ (j \in \mathsf{J}\_0). \tag{E.48}$$

A more elegant way of phrasing this is via the category Cone(*D*), whose objects are cones over *D*, and whose arrows are arrows *h* : *C*′ → *C* in C<sub>1</sub> satisfying (E.48). A limit of *D*, then, is just a terminal object in Cone(*D*). Either way, it is clear from the universal property that any two limits of a given diagram must be isomorphic. Despite this lack of uniqueness, the typical notation for a limit of a diagram *D* is

$$C = \varprojlim\_{j} D\_j. \tag{E.49}$$

It should now be clear that a terminal object is a limit over the unique diagram over the empty category, a (binary) product is a limit over a two-object diagram obtained from J<sub>0</sub> = {0,1} with only identities, and finally a pullback is a limit over the diagram (E.46) obtained via J<sub>0</sub> = {0,1,2} with nontrivial arrows 0 → 1 and 2 → 1.

Especially in connection with topos theory, the following fact is quite useful:

Proposition E.9. *A category has all finite limits (i.e. limits based on finite diagrams) iff it has all pullbacks and has a terminal object.*

Replacing C by its opposite category C<sup>op</sup>, we obtain the *colimit C* = lim<sub>→ *j*</sub> *D*<sub>*j*</sub> of a diagram, which is defined as a limit of the same diagram seen in C<sup>op</sup>, so that in all definitions all arrows are reversed. Thus terminal objects are replaced by *initial objects*, products become *coproducts*, and pullbacks are turned into *pushouts*.

#### E.2 Toposes and functor categories

The last ingredient we need for the definition of a topos is a categorical abstraction of subsets *X* ⊂ *Y* and their characteristic functions 1<sub>*X*</sub>, i.e. a *subobject classifier*.


$$\begin{array}{ccc} X & \xrightarrow{\ \exists!\ } & \mathbf{1} \\ {\scriptstyle m}\downarrow & & \downarrow{\scriptstyle t} \\ Y & \xrightarrow{\ \chi\_m\ } & \Omega \end{array} \tag{E.50}$$

It follows that the object 1 is terminal in C (which of course constrains C to have a terminal object in the first place); Ω is often called the *truth object* of C.

Proposition E.11. *If a locally small category* C *has a subobject classifier, then for any Y* ∈ C<sub>0</sub>*, the class* Sub(*Y*) *is a set, and the map m* ↦ χ<sub>*m*</sub> *induces a bijection*

$$\mathrm{Sub}(Y) \cong \operatorname{Hom}\_{\mathsf{C}}(Y, \Omega).\tag{E.51}$$

*Proof.* It follows from the definition of a pullback that equivalent monos *m* : *X* → *Y* and *m*′ : *X*′ → *Y* yield the same arrow χ<sub>*m*</sub>, so that the map *m* ↦ χ<sub>*m*</sub> from monos to arrows passes to equivalence classes, i.e., we have a map [*m*] ↦ χ<sub>*m*</sub>. The universal part in the definition of a pullback (i.e., monos with the same classifying maps are isomorphic) makes the latter map injective, whereas surjectivity follows from the general fact that the pullback (namely *m*) of any arrow (namely χ) over a mono (namely *t*) is a mono, where we see (E.50) as a pullback *for given* χ *and t*. □

For example, in Sets a mono is an injective function (and an epi is surjective), so that any mono into *Y* originates in some set that is isomorphic to some subset of *Y*. Any singleton **1** = ∗ = {∅} serves as a terminal object, and Sets has a truth object

$$
\Omega = \underline{2} = \{0, 1\}, \tag{E.52}
$$

with subobject classifier *t*(∗) = 1; if *X* ⊂ *Y*, and *m* is the inclusion map, then χ<sub>*m*</sub> = 1<sub>*X*</sub> is just the characteristic function of *X*, and Sub(*X*) ≅ P(*X*) is the power set of *X*.
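The bijection (E.51) in Sets can be checked directly on a small example. In the Python sketch below, the set `Y`, the encoding of a map *Y* → Ω as a tuple of values, and the helper names `chi` and `classified_subset` are assumptions made for illustration; `classified_subset` computes the pullback of *t* along a given map χ, i.e. the preimage of 1.

```python
from itertools import combinations, product

Y = ("a", "b", "c")   # a hypothetical three-element set
OMEGA = (0, 1)        # the truth object of Sets, as in (E.52)
TRUE = 1              # the value t(*) picked out by the subobject classifier

def subsets(ys):
    """All subsets of ys, i.e. representatives of the subobjects of ys in Sets."""
    return [frozenset(c) for r in range(len(ys) + 1) for c in combinations(ys, r)]

def chi(X):
    """Characteristic function of X ⊆ Y, encoded as a tuple of values indexed like Y."""
    return tuple(TRUE if y in X else 0 for y in Y)

def classified_subset(chi_map):
    """Pull t back along chi_map: the subset of Y on which chi_map takes the value 1."""
    return frozenset(y for y, value in zip(Y, chi_map) if value == TRUE)

sub_Y = subsets(Y)
hom_Y_Omega = list(product(OMEGA, repeat=len(Y)))
```

Both `sub_Y` and `hom_Y_Omega` have 8 elements here, and `chi` and `classified_subset` are mutually inverse.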

The haunting name "truth object" for Ω might explain some of the fascination logicians and quantum physicists have felt for topos theory, which we now define:

Definition E.12. *A* topos *is a cartesian closed category (i.e., having a terminal object, binary products, and function spaces) with pullbacks and a subobject classifier.*

More precisely, this defines an *elementary topos*. It follows from Proposition E.9 that a topos has all finite limits, and it can be shown that it also has all finite colimits.

It should be clear that Sets is a topos; indeed, in our presentation the presence of the necessary ingredients of a topos within Sets partly motivated these ingredients. More generally, all toposes relevant to this book are of the following sort. We first note that for any two categories C and D we obtain a new category [C,D] whose objects are (covariant) functors from C to D, and whose arrows are natural transformations between such functors. It is often natural to consider *contravariant* functors, giving the category [C<sup>op</sup>,D]. If D = Sets, such functors are called *presheaves* on C. The category [C<sup>op</sup>,Sets] is often denoted by Sets<sup>C<sup>op</sup></sup>. An important special case is

$$
\mathsf{C} = \mathcal{O}(X),
\tag{E.53}
$$

i.e. the topology of some space *X* (seen as a posetal category); with slight abuse of notation, functors *F* : O(*X*)<sup>op</sup> → Sets are called *presheaves on X*.

Theorem E.13. *For any small category* C*, the category* [Cop,Sets] *is a topos.*

*Proof.* We focus on the subobject classifier; the remainder follows from the fact that limits in [C<sup>op</sup>,Sets] (including pullbacks and function spaces) are computed pointwise, i.e., if *D* : J → [C<sup>op</sup>,Sets] is a diagram, then for each *C* ∈ C<sub>0</sub> we obtain a diagram *D*<sub>*C*</sub> in Sets defined by *D*<sub>*C*</sub>(*j*) = *D*(*j*)(*C*). Since Sets has all limits, each *D*<sub>*C*</sub> has a limit. These limits form a single functor, which is a limit of *D*.

The simplest example is the *terminal object* in [C<sup>op</sup>,Sets], which comes out as **1**<sub>0</sub>(*C*) = ∗ for each *C* ∈ C<sub>0</sub>, where ∗ is some arbitrary (but fixed) singleton.

To discuss the *truth object* in [Cop,Sets], we need a few definitions.


$$f^\*S = \{ h: X \to D \mid fh \in S \}.\tag{E.54}$$

*4. We denote the set of all sieves on C by* Sieves(*C*)*.*

Clearly, if id<sub>*C*</sub> ∈ *S*, then *S* = *S*<sup>(*C*)</sup><sub>max</sub>. We will show that the truth object in [C<sup>op</sup>,Sets] is

$$
\Omega\_0(C) = \mathrm{Sieves}(C);\tag{E.55}
$$

$$
\Omega\_1(f) = f^\*. \tag{E.56}
$$

The *subobject classifier* in [C<sup>op</sup>,Sets], then, is the natural transformation

$$t: \mathbf{1} \to \Omega;\tag{E.57}$$

$$t\_C: \mathbf{1}\_0(C) \to \mathrm{Sieves}(C);\tag{E.58}$$

$$t\_C(\*) = S\_{\max}^{(C)}.\tag{E.59}$$

To understand this, we need the *Yoneda Lemma* E.15 below. In preparation, for any (fixed) *C* ∈ C<sub>0</sub>, we define a functor *y*<sub>*C*</sub> : C<sup>op</sup> → Sets by

$$(y\_C)\_0(D) = \operatorname{Hom}\_{\mathsf{C}}(D, C);\tag{E.60}$$

$$(y\_C)\_1 \left( D \xrightarrow{f} D' \right) = (g \mapsto gf),\tag{E.61}$$

the latter being a map from Hom<sub>C</sub>(*D*′,*C*) to Hom<sub>C</sub>(*D*,*C*). This is often written as

$$y\_C = \operatorname{Hom}\_{\mathsf{C}}(- , C), \tag{E.62}$$

and the functors *y*<sub>*C*</sub> are called *representable presheaves*. Since *f* : *C* → *C*′ induces a natural transformation *y*<sub>*C*</sub> → *y*<sub>*C*′</sub> in the obvious way, i.e., its component τ<sub>*D*</sub> at *D* is the map *g* ↦ *f g* from Hom<sub>C</sub>(*D*,*C*) to Hom<sub>C</sub>(*D*,*C*′), the map *C* ↦ *y*<sub>*C*</sub> extends to a functor *y* : C → [C<sup>op</sup>,Sets], called the *Yoneda embedding*.

Lemma E.15. *For any F* ∈ [C<sup>op</sup>,Sets]*, any D* ∈ C<sub>0</sub>*, and x* ∈ *F*<sub>0</sub>(*C*)*, the map*

$$
\tau\_{D}^{(x)}: \operatorname{Hom}\_{\mathsf{C}}(D, C) \to F\_0(D); \tag{E.63}
$$

$$\left(D \xrightarrow{f} C\right) \mapsto F\_1(f)(x),\tag{E.64}$$

*where F*<sub>1</sub>(*f*) *duly maps F*<sub>0</sub>(*C*) *to F*<sub>0</sub>(*D*)*, forms the component at D of a natural transformation* τ<sup>(*x*)</sup> *from y*<sub>*C*</sub> *to F, and the ensuing map x* ↦ τ<sup>(*x*)</sup> *gives a bijection*

$$F\_0(C) \cong \operatorname{Hom}\_{[\mathsf{C}^{\mathrm{op}}, \mathsf{Sets}]}(y\_C, F). \tag{E.65}$$

Recall that by definition of a functor category, the right-hand side of (E.65) consists precisely of the set Nat(*y*<sub>*C*</sub>,*F*) of natural transformations from *y*<sub>*C*</sub> to *F*.
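The bijection (E.65) can be verified by brute force on a very small example. In the Python sketch below, C is taken to be the chain 0 ≤ 1 ≤ 2 seen as a posetal category (so each hom-set has at most one element), and `F0`, `F1` define a hypothetical presheaf on it; all of these choices, and the function names, are assumptions made for illustration. The code lists every natural transformation from *y*<sub>*c*</sub> to *F* and compares the result with the transformations τ<sup>(*x*)</sup> of (E.63) and (E.64).

```python
from itertools import product

OBJECTS = (0, 1, 2)  # the chain 0 <= 1 <= 2, seen as a posetal category

def hom(d, c):
    """Hom(d, c): the single arrow (d, c) if d <= c, and empty otherwise."""
    return [(d, c)] if d <= c else []

# A hypothetical presheaf F: object part F0, arrow part F1[(d, c)] : F0[c] -> F0[d].
F0 = {0: ("p", "q"), 1: ("r", "s", "t"), 2: ("u", "v")}
F1 = {
    (0, 0): {"p": "p", "q": "q"},
    (1, 1): {"r": "r", "s": "s", "t": "t"},
    (2, 2): {"u": "u", "v": "v"},
    (0, 1): {"r": "p", "s": "p", "t": "q"},
    (1, 2): {"u": "r", "v": "t"},
    (0, 2): {"u": "p", "v": "q"},  # equals F1[(0, 1)] applied after F1[(1, 2)]
}

def tau(c, x):
    """The transformation of (E.63)-(E.64): the arrow f : d -> c is sent to F1(f)(x)."""
    return {d: {f: F1[f][x] for f in hom(d, c)} for d in OBJECTS}

def natural_transformations(c):
    """Brute-force list of all natural transformations from y_c to F."""
    below = [d for d in OBJECTS if hom(d, c)]
    found = []
    for values in product(*(F0[d] for d in below)):
        t = {d: {} for d in OBJECTS}
        for d, value in zip(below, values):
            t[d][(d, c)] = value
        # Naturality: for k : d' -> d and g : d -> c, t(g k) = F1(k)(t(g)).
        if all(
            t[dp][(dp, c)] == F1[(dp, d)][t[d][(d, c)]]
            for d in below for dp in OBJECTS if hom(dp, d)
        ):
            found.append(t)
    return found
```

For each object `c`, the list `natural_transformations(c)` has exactly `len(F0[c])` elements, and these are precisely the `tau(c, x)` for `x` in `F0[c]`.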

Lemma E.16. *For any C* ∈ C<sub>0</sub> *and S* ∈ Sieves(*C*)*, the presheaf X*<sup>(*S*)</sup> *defined by*

$$X\_0^{(S)}(D) = \operatorname{Hom}\_{\mathsf{C}}(D, C) \cap S;\tag{E.66}$$

$$X\_1^{(S)}\left(D \xrightarrow{f} D'\right) = \{gf \mid g \in X\_0^{(S)}(D')\},\tag{E.67}$$

*defines a subobject m* : *X*<sup>(*S*)</sup> → *y*<sub>*C*</sub>*, and the ensuing map S* ↦ *X*<sup>(*S*)</sup> *yields a bijection*

$$\mathrm{Sieves}(C) \xrightarrow{\cong} \mathrm{Sub}(y\_C).\tag{E.68}$$

More generally, if *X* and *Y* (generalizing *X*<sup>(*S*)</sup> and *y*<sub>*C*</sub> above) are presheaves on C with *X*<sub>0</sub>(*D*) ⊆ *Y*<sub>0</sub>(*D*) for all *D*, then the equivalence class of *X* is a subobject of *Y*.

The proof of Lemma E.16 below uses the converse fact: any subobject of *Y* has a representative *X*′ that is a *subfunctor* of *Y*, i.e., *X*′<sub>0</sub>(*D*) ⊆ *Y*<sub>0</sub>(*D*) for all *D*, and *X*′<sub>1</sub> is the restriction of *Y*<sub>1</sub>, as in (E.70) below. To see this, suppose one has a mono *m* : *X* → *Y*, so that each component *m*<sub>*D*</sub> : *X*<sub>0</sub>(*D*) → *Y*<sub>0</sub>(*D*) of *m* is an injective function. We now define a presheaf *X*′ on C by

$$X\_0'(D) = m\_D(X\_0(D)) \subseteq Y\_0(D);\tag{E.69}$$

$$X\_1'(f) = Y\_1(f)\_{|X\_0'(D')} \ \ (f:D \to D'). \tag{E.70}$$

Furthermore, we define a natural transformation *m*′ : *X*′ → *Y*, whose components *m*′<sub>*D*</sub> : *X*′<sub>0</sub>(*D*) → *Y*<sub>0</sub>(*D*) are given by set-theoretic inclusion. The natural transformation *h* : *X* → *X*′, defined through its components *h*<sub>*D*</sub> = *m̃*<sub>*D*</sub> (where *m̃*<sub>*D*</sub> is *m*<sub>*D*</sub>, but seen as a map from *X*<sub>0</sub>(*D*) to *X*′<sub>0</sub>(*D*) rather than to *Y*<sub>0</sub>(*D*)) then renders *m* and *m*′ isomorphic.

*Proof.* The map *S* ↦ *X*<sup>(*S*)</sup> has an inverse *X* ↦ *S*<sub>*X*</sub>, where *S*<sub>*X*</sub> ∈ Sieves(*C*) is given by

$$S\_X = \bigcup\_{D \in \mathsf{C}\_0} X\_0(D).$$

Combining (E.55) - (E.56) with Lemma E.15 applied to *F* = Ω gives

$$\operatorname{Hom}\_{[\mathsf{C}^{\mathrm{op}}, \mathsf{Sets}]}(y\_C, \Omega) \cong \mathrm{Sieves}(C),\tag{E.71}$$

so that Lemma E.16 yields a bijective correspondence between arrows from *y*<sub>*C*</sub> to Ω as defined in (E.55) - (E.56), and subobjects of *y*<sub>*C*</sub>. At *D*, diagram (E.50) is

$$\begin{array}{ccc} \operatorname{Hom}\_{\mathsf{C}}(D, C) \cap S & \xrightarrow{\ \exists!\ } & \ast \\ {\scriptstyle m\_D}\downarrow & & \downarrow{\scriptstyle t\_D} \\ \operatorname{Hom}\_{\mathsf{C}}(D, C) & \xrightarrow{\ (\chi\_m)\_D\ } & \mathrm{Sieves}(D) \end{array} \tag{E.72}$$

where *m*<sub>*D*</sub> is the inclusion map, *t*<sub>*D*</sub>(∗) = *S*<sup>(*D*)</sup><sub>max</sub>, and (χ<sub>*m*</sub>)<sub>*D*</sub>(*f*) = *f*<sup>∗</sup>*S*. Commutativity of this diagram follows from the fact that if *f* ∈ Hom<sub>C</sub>(*D*,*C*) ∩ *S*, then *f*<sup>∗</sup>*S* is the maximal sieve on *D*, as trivially follows from the definition of a sieve. The pullback condition is then easy to verify from Lemma E.16.

If we replace *y*<sub>*C*</sub> by any presheaf *Y*, the classifying map χ<sub>*m*</sub> is given by

$$(\chi\_m)\_D : Y\_0(D) \to \mathrm{Sieves}(D);\tag{E.73}$$

$$x \mapsto \{ f: D' \to D \mid Y\_1(f)(x) \in X\_0(D') \},\tag{E.74}$$

noting that *Y*<sub>1</sub>(*f*) maps *Y*<sub>0</sub>(*D*) into *Y*<sub>0</sub>(*D*′), and *X*<sub>0</sub>(*D*′) ⊆ *Y*<sub>0</sub>(*D*′), since we assume that *X* represents a subobject of *Y* such that *X*<sub>0</sub>(*D*) ⊆ *Y*<sub>0</sub>(*D*) for all *D*. This generalizes the previous case where *Y* = *y*<sub>*C*</sub>. To show that χ<sub>*m*</sub> is unique, we write down (E.50) at *D*:

$$\begin{array}{ccc} X\_0(D) & \xrightarrow{\ \exists!\ } & \ast \\ {\scriptstyle \subseteq}\downarrow & & \downarrow{\scriptstyle t\_D} \\ Y\_0(D) & \xrightarrow{\ (\chi\_m)\_D\ } & \mathrm{Sieves}(D) \end{array} \tag{E.75}$$

Also, the condition that χ<sub>*m*</sub> be a natural transformation implies that the diagram

$$\begin{array}{ccc} Y\_0(D) & \xrightarrow{\ (\chi\_m)\_D\ } & \mathrm{Sieves}(D) \\ {\scriptstyle Y\_1(f)}\downarrow & & \downarrow{\scriptstyle f^\*} \\ Y\_0(D') & \xrightarrow{\ (\chi\_m)\_{D'}\ } & \mathrm{Sieves}(D') \end{array} \tag{E.76}$$

commutes for any *f* : *D*′ → *D*. Then (E.75) with *D* ⇝ *D*′ implies that for any *y* ∈ *Y*<sub>0</sub>(*D*′), we have *y* ∈ *X*<sub>0</sub>(*D*′) iff (χ<sub>*m*</sub>)<sub>*D*′</sub>(*y*) = *S*<sup>(*D*′)</sup><sub>max</sub>. In particular, we may take *y* = *Y*<sub>1</sub>(*f*)(*x*) for *x* ∈ *Y*<sub>0</sub>(*D*), so that *Y*<sub>1</sub>(*f*)(*x*) ∈ *X*<sub>0</sub>(*D*′) iff (χ<sub>*m*</sub>)<sub>*D*′</sub>(*Y*<sub>1</sub>(*f*)(*x*)) = *S*<sup>(*D*′)</sup><sub>max</sub>. Commutativity of the diagram (E.76) gives (χ<sub>*m*</sub>)<sub>*D*′</sub> ◦ *Y*<sub>1</sub>(*f*) = *f*<sup>∗</sup> ◦ (χ<sub>*m*</sub>)<sub>*D*</sub>, so that *Y*<sub>1</sub>(*f*)(*x*) ∈ *X*<sub>0</sub>(*D*′) iff *f*<sup>∗</sup>((χ<sub>*m*</sub>)<sub>*D*</sub>(*x*)) = *S*<sup>(*D*′)</sup><sub>max</sub>, which in turn is the case if and only if *f* ∈ (χ<sub>*m*</sub>)<sub>*D*</sub>(*x*). Hence we finally obtain

$$f \in (\chi\_m)\_D(x) \text{ iff } Y\_1(f)(x) \in X\_0(D'), \tag{E.77}$$

which is the definition (E.73) - (E.74) of χ<sub>*m*</sub>, and renders it unique (given *m*).

Finally, the universal property of (E.50) follows from Proposition E.11: if *X*′ in (E.50) is like *P*′ in (E.44), then *m*′ : *X*′ → *Y* is the pullback of χ over *t*, and hence *m*′ must be equivalent to *m*. But we know (cf. Definition E.10.2) that an equivalence between monos is unique. This closes the proof of Theorem E.13. □

Refining presheaves, we also introduce the category Sh(*X*) of *sheaves* on *X*, which is the full subcategory of [O(*X*)<sup>op</sup>,Sets] defined by the following condition.

Definition E.17. *A presheaf F* : O(*X*)<sup>op</sup> → Sets *on X is a* sheaf *if for any open U* ∈ O(*X*)*, any open cover U* = ∪<sub>*j*</sub>*U*<sub>*j*</sub> *of U, and any family* {*s*<sub>*j*</sub> ∈ *F*<sub>0</sub>(*U*<sub>*j*</sub>)} *such that*

$$F\_1(U\_{jk} \le U\_j)(s\_j) = F\_1(U\_{jk} \le U\_k)(s\_k),\tag{E.78}$$

*for all j*, *k, there is a unique s* ∈ *F*<sub>0</sub>(*U*) *such that s*<sub>*j*</sub> = *F*<sub>1</sub>(*U*<sub>*j*</sub> ≤ *U*)(*s*) *for all j.*

Here *U*<sub>*jk*</sub> = *U*<sub>*j*</sub> ∩ *U*<sub>*k*</sub>, and *F*<sub>1</sub>(*V* ≤ *W*) : *F*<sub>0</sub>(*W*) → *F*<sub>0</sub>(*V*) is the arrow part of the functor *F*. If *F* is a sheaf on *X*, then for each open *U* = ∪<sub>*j*∈*J*</sub>*U*<sub>*j*</sub>, it has the continuity property

$$F\_0(U) = \varprojlim F\_0(U\_j),\tag{E.79}$$

where the limit is defined with respect to the diagram *D* : J → Sets where J is the posetal category whose objects are the *j* ∈ *J* and the pairs (*i*, *j*) ∈ *J* × *J* for which *U*<sub>*ij*</sub> ≠ ∅, ordered by *i* ≤ (*i*, *j*) and *j* ≤ (*i*, *j*), with *D*(*i*) = *F*<sub>0</sub>(*U*<sub>*i*</sub>) and *D*(*i*, *j*) = *F*<sub>0</sub>(*U*<sub>*ij*</sub>), etc.

A key example of a sheaf on *X* is the *sheaf of continuous functions*, where

$$F\_0(U) = C(U, \mathbb{R}).\tag{E.80}$$

If *U* ≤ *V*, then the associated map *F*<sub>1</sub>(*U* ≤ *V*) : *C*(*V*,ℝ) → *C*(*U*,ℝ) is simply given by restriction. Sheaves may be defined far more generally (as done by Grothendieck), namely on a *site* (which is a category equipped with a so-called *Grothendieck topology*), but sheaves on a space are all we need in this book.
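The gluing condition of Definition E.17 can be made concrete for the sheaf of functions (E.80) on a finite discrete space, where every real-valued function is continuous. The Python sketch below is an illustration under that assumption; a section over *U* is encoded as a dictionary from the points of *U* to numbers, and the names `restrict`, `is_compatible` and `glue` are hypothetical.

```python
def restrict(s, V):
    """Arrow part of the sheaf: restrict a section s to the smaller open set V."""
    return {x: s[x] for x in V}

def is_compatible(family):
    """Condition (E.78): the sections s_j and s_k agree on each overlap U_j ∩ U_k."""
    return all(
        restrict(s_j, U_j & U_k) == restrict(s_k, U_j & U_k)
        for U_j, s_j in family.items()
        for U_k, s_k in family.items()
    )

def glue(family):
    """Return the unique section on the union of the U_j, or None if (E.78) fails."""
    if not is_compatible(family):
        return None
    s = {}
    for s_j in family.values():
        s.update(s_j)
    return s

# Example data: two opens of the discrete space {0, 1, 2}, overlapping in the point 1.
U1, U2 = frozenset({0, 1}), frozenset({1, 2})
good = {U1: {0: 0.5, 1: 2.0}, U2: {1: 2.0, 2: -1.0}}
bad = {U1: {0: 0.5, 1: 2.0}, U2: {1: 3.0, 2: -1.0}}
```

The family `good` glues to a single function on {0, 1, 2} that restricts to the given sections, whereas `bad` disagrees at the point 1 and therefore has no gluing.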

Analogously to Theorem E.13, Sh(*X*) is a topos, whose truth object is the sheaf

$$
\Omega\_0(U) = \mathcal{O}(U);\tag{E.81}
$$

$$\Omega\_1(U \le V) = (-) \cap U,\tag{E.82}$$

i.e., if *W* ∈ O(*V*), then Ω<sub>1</sub>(*U* ≤ *V*)(*W*) = *W* ∩ *U* ∈ O(*U*). With the terminal object in Sh(*X*) being borrowed from [O(*X*)<sup>op</sup>,Sets], i.e., **1**<sub>0</sub>(*U*) = ∗, its subobject classifier is

$$
t\_U(\*) = U.\tag{E.83}
$$

In fact, let *X* be a poset equipped with its intrinsic *Alexandrov topology*, whose open sets are the *upper sets*, i.e. those *U* ⊆ *X* for which *x* ∈ *U* and *x* ≤ *y* implies *y* ∈ *U*. Examples of opens are *up-sets U* = ↑*x* = {*y* ∈ *X* | *x* ≤ *y*}, which form a basis of the Alexandrov topology; in fact, ↑*x* is the smallest open set containing *x*. For any *x* ∈ *X*, we write Upper(*x*) for the set of all upper sets contained in ↑*x*, i.e., Upper(*x*) = O(↑*x*).

Proposition E.18. *If X is a poset, the category* [*X*,Sets] *of functors F* : *X* → Sets *(where X is seen as a category defined by the underlying poset) is isomorphic to the category* Sh(*X*) *of sheaves on X (equipped with the Alexandrov topology), i.e.,*

$$[X, \mathsf{Sets}] \cong \mathsf{Sh}(X). \tag{\mathsf{E.84}}$$

Note that [*X*,Sets] consists of presheaves on *X*<sup>op</sup> (in which *x* ≤ *y* iff *y* ≤ *x* in *X*).

*Proof.* This isomorphism is given by mapping a functor <u>*F*</u> : *X* → Sets to a sheaf *F* : O(*X*)<sup>op</sup> → Sets, by defining the latter on a basis of the Alexandrov topology as

$$F(\uparrow x) = \underline{F}(x),\tag{E.85}$$

extended to general Alexandrov opens by (E.79). *Vice versa*, a sheaf *F* on *X* immediately defines <u>*F*</u> by reading (E.85) from right to left. □

Corollary E.19. *If X is a poset, the subobject classifier in* [*X*,Sets] *is given by*

$$\Omega_0(x) = \mathrm{Upper}(x); \tag{E.86}$$

$$\Omega_1(x \le y) = (-) \cap (\uparrow y); \tag{E.87}$$

$$t_x(*) = \uparrow x. \tag{E.88}$$

*Proof.* If C is a poset *X*, then a sieve on *x* ∈ *X* is a *lower subset S* of ↓*x* (i.e., if *y* ∈ *S* then *y* ≤ *x*, and if also *z* ≤ *y*, then *z* ∈ *S*). Recalling the comment after (E.84), the claim then follows from (E.55) - (E.59). Alternatively, using Proposition E.18, the claim also follows from (E.81) - (E.83). □
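The data (E.86) - (E.88) can be checked mechanically on a small poset. The sketch below assumes a three-element chain and reads Upper(*x*) as the set of upper sets contained in ↑*x*, in line with Ω<sub>0</sub>(↑*x*) = O(↑*x*) from (E.81); the poset and all names are hypothetical choices made for illustration.

```python
ELEMENTS = (0, 1, 2)  # hypothetical chain 0 <= 1 <= 2


def leq(x, y):
    return x <= y


def up(x):
    """The up-set of x."""
    return frozenset(y for y in ELEMENTS if leq(x, y))


def powerset(s):
    """All subsets of a finite set, as frozensets."""
    items = sorted(s)
    return [frozenset(items[i] for i in range(len(items)) if mask >> i & 1)
            for mask in range(2 ** len(items))]


def is_upper(s):
    return all(y in s for x in s for y in ELEMENTS if leq(x, y))


def omega0(x):
    """Upper(x), read here as the upper sets contained in up(x)."""
    return {s for s in powerset(up(x)) if is_upper(s)}


def omega1(x, y):
    """Restriction Upper(x) -> Upper(y) along x <= y: intersect with up(y)."""
    assert leq(x, y)
    return lambda s: s & up(y)


def truth(x):
    """Component t_x of the truth arrow: the maximal element of Upper(x)."""
    return up(x)
```

The assertions one would want are functoriality of Ω<sub>1</sub> and naturality of *t*, i.e. Ω<sub>1</sub>(*x* ≤ *y*)(↑*x*) = ↑*y*; both hold for this chain.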

#### E.3 Subobjects and Heyting algebras in a topos

There are numerous connections between topos theory and intuitionistic logic, most of which generalize links between set theory and classical logic. The beginning of algebraic logic was Boole's work, which in modern parlance structured the power set P(*X*) of any set as a Boolean lattice, and hence provided a semantics for classical propositional logic, cf. §D.2. From a categorical view, P(*X*) is the set Sub(*X*), cf. (E.52) and subsequent text. This generalizes to any topos in which Sub(*X*) is a set (rather than a proper class), except for the decisive difference that Sub(*X*) is no longer a Boolean lattice but a *Heyting algebra*, making topos logic intuitionistic.

Proposition E.20. *For any object X in a locally small topos* T*, the set* Sub(*X*) *of subobjects of X is a Heyting algebra with respect to the partial ordering* ≤ *defined by* [*m* : *U* → *X*] ≤ [*m*′ : *V* → *X*] *iff there is h* : *U* → *V such that m*′*h* = *m.*

It is easy to show from Definition E.10.1 that ≤ is well defined, and since it is defined "on the nose", i.e., at the level of representatives of the equivalence classes in question, in what follows we will use mono's rather than their equivalence classes.

*Proof.* Since we only need this result for presheaf toposes, we just list the pertinent operations, and omit the verification of the details (which is left to the reader).


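• The *inf U* ∧ *V* of *m* : *U* → *X* and *m*′ : *V* → *X* is defined by the pullback

$$\begin{array}{ccc} U \wedge V & \xrightarrow{\;p\;} & U \\ {\scriptstyle q}\downarrow & & \downarrow{\scriptstyle m} \\ V & \xrightarrow{\;m'\;} & X, \end{array} \tag{E.89}$$

so that the desired arrow *U* ∧ *V* → *X* is *mp* = *m*′*q* (which is indeed a mono).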

• The *sup* of *m* : *U* → *X* and *m*′ : *V* → *X* is more complicated. In any topos T, arrows have an epi-mono factorization *f* = *me*, where *m* is mono (and as such is unique up to isomorphism), called the *image* of *f*, and *e* is epi. Furthermore, T has finite colimits including coproducts. Reversing all arrows in (E.32) gives

$$\begin{array}{ccccc} & & X & & \\ & \overset{m}{\nearrow} & \uparrow{\scriptstyle f} & \overset{m'}{\nwarrow} & \\ U & \longrightarrow & U+V & \longleftarrow & V. \end{array} \tag{E.90}$$

The sup *U* ∨*V*, then, is "the" image of the arrow *f* in this diagram.

• Finally, *implication* → is defined in terms of an *equalizer*, which may be constructed as a pullback, as follows: taking *Y* = *Z* = *X* and *q*<sub>1</sub> = *q*<sub>2</sub> = id<sub>*X*</sub> in (E.32) gives a unique arrow Δ<sub>*X*</sub> : *X* → *X* × *X*, called the *diagonal*; in Sets it is Δ<sub>*X*</sub>(*x*) = (*x*, *x*). Furthermore, if we have two arrows *f*, *g* : *X* → *Y*, taking *Z* ↝ *X*, *X* ↝ *Y*, *q*<sub>1</sub> ↝ *f*, and *q*<sub>2</sub> ↝ *g* in (E.32) gives a unique arrow (*f*, *g*) : *X* → *Y* × *Y*, which in Sets is of course given by (*f*, *g*)(*x*) = (*f*(*x*), *g*(*x*)).

The equalizer of *f* and *g*, then, is the arrow *e* : *E* → *X* in the pullback

$$\begin{array}{ccc} E & \xrightarrow{\;p\;} & Y \\ {\scriptstyle e}\downarrow & & \downarrow{\scriptstyle \Delta_Y} \\ X & \xrightarrow{(f,g)} & Y \times Y. \end{array} \tag{E.91}$$

The equalizer indeed deserves its name, because the map *p* equals both *fe* and *ge*; in Sets, *E* ⊆ *X* may simply be taken to be the subset on which *f* and *g* coincide.

We return to our monos *m* : *U* → *X* and *m*′ : *V* → *X*, with inf *U* ∧ *V*: the mono (*U* → *V*) → *X* is the equalizer of the classifying maps χ<sub>*U*</sub>, χ<sub>*U*∧*V*</sub> : *X* → Ω. □

Recall that in Sets we may identify Sub(*X*) with the power set P(*X*), so that

$$
\perp = \emptyset;\tag{E.92}
$$

$$
\top = X.\tag{E.93}
$$

For *U*,*V* ⊆ *X*, the above constructions reduce to the well-known expressions

$$U \le V \text{ iff } U \subseteq V;\tag{E.94}$$

$$U \wedge V = U \cap V; \tag{E.95}$$

$$U \lor V = U \cup V; \tag{E.96}$$

$$U \to V = U^{c} \cup V, \tag{E.97}$$

where for comparison below we may rewrite the right-hand side of (E.97) as

$$U^{c} \cup V = \{ x \in X \mid x \in U \to x \in V \}. \tag{E.98}$$

The (derived) expression (D.12) for negation then equates ¬ with complementation:

$$\neg U = U^c = \{ x \in X \mid x \notin U \}. \tag{E.99}$$
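The reduction to (E.97) and (E.99) can be verified directly in Sets, where the equalizer is the subset on which two maps agree. The following sketch assumes a six-element ambient set and two arbitrarily chosen subsets (all names are hypothetical); it computes *U* → *V* as the equalizer of χ<sub>*U*</sub> and χ<sub>*U*∧*V*</sub>, as in the proof of Proposition E.20.

```python
X = frozenset(range(6))   # hypothetical ambient set
U = frozenset({0, 1, 2})
V = frozenset({2, 3})


def subsets(s):
    """All subsets of a finite set, as frozensets."""
    items = sorted(s)
    return [frozenset(items[i] for i in range(len(items)) if mask >> i & 1)
            for mask in range(2 ** len(items))]


def chi(subset):
    """Characteristic (classifying) map X -> {False, True} of a subset."""
    return lambda x: x in subset


def equalizer(domain, f, g):
    """Subset of the domain on which the maps f and g coincide."""
    return frozenset(x for x in domain if f(x) == g(x))


def implies(u, v):
    """U -> V as the equalizer of chi_U and chi_{U and V}."""
    return equalizer(X, chi(u), chi(u & v))


def neg(u):
    """Negation as implication into the bottom element (the empty set)."""
    return implies(u, frozenset())
```

Here `implies(U, V)` agrees with *U*<sup>c</sup> ∪ *V*, and it is the largest *W* ⊆ *X* with *W* ∩ *U* ⊆ *V*, which is the defining property of implication in a Heyting algebra.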

In a presheaf topos [C<sup>op</sup>,Sets], one obtains similar expressions for ⊥ and ⊤, viz.

$$
\perp_0(C) = \emptyset; \tag{E.100}
$$

$$\top_0(C) = X_0(C), \tag{E.101}$$

where the functor ⊥ is the initial object in [C<sup>op</sup>,Sets]. The logical connectives resemble the set-theoretic case, too, except for the last ones: if *U* and *V* are representatives of subobjects of *X* such that *U*<sub>0</sub>(*C*) ⊆ *X*<sub>0</sub>(*C*) and *V*<sub>0</sub>(*C*) ⊆ *X*<sub>0</sub>(*C*), we have:


$$U \le V \text{ iff } U_0(C) \subseteq V_0(C) \text{ for all } C; \tag{E.102}$$

$$(U \wedge V)_0(C) = U_0(C) \cap V_0(C); \tag{E.103}$$

$$(U \lor V)_0(C) = U_0(C) \cup V_0(C); \tag{E.104}$$

$$(U \to V)_0(C) = \{ x \in X_0(C) \mid \forall D \stackrel{f}{\to} C: X_1(f)(x) \in U_0(D) \Rightarrow X_1(f)(x) \in V_0(D) \}; \tag{E.105}$$

$$(\neg U)_0(C) = \{ x \in X_0(C) \mid \forall D \stackrel{f}{\to} C: X_1(f)(x) \notin U_0(D) \}. \tag{E.106}$$

This Heyting algebra is Boolean iff ¬¬*U* = *U* for each *U*, so we are interested in

$$(\neg\neg U)_0(C) = \{ x \in X_0(C) \mid \forall D \stackrel{g}{\to} C\ \exists E \stackrel{f}{\to} D : X_1(gf)(x) \in U_0(E) \}. \tag{E.107}$$

It can be shown that Sub(*X*) is Boolean for each object *X* iff Sub(Ω) is Boolean. In order to settle this, we specialize (E.107) to subfunctors *m* : *U* → Ω, which gives

$$(\neg\neg U)_0(C) = \{ S \in \mathrm{Sieves}(C) \mid \forall D \stackrel{g}{\to} C\ \exists E \stackrel{f}{\to} D : (gf)^* S \in U_0(E) \}. \tag{E.108}$$

For example, if C = *X*<sup>op</sup> is a posetal category, this expression becomes

$$(\neg\neg U)_0(x) = \{ S \in \mathrm{Upper}(x) \mid \forall y \ge x\ \exists z \ge y : S \cap (\uparrow z) \in U_0(z) \}, \tag{E.109}$$

which is clearly an additional property of *S* ∈ *U*0(*x*); examples abound in Chapter 12. Thus the (propositional) logic of Sub(*X*) may be genuinely intuitionistic (and given our examples, this conclusion especially applies to quantum logic).
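A very small instance already shows that ¬¬*U* may differ from *U*. The sketch below assumes the two-point poset 0 ≤ 1 as the category C and takes *X* to be the terminal presheaf, so that a subobject *U* may be encoded as the set of objects *C* with *U*<sub>0</sub>(*C*) ≠ ∅ (necessarily a lower set); negation is computed as *U* → ⊥ from (E.105), and the result is compared with (E.107). The encoding and the names are our own assumptions.

```python
OBJECTS = (0, 1)  # hypothetical poset 0 <= 1, regarded as the category C


def arrows_into(c):
    """Domains D of the arrows D -> C; in a poset these are the D <= C."""
    return [d for d in OBJECTS if d <= c]


def implies(u, v):
    """(E.105) for subobjects of the terminal presheaf, encoded as sets of objects."""
    return frozenset(c for c in OBJECTS
                     if all(d not in u or d in v for d in arrows_into(c)))


def neg(u):
    """Negation as implication into the initial subobject (the empty set)."""
    return implies(u, frozenset())


def double_neg(u):
    """Direct transcription of (E.107) in this encoding."""
    return frozenset(c for c in OBJECTS
                     if all(any(e in u for e in arrows_into(d))
                            for d in arrows_into(c)))


U = frozenset({0})  # the subobject supported at the object 0 only
```

For this *U* one finds ¬*U* = ⊥ and ¬¬*U* = ⊤ ≠ *U*, so Sub(1) is not Boolean in this presheaf topos.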

Although *X* is an object in a topos, Sub(*X*) is a Heyting algebra in ordinary set theory. This is called an *external* description of *X*. Alternatively, one may study a topos using so-called *internal* reasoning. We will develop the logical foundation of internal reasoning (at least to some extent) in the next section, and for the moment just look at a special example, namely Heyting algebras *within some given topos*.

Definition E.21. *Let* T *be a topos (more generally, a category with all finite limits).*

• *A* preorder *on an object X of* T *is a mono m*<sub>≤</sub> : *R* → *X* × *X such that:*

	- *1. The diagrammatic version of* reflexivity *(in set theory: x* ≤ *x) holds, as follows. The diagonal* Δ<sub>*X*</sub> ≡ Δ : *X* → *X* × *X factors through m*<sub>≤</sub>*, i.e. there is an arrow X* → *R such that the following diagram commutes:*

$$\begin{array}{ccc} X & \longrightarrow & R \\ & {\scriptstyle \Delta}\searrow & \downarrow{\scriptstyle m_{\leq}} \\ & & X \times X \end{array} \tag{E.110}$$

	- *2. The diagrammatic version of* transitivity *(in set theory: x* ≤ *y and y* ≤ *z imply x* ≤ *z) holds, as follows. First, define P as the pullback*

$$\begin{array}{ccc} P & \xrightarrow{\;p\;} & R \\ {\scriptstyle q}\downarrow & & \downarrow{\scriptstyle p_2 \circ m_{\leq}} \\ R & \xrightarrow{p_1 \circ m_{\leq}} & X, \end{array} \tag{E.111}$$

*where p*<sub>1</sub>, *p*<sub>2</sub> : *X* × *X* → *X are the arrows in* (E.32)*, and p*, *q are defined as in* (E.44)*. The arrows p*<sub>1</sub> ∘ *m*<sub>≤</sub> ∘ *p* : *P* → *X and p*<sub>2</sub> ∘ *m*<sub>≤</sub> ∘ *q* : *P* → *X, then, yield an arrow P* → *X* × *X via* (E.32)*, which must factor through m*<sub>≤</sub>*, too.*

• *A* partial order *on X is a preorder that is* antisymmetric *(in set theory: x* ≤ *y and y* ≤ *x imply x* = *y), in the following sense. First, define the* twist map

$$
\pi: X \times X \to X \times X \tag{\text{E.112}}
$$

*by taking Z X* ×*X,Y X, q*<sup>1</sup> *p*<sup>2</sup> *and q*<sup>2</sup> *p*<sup>1</sup> *in* (E.32)*; in set theory, this would be* τ(*x*, *y*)=(*y*, *x*)*. This enables us to reverse the order by defining a monic*

$$m\_{\geq}: \mathcal{R} \to X \times X;\tag{\mathbb{E}.113}$$

$$m\_{\geq} = \mathfrak{T} \circ m\_{\leq},\tag{\text{E.114}}$$

*with associated pullback*

$$\begin{array}{c} P' \xrightarrow{p'} \begin{array}{c} R\\ \downarrow\\ R \xrightarrow{m\_{\geq}} \begin{array}{c} X \end{array} \end{array} \end{array} \tag{E.115}$$

*The arrow m*<sup>≤</sup> ◦ *p* = *m*<sup>≥</sup> ◦ *q* : *P* → *X, then, must factor through* Δ : *X* → *X* ×*X.* • *A* lattice *in* T *is a partial order on some object X for which there are arrows*

$$
\wedge \colon X \times X \to X; \tag{E.116}
$$

$$
\vee : X \times X \to X,\tag{E.117}
$$

*such that:*


$$\text{[commutative diagram relating } \wedge,\ \vee,\ c : X \times X \to X \times X \times X, \text{ and } p_1 : X \times X \to X\text{]} \tag{E.118}$$


*Here the middle arrow c is the composition*

$$X \times X \stackrel{\Delta \times \mathrm{id}_X}{\longrightarrow} (X \times X) \times X \stackrel{\cong}{\longrightarrow} X \times (X \times X) \stackrel{\mathrm{id}_X \times \tau}{\longrightarrow} X \times X \times X. \tag{E.119}$$

• *Let* 1 *be "the" terminal object in* T*, with associated arrow X* → *X* × 1 *from* (E.32)*, with Z* ↝ *X, q*<sub>1</sub> ↝ id<sub>*X*</sub>*, and Y* ↝ 1*. A* top element *in an internal lattice is an arrow* ⊤ : 1 → *X such that the following composite arrow is the identity* id<sub>*X*</sub>*:*

$$X \stackrel{\cong}{\longrightarrow} X \times \mathbf{1} \stackrel{\mathrm{id}_X \times \top}{\longrightarrow} X \times X \stackrel{\wedge}{\longrightarrow} X. \tag{E.120}$$

*A* bottom *element is an arrow* ⊥ : 1 → *X for which the following arrow is* id<sub>*X*</sub>*:*

$$X \stackrel{\cong}{\longrightarrow} X \times \mathbf{1} \stackrel{\mathrm{id}_X \times \perp}{\longrightarrow} X \times X \stackrel{\vee}{\longrightarrow} X. \tag{E.121}$$

• *A* Heyting algebra *in* T *is a lattice X with* ⊤ *and* ⊥*, endowed with an arrow*

$$\to\; : X \times X \to X, \tag{E.122}$$

*such that the monos m*<sub>1</sub> *and m*<sub>2</sub> *in the double pullback diagram*

$$\begin{array}{ccccc} P_1 & \longrightarrow & R & \longleftarrow & P_2 \\ {\scriptstyle m_1}\downarrow & & \downarrow{\scriptstyle m_{\leq}} & & \downarrow{\scriptstyle m_2} \\ X \times X \times X & \xrightarrow{\wedge \times \mathrm{id}_X} & X \times X & \xleftarrow{\mathrm{id}_X \times \to} & X \times X \times X \end{array} \tag{E.123}$$

*are equivalent (and hence define the same subobject of X* × *X* × *X).*

The reader may check that in Sets these definitions reduce to the usual ones; as one can clearly see, finding diagrammatic versions of familiar definitions is an art!
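In Sets the last condition can be checked by brute force. The sketch below assumes that the two pullbacks in (E.123) are the relations *x* ∧ *y* ≤ *z* and *x* ≤ (*y* → *z*) on triples (*x*, *y*, *z*), which is the usual set-theoretic definition of implication, and takes as a hypothetical example the four upper sets of a three-element chain, ordered by inclusion.

```python
from itertools import product

# Hypothetical finite Heyting algebra: the upper sets of the chain 0 <= 1 <= 2,
# ordered by inclusion (this lattice is itself a chain of four elements).
H = [frozenset(), frozenset({2}), frozenset({1, 2}), frozenset({0, 1, 2})]


def meet(x, y):
    """The inf in H is intersection."""
    return x & y


def implies(y, z):
    """Largest w in H with meet(w, y) <= z, computed as a union of candidates."""
    candidates = [w for w in H if meet(w, y) <= z]
    return frozenset().union(*candidates)


# P1 and P2 are the two pullbacks of the order relation, computed in Sets.
P1 = {(x, y, z) for x, y, z in product(H, repeat=3) if meet(x, y) <= z}
P2 = {(x, y, z) for x, y, z in product(H, repeat=3) if x <= implies(y, z)}
```

The two subsets of *H* × *H* × *H* coincide, as Definition E.21 requires; the example is not Boolean, since `implies(implies(H[1], H[0]), H[0])` is the top element rather than `H[1]`.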

The most important example of an internal Heyting algebra in a topos is Ω.

Theorem E.22. *The truth object* Ω *in a topos* T *with subobject classifier t* : 1 → Ω*, is a Heyting algebra in the partial ordering m*<sub>≤</sub> : *R* → Ω × Ω *defined as the equalizer of the projection p*<sub>1</sub> : Ω × Ω → Ω *and the classifying map* χ<sub>(*t*,*t*)</sub> : Ω × Ω → Ω *of the product arrow* (*t*, *t*) : 1 → Ω × Ω *derived from t* : 1 → Ω*. In particular:*


$$(t, \mathrm{id}_{\Omega}) \cup (\mathrm{id}_{\Omega}, t) : (\mathbf{1} \times \Omega) \cup (\Omega \times \mathbf{1}) \to \Omega \times \Omega; \tag{E.124}$$


*For every object Y* ∈ T0*, this structure makes* HomT(*Y*,Ω) *an external Heyting algebra (i.e., in* Sets*), such that* (E.51) *is an isomorphism of Heyting algebras.*

We omit the proof of Theorem E.22, which is a straightforward verification.

In no. 1, the arrow (*t*,*t*) is a special case of the arrow (*f*,*g*) defined just before (E.91). In no. 2, we need the following construction, applied to the arrows

$$(t, \mathrm{id}_{\Omega}) : (\mathbf{1} \times \Omega) \to (\Omega \times \Omega); \tag{E.125}$$

$$(\mathrm{id}_{\Omega}, t) : (\Omega \times \mathbf{1}) \to (\Omega \times \Omega). \tag{E.126}$$

To define maps like the one in (E.124) in general, recall the coproduct diagram

$$\begin{array}{ccccc} X & \longrightarrow & X + Y & \longleftarrow & Y \\ & \searrow & \downarrow & \swarrow & \\ & & Z & & \end{array} \tag{E.127}$$

which is just the opposite of the product diagram (E.32). In particular, for any given monos *m*<sub>1</sub> : *X* → *Z* and *m*<sub>2</sub> : *Y* → *Z*, we obtain a unique map

$$(m_1, m_2) : X + Y \to Z. \tag{E.128}$$

The image of the latter in the sense of its epi-mono factorization (*m*<sub>1</sub>, *m*<sub>2</sub>) = *me*, i.e.

$$(m_1, m_2) : X + Y \stackrel{e}{\longrightarrow} X \cup Y \stackrel{m}{\longrightarrow} Z, \tag{E.129}$$

is the mono denoted by *m*<sub>1</sub> ∪ *m*<sub>2</sub> : *X* ∪ *Y* → *Z* (which is called *m* in the above diagram).

In no. 5, 0 is the initial object in T. Note that the truth arrow *t* : 1 → Ω is the same as the classifying map χ<sub>id<sub>1</sub></sub> of the identity arrow id<sub>1</sub> : 1 → 1, so that all arrows in Theorem E.22 are classifying maps.

In the presheaf topos [C<sup>op</sup>,Sets], where C is any category, products are taken pointwise, and also, set-theoretic intersection commutes with pullback of sieves:

$$f^*(S \cap S') = f^*(S) \cap f^*(S') \ \ (f: D \to C;\ S, S' \in \mathrm{Sieves}(C)). \tag{E.130}$$

These facts imply that the component ∧<sub>*C*</sub> at *C* of the natural transformation ∧ is just

$$\wedge_{C}(S, S') = S \cap S' \ \ (S, S' \in \mathrm{Sieves}(C)), \tag{E.131}$$

which in turn implies that if *R* is taken to be a subfunctor of Ω × Ω, so that

$$(m_{\leq})_{C} : R_{0}(C) \hookrightarrow \mathrm{Sieves}(C) \times \mathrm{Sieves}(C) \tag{E.132}$$

is the inclusion map, we have (*S*, *S*′) ∈ *R*<sub>0</sub>(*C*) iff *S* ⊆ *S*′. We also find

$$\to_C (S, S') = \{ f : D \to C \mid f^*S \subseteq f^*S' \}; \tag{E.133}$$

$$\neg_{C}(S) = \{ f : D \to C \mid f^*S = \emptyset \}, \tag{E.134}$$

which are easily checked to be natural in *C*; note that (E.134) is (E.133) with *S*′ the empty sieve. Finally, the top element ⊤<sub>*C*</sub> ∈ Sieves(*C*) is the maximal sieve, and similarly the bottom element ⊥<sub>*C*</sub> is the empty sieve.
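For a posetal category the operations (E.130) - (E.133) are easy to compute, since an arrow *f* : *D* → *C* is determined by its domain *D* ≤ *C*, and *f*<sup>∗</sup>*S* = *S* ∩ ↓*D*. The following sketch assumes a four-element "diamond" poset of our own choosing and computes negation as implication into the empty sieve; all names are hypothetical.

```python
# Hypothetical "diamond" poset: 0 <= 1 <= 3 and 0 <= 2 <= 3, with 1, 2 incomparable.
OBJECTS = (0, 1, 2, 3)
STRICT = {(0, 1), (0, 2), (0, 3), (1, 3), (2, 3)}


def leq(x, y):
    return x == y or (x, y) in STRICT


def down(c):
    """Objects D with an arrow D -> C (unique in a poset), i.e. D <= C."""
    return frozenset(d for d in OBJECTS if leq(d, c))


def subsets(s):
    """All subsets of a finite set, as frozensets."""
    items = sorted(s)
    return [frozenset(items[i] for i in range(len(items)) if mask >> i & 1)
            for mask in range(2 ** len(items))]


def is_sieve(s, c):
    """A sieve on C is a subset of down(C) closed under precomposition."""
    return s <= down(c) and all(down(d) <= s for d in s)


def sieves(c):
    return [s for s in subsets(down(c)) if is_sieve(s, c)]


def pullback(s, d):
    """f^*S for the arrow f : D -> C, with S a sieve on C."""
    return s & down(d)


def implies(s, t, c):
    """(E.133): arrows f into C with f^*S contained in f^*T."""
    return frozenset(d for d in down(c) if pullback(s, d) <= pullback(t, d))


def neg(s, c):
    """Negation as implication into the empty sieve."""
    return implies(s, frozenset(), c)
```

In this example the plain complement of the sieve {0, 1} within ↓3 is {2, 3}, which is not a sieve, whereas `neg` always returns one; this is why negation is computed through implication rather than by complementation.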

#### E.4 Internal frames and locales in sheaf toposes

As we have seen in §D.1 as well as in §C.11, a *complete* Heyting algebra is the same qua lattice structure as a *frame*, except that maps between frames are defined differently: a frame map is required to preserve order and arbitrary suprema, whereas a Heyting algebra map preserves order and implication. Furthermore, one has *locales*, which are frames, too, except that maps go in the opposite direction. Hence if Frm is the category of frames (within Sets), then the category Loc of locales is

$$\mathsf{Loc} = \mathsf{Frm}^{\mathsf{op}}.\tag{\mathsf{E.135}}$$

We also recall the bizarre (but wonted) notation *X* for an object in Loc that is *the same* as the object denoted by O(*X*) in Frm, where nothing is implied about the spatiality of the frames in question (i.e., it is not necessarily the case that there is an actual space *X* of which the given frame called O(*X*) is the topology). In the same spirit, *frame* maps are written *f*<sup>∗</sup> : O(*Y*) → O(*X*) or *f*<sup>−1</sup> : O(*Y*) → O(*X*), the corresponding *locale* map being *f* : *X* → *Y* (which is *the same map between the same objects*), once again, even if no space *X* in the usual sense is around.

In any case, in order to define internal frames, locales, or complete Heyting algebras in a topos, one must define completeness of internal lattices. This is difficult diagrammatically, but it can be done through the internal language of §E.5, e.g. by

$$\vdash \forall_{S}\, \exists x\, (S \subseteq \downarrow x) \land \forall_{y} (S \subseteq \downarrow y \to x \le y), \tag{E.136}$$

where *S* ⊆ *X* and *x*, *y* ∈ *X* (technically, *S* is a variable of type Ω<sup>*X*</sup>, and *x* and *y* are variables of type *X*, see §E.5). We may avoid this, however, since due to the identification (E.84) in Chapter 12 we can work in a sheaf topos Sh(*X*), where *internal* frames have a simple *external* description, as follows: there is an equivalence

$$\mathsf{Frm}_{\mathsf{Sh}(X)} \simeq \left( \mathsf{Frm}_{\mathsf{Sets}} / \mathcal{O}(X) \right)^{\mathrm{op}} \tag{E.137}$$

between the category of internal frames in Sh(*X*) and the category of frame maps in Sets with domain O(*X*), where the arrows between two such maps

$$
\pi\_Y^{-1} \colon \mathcal{O}(X) \to \mathcal{O}(Y);\tag{E.138}
$$

$$
\pi_Z^{-1} : \mathcal{O}(X) \to \mathcal{O}(Z); \tag{E.139}
$$

are the frame maps

$$\varphi^{-1}: \mathcal{O}(Z) \to \mathcal{O}(Y) \tag{E.140}$$

that satisfy

$$
\varphi^{-1} \circ \pi_Z^{-1} = \pi_Y^{-1}. \tag{E.141}
$$

This looks more palpable in terms of the "virtual" underlying spaces (i.e. locales): if (E.138) - (E.140) are seen as inverse images of maps π<sub>*Y*</sub> : *Y* → *X*, π<sub>*Z*</sub> : *Z* → *X*, and ϕ : *Y* → *Z*, then the condition (E.141) corresponds to the equality π<sub>*Z*</sub> ∘ ϕ = π<sub>*Y*</sub>.

To explain the equivalence (E.137), we underline locales in Sh(*X*), writing <u>*Y*</u> etc.; the corresponding internal frame is denoted by O(<u>*Y*</u>) (which is the same object in Sh(*X*) as <u>*Y*</u>). The external description of <u>*Y*</u> in Sets, then, is a continuous map

$$
\pi: Y \to X,\tag{E.142}
$$

where *Y* is a locale in Sets (in which *X* was a space to begin with), with frame

$$
\mathcal{O}(Y) = \mathcal{O}(\underline{Y})(X). \tag{E.143}
$$

Also here, the notation π : *Y* → *X* is purely symbolic, and stands for a frame map

$$
\pi^{-1}: \mathcal{O}(X) \to \mathcal{O}(Y),
\tag{E.144}
$$

from which one may reconstruct <u>*Y*</u> as the sheaf

$$\mathcal{O}(\underline{Y}): U \mapsto \{ V \in \mathcal{O}(Y) \mid V \le \pi^{-1}(U) \}\ \ (U \in \mathcal{O}(X)).\tag{E.145}$$

The frame maps (E.138) - (E.140) yield an internal frame map <u>ϕ</u><sup>−1</sup> : O(<u>*Z*</u>) → O(<u>*Y*</u>) in Sh(*X*), which is a natural transformation, by defining its components as

$$\underline{\varphi}^{-1}(U) : \downarrow \pi_Z^{-1}(U) \to \downarrow \pi_Y^{-1}(U); \tag{E.146}$$

$$S \mapsto \varphi^{-1}(S). \tag{E.147}$$

As an application, the Dedekind real numbers ℝ can be axiomatized by what is called a *geometric propositional theory* 𝕋. In any topos T (with natural numbers object), such a theory determines a certain frame O(𝕋)<sub>T</sub>, whose "points" are defined as frame maps O(𝕋) → Ω, where Ω is the subobject classifier in T (more precisely, the object of points of O(𝕋) in T is the subobject of Ω<sup>O(𝕋)</sup> consisting of frame maps). If 𝕋<sub>ℝ</sub> is the theory axiomatizing ℝ, in Sets one simply has the frame

$$
\mathcal{O}(\mathbb{T}\_{\mathbb{R}})\_{\mathsf{Sets}} = \mathcal{O}(\mathbb{R}), \tag{E.148}
$$

whose points are ℝ. More generally, if 𝕋 is some geometric propositional theory, and *X* is a space with associated sheaf topos Sh(*X*), then the internal frame O(𝕋)<sub>Sh(*X*)</sub> is given by the sheaf (E.145) defined by taking the frame map (E.144) to be the inverse image map π<sub>𝕋</sub><sup>−1</sup> : O(*X*) → O(*X* × O(𝕋)<sub>Sets</sub>) of the projection π<sub>𝕋</sub> : *X* × O(𝕋)<sub>Sets</sub> → *X* onto the first component. Using (E.148), this yields the *frame* of Dedekind real numbers O(ℝ) ≡ O(𝕋<sub>ℝ</sub>) in a sheaf topos Sh(*X*) as the sheaf

$$\mathcal{O}(\mathbb{R})\_{\mathsf{Sh}(X)} : U \mapsto \mathcal{O}(U \times \mathbb{R}). \tag{E.149}$$

The Dedekind real numbers *object*, on the other hand, is given by the sheaf

$$(\mathbb{R})\_{\mathsf{Sh}(\mathcal{X})}: U \mapsto C(U, \mathbb{R}). \tag{E.150}$$

Using (E.85), such results may immediately be transferred to T(*A*), see §12.1.

#### E.5 Internal language of a topos

The *internal language* (also called *Mitchell–Bénabou language*) of a given topos T looks like a first-order language, except that it is *typed* (i.e., many-sorted), in that each term σ has a certain type, written σ : *X*, indexed by the objects *X* of T. For example, formulae (by definition) have type Ω. In addition, symbols, terms, and formulae have a list FV(σ) of free variables. Furthermore, the internal language has a canonical model in which it may be interpreted, whose carrier is T itself. We often make no difference in notation between σ as an element of the internal language of T and its interpretation [[σ]] in T, which is some arrow in T; the two are so closely interwoven that making such a difference would be very artificial. Here are the rules.

	- 1. Constants and variables are terms of the given type.
	- 2. If τ : *X* is a term of type *X*, and *f* : *X* → *Y* is a function symbol, then *f*(τ) is a term of type *Y*, and FV(*f*(τ)) = FV(τ). Furthermore, [[ *f*(τ)]] = *f* ◦ τ = *f* τ.
	- 3. If we have *n* terms τ<sub>*i*</sub> : *X*<sub>*i*</sub> (*i* = 1,...,*n*), with FV(τ<sub>1</sub>) = ··· = FV(τ<sub>*n*</sub>) ≡ *F*, then (τ<sub>1</sub>,..., τ<sub>*n*</sub>) is a term of type *X*<sub>1</sub> × ··· × *X*<sub>*n*</sub> and FV(τ<sub>1</sub>,..., τ<sub>*n*</sub>) = *F*. If τ<sub>*i*</sub> has interpretation τ<sub>*i*</sub> : *Y* → *X*<sub>*i*</sub>, then (τ<sub>1</sub>,..., τ<sub>*n*</sub>) : *Y* → *X*<sub>1</sub> × ··· × *X*<sub>*n*</sub> is the corresponding product arrow, as defined (for *n* = 2) before (E.91).
	- 4. One may add free variables to terms; if τ : *Z* with interpretation τ : *X* → *Z* has a single free variable *x* : *X*, and we add a free variable *y* : *Y*, then the interpretation of the revised term τ′ with FV(τ′) = {*x*, *y*} is the composite τ′ : $X \times Y \xrightarrow{p_1} X \xrightarrow{\tau} Z$ (etc.).
	- 5. From τ : *X* with FV(τ) = {*z*<sub>1</sub>,...,*z*<sub>*n*</sub>} with *z*<sub>*i*</sub> : *Z*<sub>*i*</sub>, and *n* terms σ<sub>*i*</sub> : *Z*<sub>*i*</sub>, all having the same free variables FV(σ<sub>*i*</sub>) = {*y*<sub>1</sub>,..., *y*<sub>*m*</sub>}, with *y*<sub>*j*</sub> : *Y*<sub>*j*</sub>, we can form a new term τ(σ<sub>1</sub>,...,σ<sub>*n*</sub>) of type *X* (i.e. the same type τ had), with free variables

$$\mathrm{FV}(\tau(\sigma_1, \dots, \sigma_n)) = \{y_1, \dots, y_m\}. \tag{E.151}$$

As the notation suggests, the interpretation of τ(σ1,...,σ*n*) is τ ◦(σ1,...,σ*n*).
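In the topos Sets the rules above amount to ordinary composition and tupling of functions. The following sketch assumes that a term is interpreted as a Python function from its context (a value, or a tuple of values for several free variables) to its type; the combinator names and the sample symbols `succ`, `times`, `plus_one`, and `double` are hypothetical and only serve as illustration.

```python
def var_x(ctx):
    """Rule 1: a variable x : X is interpreted as the identity on X."""
    return ctx


def apply_symbol(f, tau):
    """Rule 2: the interpretation of f(tau) is the composite f o tau."""
    return lambda ctx: f(tau(ctx))


def pair(*terms):
    """Rule 3: the product arrow (tau_1, ..., tau_n) on a common context."""
    return lambda ctx: tuple(t(ctx) for t in terms)


def weaken(tau):
    """Rule 4: add a free variable y by precomposing with p_1 : X x Y -> X."""
    return lambda ctx: tau(ctx[0])


def substitute(tau, *sigmas):
    """Rule 5: tau(sigma_1, ..., sigma_n) is tau o (sigma_1, ..., sigma_n)."""
    return lambda ctx: tau(pair(*sigmas)(ctx))


def succ(n):
    """Hypothetical function symbol succ : N -> N."""
    return n + 1


def times(zs):
    """Hypothetical term with free variables z1, z2 : N, i.e. a map N x N -> N."""
    return zs[0] * zs[1]


def plus_one(y):
    return y + 1


def double(y):
    return 2 * y


succ_of_x = apply_symbol(succ, var_x)
```

For instance, `substitute(times, plus_one, double)` is the term (*y* + 1) · 2*y* in the single free variable *y*, in accordance with (E.151).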

	- 1. Let ϕ be a formula with FV(ϕ) = {*x*, *y*}, with *x* : *X* and *y* : *Y*. As in first-order logic, we may write ϕ as ϕ(*x*, *y*). Then {*x* | ϕ(*x*, *y*)} is a term of type Ω<sup>*X*</sup>, with

$$\mathrm{FV}(\{x \mid \varphi(x,y)\}) = \{y\}. \tag{E.152}$$

This rule implements the isomorphism (sometimes called λ*-conversion*)

$$\operatorname{Hom}_{\mathsf{T}}(X \times Y, \Omega) \cong \operatorname{Hom}_{\mathsf{T}}\left(Y, \Omega^{X}\right), \tag{E.153}$$

which follows from the existence of exponentials in a topos. Indeed, (E.153) turns the interpretation ϕ : *X* × *Y* → Ω into an arrow {*x* | ϕ(*x*, *y*)} : *Y* → Ω<sup>*X*</sup>. Similarly, from ϕ : *X* × *Y* → Ω we obtain a term {(*x*, *y*) | ϕ(*x*, *y*)} of type *X* × *Y*, which is none other than the subobject classified by ϕ. Taking *Y* = 1 to be the terminal object and using (E.51), we see that

$$\operatorname{Hom}_{\mathsf{T}}(X,\Omega) \cong \operatorname{Sub}(X) \cong \operatorname{Hom}_{\mathsf{T}}\left(\mathbf{1},\Omega^{X}\right), \tag{E.154}$$

which shows that Ω<sup>*X*</sup> plays the role the power set P(*X*) of *X* plays in Sets.

2. If σ : *Y* and τ : *Y* are terms with the same free variables, then σ = τ is a formula having the same set of free variables as τ and σ. If σ : *X* → *Y* and τ : *X* → *Y*, then the interpretation [[σ = τ]] : *X* → Ω is the composite arrow

$$X \stackrel{(\sigma,\tau)}{\longrightarrow} Y \times Y \stackrel{=_Y}{\longrightarrow} \Omega, \tag{E.155}$$

where =<sub>*Y*</sub> is the classifying map of the diagonal Δ<sub>*Y*</sub> : *Y* → *Y* × *Y*.

3. If τ : *Y* and σ : Ω<sup>*Y*</sup> are terms with the same free variables, then τ ∈ σ is a formula with the same free variables. If τ : *X* → *Y* and σ : *X* → Ω<sup>*Y*</sup>, then

$$[[\tau \in \sigma]]: X \xrightarrow{(\tau,\sigma)} Y \times \Omega^Y \xrightarrow{\mathrm{ev}} \Omega. \tag{E.156}$$

4. As in first-order (or propositional) logic, new formulae may be made from old ones using the logical connectives ∧, ∨, →, and ¬. To interpret such composites, it is convenient to assume that their components have the same free variables, which can always be achieved using rule 4 for term-building above (i.e., by adding free variables). So let ϕ : *X* → Ω and ψ : *X* → Ω be (interpretations of) formulae, and let • be either ∧, ∨, or →. We then define

$$[[\varphi \bullet \psi]] : X \stackrel{(\varphi,\psi)}{\longrightarrow} \Omega \times \Omega \stackrel{\bullet}{\longrightarrow} \Omega, \tag{E.157}$$

where the arrow • : Ω × Ω → Ω is defined from the Heyting algebra structure on Ω described in Theorem E.22. Similarly, negation is given by

$$[[\neg \varphi]]: X \stackrel{\varphi}{\longrightarrow} \Omega \stackrel{\neg}{\longrightarrow} \Omega. \tag{E.158}$$

5. If a formula ϕ(*x*, *y*) contains *x* freely, as well as other free variables collectively called *y*, then ∃*x*ϕ(*x*, *y*) is a formula, whose interpretation we now give, after a bit of preparation. First, consider the commutative diagram

$$\begin{array}{ccccc} P & \longrightarrow & Z & \longrightarrow & \mathbf{1} \\ {\scriptstyle f^*m}\downarrow & & \downarrow{\scriptstyle m} & & \downarrow{\scriptstyle t} \\ X & \xrightarrow{\;f\;} & Y & \xrightarrow{\;\chi_m\;} & \Omega, \end{array} \tag{E.159}$$

where *m* is a mono, so that its equivalence class defines an element of Sub(*Y*). Taking the pullback of either *m* and *f*, or, equivalently, of *t* and χ<sub>*m*</sub> ∘ *f*, we obtain a monic *f*<sup>∗</sup>*m* : *P* → *X*, whose equivalence class is an element of Sub(*X*). Consequently, any arrow *f* : *X* → *Y* induces a map

$$f^\* : \operatorname{Sub}(Y) \to \operatorname{Sub}(X),\tag{E.160}$$

which is a homomorphism of (external) Heyting algebras (i.e. in Sets). For example, in Sets, where Sub(*X*) may be identified with P(*X*) (see comment after (E.52)), the map *f*<sup>∗</sup> : P(*Y*) → P(*X*) is simply the inverse image *f*<sup>−1</sup> of *f*. If we regard the lattices Sub(*X*) and Sub(*Y*) as posetal categories, the map *f*<sup>∗</sup> has both a left-adjoint and a right-adjoint, denoted by

$$\exists\_f: \text{Sub}(X) \to \text{Sub}(Y);\tag{E.161}$$

$$\forall\_f: \text{Sub}(X) \to \text{Sub}(Y). \tag{E.162}$$

To justify this suggestive notation, replace *X* by *X* ×*Y* and take *f* : *X* ×*Y* →*Y* to be *p*<sup>2</sup> (i.e., projection on the second space). Hence this gives maps

$$\exists_{p_2} : \mathrm{Sub}(X \times Y) \to \mathrm{Sub}(Y); \tag{E.163}$$

$$\forall_{p_2} : \mathrm{Sub}(X \times Y) \to \mathrm{Sub}(Y). \tag{E.164}$$

In Sets, we identify the Heyting algebras Sub(*X* × *Y*) and Sub(*Y*) (now Boolean) with P(*X* × *Y*) and P(*Y*), respectively, and obtain (on *A* ⊂ *X* × *Y*):

$$\exists_{p_2}(A) = \{ y \in Y \mid \exists_{x \in X} : (x, y) \in A \}; \tag{E.165}$$

$$\forall_{p_2}(A) = \{ y \in Y \mid \forall_{x \in X} : (x, y) \in A \}. \tag{E.166}$$

Returning to a general topos, given ϕ : *X* ×*Y* → Ω, the diagram

$$\begin{array}{ccccccc} \mathbf{1} & \longleftarrow & \{(x,y) \mid \varphi(x,y)\} & \longrightarrow & \exists_{p_2}(\{(x,y) \mid \varphi(x,y)\}) & \longrightarrow & \mathbf{1} \\ {\scriptstyle t}\downarrow & & \downarrow & & \downarrow & & \downarrow{\scriptstyle t} \\ \Omega & \xleftarrow{\;\varphi\;} & X \times Y & \xrightarrow{\;p_2\;} & Y & \xrightarrow{[[\exists_x \varphi(x,y)]]} & \Omega \end{array}$$

defines the interpretation [[∃*x*ϕ(*x*, *y*)]] (with innocent abuse of notation in applying the map ∃<sub>*p*<sub>2</sub></sub>). The interpretation of ∀*x*ϕ(*x*, *y*) via ∀<sub>*p*<sub>2</sub></sub> is quite similar.
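The adjointness of the quantifiers can be checked by exhaustive search in Sets. The sketch below assumes two small finite sets of our own choosing and verifies that (E.165) and (E.166) are left and right adjoint to the inverse image of the projection *p*<sub>2</sub>, i.e. ∃<sub>*p*<sub>2</sub></sub>(*A*) ⊆ *B* iff *A* ⊆ *p*<sub>2</sub><sup>∗</sup>(*B*), and *B* ⊆ ∀<sub>*p*<sub>2</sub></sub>(*A*) iff *p*<sub>2</sub><sup>∗</sup>(*B*) ⊆ *A*. All names are hypothetical.

```python
from itertools import product

X = ("x0", "x1")            # hypothetical finite sets
Y = ("y0", "y1", "y2")
XY = frozenset(product(X, Y))


def p2(pair):
    """Projection on the second component."""
    return pair[1]


def subsets(s):
    """All subsets of a finite set, as frozensets."""
    items = sorted(s)
    return [frozenset(items[i] for i in range(len(items)) if mask >> i & 1)
            for mask in range(2 ** len(items))]


def pullback(b):
    """p2^*(B): the inverse image of B under p2."""
    return frozenset(xy for xy in XY if p2(xy) in b)


def exists(a):
    """(E.165): those y for which some (x, y) lies in A."""
    return frozenset(y for y in Y if any((x, y) in a for x in X))


def forall(a):
    """(E.166): those y for which every (x, y) lies in A."""
    return frozenset(y for y in Y if all((x, y) in a for x in X))


A = frozenset({("x0", "y0"), ("x1", "y0"), ("x0", "y1")})
```

For this *A* one finds ∃<sub>*p*<sub>2</sub></sub>(*A*) = {y0, y1} and ∀<sub>*p*<sub>2</sub></sub>(*A*) = {y0}.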

We now define the (semantic) notion of *truth* for sentences in the internal language of a topos; this is a far-reaching categorical generalization of the idea initially studied in the straightforward context of propositional logic, cf. §D.2.

Definition E.23. *1. A sentence* ϕ *in the internal language of* T *is* true*, written* ⊩ ϕ*, if its interpretation* [[ϕ]] *coincides with the subobject classifier t* : 1 → Ω*.*

*2. An open formula* ϕ(*x*) *is true if its interpretation* [[ϕ(*x*)]] : *X* → Ω *factors through t, or, equivalently, if (the interpretation of)* {*x* | ϕ(*x*)}*, seen as the subobject of X classified by* ϕ *(as explained between* (E.153) *and* (E.154)*), is X itself.*

The two clauses of this definition are actually equivalent, since no. 1 is obviously a special case of no. 2 by omitting the free variable *x* (and hence taking *X* = 1), but also, the second reduces to the first, because ϕ(*x*) is true iff ∀*x*ϕ(*x*) is true.

As a refinement of this concept of truth, for [[ϕ(*x*)]] : *X* → Ω as above, which we simply write as ϕ : *X* → Ω, take an arrow *f* : *Y* → *X*. By definition:

$$Y \Vdash \varphi(f) \text{ means } \Vdash \varphi \circ f. \tag{E.167}$$

If ϕ is a sentence (i.e. *X* = 1), this means that *Y* ⊩ ϕ(*f*) iff ϕ = χ<sub>*Y*</sub> (in other words, ϕ classifies *Y* → 1). There are (at least) two applications of this idea:


Both play a role in Chapter 12, in the case where T = [Cop,Sets] for a poset C. By the Yoneda Lemma E.15, any arrow α : *yC* → *X* bijectively corresponds to some element α ∈ *X*0(*C*). In that case, we write*C* ϕ(α)for*C* ϕ(α ), which by (E.167) that the arrow ϕ ◦α : *yC* → Ω factors through the subobject classifier *t* : 1 → Ω.

*Kripke–Joyal semantics* unfolds the expression *Y* ϕ(*f*) by looking at the formula ϕ in terms of its constituent terms. As one sees in Chapter 12, since this procedure may be used iteratively, it is extremely useful for computational purposes.

Although more than we need (which is the posetal case), we now give the rules for the validity of *C* ⊩ ϕ(α) in an arbitrary presheaf topos [C<sup>op</sup>, Sets], as just mentioned; the posetal case follows in that *f* : *D* → *C* can only mean *D* ≤ *C*.

We use the following notation:


We then have the following *forcing rules*, which generalize the ones given at the end of §D.3, and should be seen as theorems of categorical logic and topos theory:


$$
\varphi(\alpha, \beta) : \mathbf{y}_C \to \Omega; \tag{E.168}
$$

$$
\varphi(\alpha, \beta) \equiv \varphi \circ (\alpha', \beta'). \tag{E.169}
$$

is obtained by combining the maps α′ : *y<sub>C</sub>* → *X* and β′ : *y<sub>C</sub>* → *Y* into

$$(\alpha', \beta') : \mathbf{y}_C \to X \times Y. \tag{E.170}$$

If ϕ has no free variables except *y*, then *C* ⊩ ∃*y*ϕ(*y*) iff there is β ∈ *Y*<sub>0</sub>(*C*) such that *C* ⊩ ϕ(β).

6. *C* ⊩ ∀*y*ϕ(*y*)(α) iff *D* ⊩ ϕ(α *f*, β) for each *f* : *D* → *C* and each β ∈ *Y*<sub>0</sub>(*D*). Here the arrow *f* : *D* → *C* induces a natural transformation *f* : *y<sub>D</sub>* → *y<sub>C</sub>*, yielding α *f* ≡ α ◦ *f* : *y<sub>D</sub>* → *X*, which combines with β : *y<sub>D</sub>* → *Y* to

$$(\alpha f, \beta) : \mathbf{y}_D \to X \times Y. \tag{E.171}$$

Similarly to the previous case, if ϕ has no free variables except *y*, we have

$$C \Vdash \forall_{y} \varphi(y) \text{ iff } D \Vdash \varphi(\beta), \tag{E.172}$$

for each *f* : *D* → *C* and each β ∈ *Y*<sub>0</sub>(*D*).
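For a finite poset, the quantifier rules in the case without further free variables can be checked by brute force. The Python sketch below is only an illustration under stated assumptions: the two-stage poset, the presheaves `Y1` and `Y2`, the subpresheaves `PHI1` and `PHI2` (standing for the formula ϕ), and all function names are hypothetical, and the atomic statement that *D* forces ϕ(β) is taken to mean that β lies in the subpresheaf at stage *D*.

```python
# Illustrative poset with two stages, bot <= top.
STAGES = ["bot", "top"]

def leq(d, c):
    """Order relation D <= C."""
    return d == c or (d == "bot" and c == "top")

def forces_atom(phi, d, beta):
    """D forces phi(beta): beta belongs to the subpresheaf phi at stage D."""
    return beta in phi[d]

def forces_exists(y, phi, c):
    """C forces 'exists y. phi(y)': some beta in Y(C) is forced at C itself."""
    return any(forces_atom(phi, c, beta) for beta in y[c])

def forces_forall(y, phi, c):
    """C forces 'forall y. phi(y)': every beta in Y(D) is forced at D, for all D <= C."""
    return all(
        forces_atom(phi, d, beta)
        for d in STAGES if leq(d, c)
        for beta in y[d]
    )

# Example 1: an element that fails phi appears at the lower stage.
Y1 = {"top": {"a"}, "bot": {"a", "b"}}
PHI1 = {"top": {"a"}, "bot": {"a"}}

# Example 2: Y has no elements at the top stage.
Y2 = {"top": set(), "bot": {"b"}}
PHI2 = {"top": set(), "bot": {"b"}}

if __name__ == "__main__":
    print(forces_exists(Y1, PHI1, "top"), forces_forall(Y1, PHI1, "top"))
    print(forces_exists(Y2, PHI2, "top"), forces_forall(Y2, PHI2, "top"))
```

In the first example the existential statement holds at the top stage while the universal one fails, because of the element `b` at the lower stage. In the second, the universal statement holds at the top stage although the existential one fails there, since the presheaf has no elements at that stage.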


$$(\sigma \circ \alpha', \tau \circ \alpha') : \mathbf{y}_C \to \Omega^Y \times Y \tag{E.173}$$

factors through the subobject of Ω<sup>*Y*</sup> × *Y* that is classified by the evaluation map ev : Ω<sup>*Y*</sup> × *Y* → Ω. As a special case, take *Y* = 1 and hence τ : *X* → 1, so that

$$
\sigma: X \to \Omega^1 \cong \Omega \tag{E.174}
$$

corresponds to a subobject *S* → *X* (i.e. classified by σ ≡ χ). The above subobject of Ω<sup>1</sup> × 1 ≅ Ω is then simply given by the truth arrow *t* : 1 → Ω. Writing *x* ∈ *S* for τ ∈ σ (where *x* : *X* is a variable of type *X*), we therefore obtain the rule:

9. *C* ⊩ (*x* ∈ *S*)(α) iff σ ◦ α : *y<sub>C</sub>* → Ω factors through *t* (in other words, the subobject of *y<sub>C</sub>* classified by σ ◦ α is *y<sub>C</sub>* itself).


#### Notes

The standard introduction to category theory by one of its founders is Mac Lane (1998); see also the book by his student Awodey (2010), as well as the lecture notes by van Oosten (2002) and Cheng (2002). A nice book, which studies set theory from the point of view of category theory, is Lawvere & Rosebrugh (2003). At high-school level, see also Lawvere & Schanuel (1997) or (informally) Cheng (2015).

Toposes were invented by Grothendieck in the early 1960s as part of his rebuilding of algebraic geometry; see Artin, Grothendieck, & Verdier (1972). The history and philosophy of category theory (including topos theory) has been described by Krömer (2007) and by Marquis (2009); for categorical logic see also Marquis & Reyes (2012) and Bell (2005). According to a leading category & topos theorist:

'category theory was the objective form of dialectical materialism (. . . ) set theory was considered to be essentially bourgeois since it is founded on the relationship of belonging.' (Marquis & Reyes, 2012, p. 30).

Books on topos theory and categorical logic we used include (in increasing order of scope and sophistication): Goldblatt (1984), Bell (1988), Borceux (1994), Mac Lane & Moerdijk (1992), and last but not least, the encyclopedic Johnstone (2002).

#### §E.1. Basic definitions

Von Neumann–Bernays–Gödel set theory is discussed in some detail in Mendelson (2010); for algebraic set theory see Joyal & Moerdijk (1995). Category theorists also typically rely on the notion of a *Grothendieck Universe*, see e.g. Mac Lane (1998, §1.6), Marquis (2009, §5.5), and Krömer (2007, Ch. 6).

#### §E.2. Toposes and functor categories

An axiomatization of Grothendieck's toposes (and certain generalizations thereof) equivalent to Definition E.12 was given in 1970 by Lawvere and Tierney. It seems to have been customary among the pioneers of topos theory, who also include Joyal, not to publish their findings too lavishly, and in fact no joint paper by Lawvere & Tierney recording their definition seems to exist, at least in the open literature.

#### §E.3. Subobjects and Heyting algebras in a topos

See Mac Lane & Moerdijk (1992), §§I.8, IV.8, and Borceux (1994), §1.2.

#### §E.4. Internal frames and locales in sheaf toposes

The external description of internal locales in sheaf toposes originates with Joyal & Tierney (1984); see also Johnstone (2002), §C1.6.

#### §E.5. Internal language of a topos

More details and proofs of the Kripke–Joyal semantics for the internal language of a topos may be found in Bell (1988), Ch. 4, Mac Lane & Moerdijk (1992), §IV.6, Borceux (1994), §6.6, and Johnstone (2002), §D1.2.

For an analysis of the notion of partial truth (as defined here) applied to quantum mechanics (differently from our Chapter 12), see Butterfield (2002).

## References


© The Author(s) 2017

K. Landsman, *Foundations of Quantum Theory*,

Fundamental Theories of Physics 188, DOI 10.1007/978-3-319-51777-3

	- www.cs.ox.ac.uk/people/joel.ouaknine/download/arends09.pdf.
	- plato.stanford.edu/archives/spr2013/entries/qm-bohm/.
	- plato.stanford.edu/archives/sum2015/entries/compatibilism/.

## Index

∗-algebra, 645 ∗-algebras unitarily equivalent, 506 Abelian Higgs Model, 424 ability to do otherwise, 205 action algebroid, 260 action groupoid, 259 action Poisson bracket, 256 adjoint, 177, 497 adjointable, 678 adjunction, 809 counit, 809 triangle identities, 809 unit, 809 affine map, 28 affine subspace, 27 agency, 205 Alexandrov topology, 819 algebra at infinity, 318 σ, 366 algebra of observables, 27 almost everywhere (a.e.), 525 Amemiya–Araki Theorem, 780 anchor, 260 annihilation operator, 390 anti-automorphism, 145 anti-isomorphism, 91 approximate unit, 662 approximation, 369 arity, 793

arrow, 806 asymptotically abelian group action, 329 Atiyah–Singer index theorem, 581 atom, 777 atomic projection, 145 atomic proposition, 784 automorphism, 763 inner, 757 of *B*(*H*), 145 axiom, 794 baby index theorem, 581 Baire Category Theorem, 640 Banach algebra, 539 definition, 645 Banach space definition, 517 reflexive, 609 Banach–Alaoglu Theorem, 547 Banach–Steinhaus Theorem, 576 basis of a set with a transition probability, 32 of a Hilbert space, 564 orthonormal, 564 Bell basis, 210 Bell locality, 217 Bell's first theorem, 214 Bell's second theorem, 217

Bell–Kochen–Specker Theorem, 231 Berezin quantization, 252 Bessel's Inequality, 564 bicommutant, 590, 742 Big Picture (of quantization), 259 bipolar theorem, 244 Bogoliubov transformation, 394 Bogoliubov–Haag Hamiltonian, 433 Bohr, N., vii, viii, 2–12, 19, 80, 249, 317, 365, 435, 436, 439, 440, 444, 452, 453, 459, 460, 475 Bohr–Einstein debate, 7, 444 Bohrification, viii asymptotic, 11 exact, 11 Bohrification program, 10 Boltzmann distribution, 356 Boolean algebra free, 789 Borchers Theorem, 350 Borel sets, 523 Borel's law of large numbers, 309 Born measure, 104–108, 123, 234, 310–312, 314, 316, 317, 446, 447 Born probability, 20, 55, 56, 80, 200, 202, 219, 225, 233, 234, 240, 300, 317, 447, 486 Born rule, 40–42, 54–56, 107, 310–317 Born, M., 1, 80, 249, 435, 515 bottom element, 777, 820 in internal lattice, 824 boundary conditions free, 348 periodic, 348, 353 bounded linear map, 538 bounded transform, 626 boundedly complete lattice, 486 British Emergentism, 430 Brouwer, L.E.J., 13, 75, 79, 415, 459, 485, 686, 780, 790, 804 Bruhat space, 773 Busch's Theorem, 71

Butterfield's Principle, 12, 15, 370, 375, 378, 385, 396, 412, 415, 433, 452 C (*A*), 15, 334, 335, 341, 460–462, 464, 465, 474, 475, 480, 481, 484, 485, 490, 493, 775 C (*B*(*H*)), 14, 125–128, 333, 343, 494 Čech–Stone compactification, 548 2-coboundary, 168, 169 2-cocycle, 168, 169 C\*-algebra, 3, 16 AF, 759 AW\*, 759 canonical anticommutation relations, 390 commutative, 27 completeness of (in topos), 464 crossed product, 735 definition, 645 definition (in topos), 464 finite-dimensional, 758, 759 group, 715, 734 internal (to a topos), 462 monotone complete, 759 nuclear, 706 pre-semi, 463 quasi-local, 318, 346 real rank zero, 759 reduced group, 734 reduced groupoid, 732 Rickart, 759 scattered, 759 transformation group, 733 von Neumann algebra, 759 W\*, 759 C\*-envelope, 772 C\*-seminorm, 711 canonical anticommutation relations, 390 self-dual formulation, 402 canonical commutation relations, 249 Carathéodory's Theorem, 29

carrier, 795 cartesian product, 801 Casimir operator, 158 categories dual, 808 equivalent, 808 isomorphic, 809 category, 806 cartesian closed, 811 large, 806 locally small, 806 of presheaves, 815 of sheaves, 818 opposite, 807 small, 806 Cauchy approximation (in topos), 464 convergence, 464 Cauchy sequence, 516 Cauchy–Schwarz inequality, 495 central extension, 171 character, 163, 172 character group, 715 characteristic function, 24, 814 Chernoff–Hoeffding bound, 330 Choquet ordering, 561 Choquet's Theorem, 557 Choquet–Meyer Theorem, 561 CHSH-inequality, 217 class, 806 classifying map, 814 clopen, 748 closed formula, 794 Closed Graph Theorem, 540 coadjoint orbit, 162 integral, 163 regular integral, 294 coadjoint representation, 98 coboundary maps, 169 Cohen–Hewitt Factorization Theorem, 771 Colbeck–Renner Theorem, 221–230 colimit, 707, 813 collapse of the wave-function, 4 coloring

of P1(*H*), 197 of R3, 196 common cause, 242 commutant, 50, 590, 742 commutation theorem, 713 commutative ∗-algebra maximal, 595 commuting unbounded operators, 108 compact-open topology, 715 Compatibilism, 205 Compatibility with Quantum Mechanics (assumption), 222 Complementarity, 4, 460 complete tensor product of Hilbert spaces, 313 complete vector field, 86 completely additive, 746 concave envelope, 558 conditional expectation, 277, 754 conditional probability, 241 cone, 813 universal, 813 congruence, 467 conjugate exponents, 521 conjugation, 395 Consequence Argument, 206 conserved function, 89 constant, 793 in topos, 828 constrained system, 426 constraint first class, 427 primary, 426 secondary, 426 Continuity of Probabilities (assumption), 222 continuous (cross-) section, 737 continuous bundle of C\*-algebras, 737 continuous field of states, 324 continuous function bounded, 522 vanishing at infinity, 522 with compact support, 522

continuous functional calculus, 104, 585, 658 Continuum Hypothesis, 119 convergence of filter, 640 convex cone, 668 convex hull, 27, 553 convex polytope, 28 convex subset, 27, 541 convolution product, 714 Copenhagen Interpretation, 3 coproduct, 813 core of operator, 182 correlated unit vector, 220 Correspondence Principle, 9 covariance algebra, 735 covariance condition, 735 cumulative hierarchy, 802 Curie–Weiss-model, 409 curve, 86 cyclic vector, 36 Daseinisation inner, 486 outer, 486 dcpo (directed complete partial order), 485 De Morgan's Laws, 75 Decoherence, 442 deduction theorem, 787 deformation quantization, 13 formal, 289 of Poisson manifold, 250 positive, 251 strict, 248 density operator, 40, 103, 508, 622 derivation, 85, 350 symmetric, 350 unbounded, 350 Determinism (assumption), 200, 203, 208, 212, 213, 215 deterministic theory contextual, 193 non-contextual , 193 diagonal, 821 diagram, 812

Dijksterhuis, E.J., ix dimension function, 753 Dirac, P.A.M., 1–3, 80, 248, 249, 259, 275, 276, 289, 291, 435 Dirac–Groenewold–Rieffel condition, 250 direct limit, 707 direct sum, 540 algebraic, 707 Hilbert spaces, 694 representations, 694 directed system, 707 dispersion-free, 46, 121 doctrine of classical concepts, vii, 3, 5, 6 Dominated Convergence Theorem, 525 Double Commutant Theorem, 742 double dual, 545 Dreimännerarbeit, 249, 289, 768 dual (of a normed space), 545 dual group, 715 dual morphism, 545 dynamics, 347 weak asymptotic abelianness, 379 ∃-elimination, 796 Earman's Principle, 12, 18, 310, 317, 330, 367, 370, 373, 375, 385, 415, 440, 441, 443, 451, 452 effect, 71, 125, 333 in C\*-algebra, 334 eigenspace, 500 eigenvalue, 500 degenerate, 500 joint, 504 multiplicity, 500 non-degenerate, 500 simple, 500 eigenvalue-eigenvector link, 317 eigenvector, 500 joint, 504 einselection, 443 Einstein locality, 318 Einstein summation convention, 419

Einstein, A., 1, 7, 9, 80, 204, 247, 249, 275, 439, 493, 803 Einstein–Podolsky–Rosen, 191, 202 embezzlement, 226 emergence, 12, 430 asymptotic, 367, 451 emergent features, 368 empirical measure, 307 energy, 355 interaction, 383 surface, 383 entourage uniformity, 765 entropy, 356 environment, 442 epimorphism (= epi), 814 equalizer, 821 equivalence relation as groupoid, 726 equivariance (of quantization), 295 essential range, 582 essential supremum, 525 Euclidean group, 256 evaluation map, 811 event, 23 expectation value, 25 explanatory emergence, 430 exponential, 811 exponential map, 93 external description, 822 extremal decomposition, 29 extreme boundary, 28, 553 extreme points, 28 Eyring–Kramers formula, 377

face, 553 factor, 747 finite, 750 purely infinite, 750 semifinite, 750 type classification, 753 false, 76, 475 fermionic Fock space, 390 field operator, 402 filter, 548, 781 neighbourhood, 640

prime, 548, 781 proper, 781 first-order logic, 793 flow, 86, 87 folium, 328 forcing, 792, 831 formula, 794 closed, 794 in topos, 829 open, 794 Fourier inversion formula, 720 frame, 277 O(*X*), 685 compact, 466 definition, 778 maps, 685 prime element, 690 regular, 466 spatial, 687 frame function, 64 free energy, 356 equilibrium, 356 free vector space, 697 Free Will Theorem, 202–209 Freedom (assumption), 201, 204, 208, 212, 213, 215, 217 frequency operator, 299 function affine, 557 concave, 557 continuous, 523 convex, 557 essentially bounded, 534 in ZF set theory, 801 integrable, 524 measurable, 523 of rapid decrease, 178 simple, 523 space, 811 strictly convex, 557 uniformly continuous, 765 function symbol, 793 in topos, 828 functional, 497, 538 sublinear, 541

functor, 807 contravariant, 807 covariant, 807 forgetful, 810 functors adjoint pair, 809 left-adjoint, 809 right-adjoint, 809 Gödel's Completeness Theorem, 795 Gödel's negative translation, 791, 796 Gårding domain, 157 gap equation, 414 gauge (of convex set), 541 gauge transformation, 424 Gelfand duality, 652, 656, 809 Gelfand isomorphism, 648 constructive, 471 Gelfand spectrum, 26, 84, 648 Gelfand topology, 648 Gelfand transform, 648 Gelfand triple, 178 Gelfand's Theorem, 83, 648 Gelfand–Mazur Theorem, 673 generalized derivative, 180 generating set, 831 GHZ-Theorem, 210 Gibbs measure, 357 Gleason's Theorem, 61, 80, 119–122 Gleason-Kahane-Zelazko Theorem, 770 GNS-construction, 36, 691 Goldstone field, 420 Goldstone Theorem, 416–424 classical, 420 Gram–Schmidt procedure, 566 graph (of linear map), 540 group amenable, 734 as a groupoid, 259 icc (infinite conjugacy classes), 752 modular, 755 group action

asymptotically abelian, 346 group C\*-algebra reduced, 253 group von Neumann algebra, 751 groupoid, 259, 725 action, 726 category, 806 gauge, 291 Lie, 725 pair, 726 tangent, 726, 729 Hölder inequality, 520 Haag duality, 318 Haar basis, 115 Haar measure, 152, 714 Haar system, 730 left, 730 left-invariant, 730 Hadamard's Lemma, 94 Hahn–Banach Theorem, 542 halving lemma, 122 Hamburger Moment Problem, 106 Hamhalter's Theorem, 335 Hamhalter–Dye Theorem, 763 Hamilton's equations, 88 Hamiltonian operator, 347 Hamiltonian vector field, 88 Hard determinism, 205 Hard incompatibilism, 205 Hausdorffication, 484 Heisenberg cocycle, 172 Heisenberg group, 92 Heisenberg model, 348 Heisenberg, W., 1–11, 18, 19, 80, 193, 249, 275, 276, 291, 311, 431, 435, 436, 438–440, 444, 450, 515, 769 Heisenbergification, 11 helicity, 272 Hellinger–Toeplitz Theorem, 569 Hepp's Lemma, 322 hereditary subalgebra, 688 hermitian form, 495 hermitian map, 530

Hewitt–Savage Theorem, 307 Heyting algebra, 686 complete, 686, 779 definition, 779 implication, 779 in topos, 824 hidden variable dispersion-free, 193 non-contextual , 193 normalized, 193 quasi-linear, 194 stochastic, 217 theory, 221 Higgs field, 424 Higgs mechanism, 424–429 Hilbert *A*-module, 678 Hilbert C\*-module, 678 pre, 677 Hilbert space, 1, 36 definition, 517 dimension, 566 finite-dimensional, 497 separable, 566 unit sphere, 498 Hilbert spaces isomorphic, 566 Hilbert, D., 1, 2, 13, 75, 638, 768, 803 homogeneity (of norm), 495 homomorphism anti, 764 Boolean lattices, 779 C\*-algebras, 646 Jordan, 763 lattices, 778 nondegenerate, 665, 682 order, 778 orthocomplemented lattices, 779 orthomodular lattices, 780 posets, 777 homomorphisms equivalent, 176 Husimi function, 252 hyperplane, 544

ideal closed two-sided, 671 essential, 681 in Banach algebra, 671 in lattice, 466 lattice, 782 left, 671 maximal, 671, 782 prime, 690, 782 proper, 782 regular (in lattice), 466 right, 671 idealization, 369 idempotents, 35 identities map, 806 image, 820 implication, 821 Imprimitivity Theorem, 263 Incompatibilism, 205 incomplete tensor product of Hilbert spaces, 313 indistinguishable particles, 275–288 inductive limit, 707, 711 inequivalent quantizations, 280 infimum, 778 infinite tensor product of Hilbert spaces, 312 initial object, 813, 820 inner Poisson derivation, 96 inner product, 495 integrating curve (of vector field), 86 integration, 523 interaction, 348, 352 nearest-neighbour, 348 interior point, 541 internal language, 828 internal reasoning, 822 interpretation, 795 interval domain, 485 intuitionistic propositional logic, 790 invariant domain, 157 involution, 26, 645 isometric isomorphism, 540

isometry, 540 isomorphism Boolean lattices, 779 C\*-algebras, 646 Jordan, 763 natural, 808 objects, 807 orthomodular lattices, 780 posets, 777 weak Jordan, 335 Isotony, 318 Jacobi identity, 85 JB-algebra, 763 joint energy-momentum spectrum, 422 joint spectrum, 107 Jordan algebra, 33, 126, 334 definition, 763 Jordan map, 126 weak, 126 Jordan product, 33, 126 Jordan triple product, 764 Jordan's Theorem, 135, 145 Jordan–Wigner transformation, 391 Kadison's inequality, 763 Kadison's Theorem, 133 Kadison–Singer Conjecture, 58, 113–118 Kadison–Singer property, 113 Kaplansky's Density Theorem, 743 KMS-condition, 358, 359 Kochen–Specker Theorem, 121, 194 Krein–Milman Theorem, 553 Kripke semantics, 475 λ-conversion, 829 ladder operators, 158 Laplacian, 188 lattice, 778 Boolean, 779 complete, 778 conditionally complete, 486 distributive, 778 free distributive, 783

in topos, 823 modular, 779 orthocomplemented, 779 orthomodular, 779 lattice gas, 352 law of contradiction, 75 law of double negation, 75 law of excluded middle, 75, 780 Lebesgue covering dimension, 758 left translation, 152 Leibniz rule, 350 Libertarianism, 205 Lie algebra, 85 as Lie algebroid, 260 of a Lie group, 93 semi-simple, 170 simple, 170 Lie algebroid, 259, 260 integrable, 260 Lie bracket, 93 Lie group, 155 as Lie groupoid, 726 linear, 92 semi-simple, 172 simple, 172 Lie groupoid, 259 Lie's Third Theorem, 156 Lie–Poisson bracket, 97 Lie–Poisson manifold, 253 Lie–Poisson structure, 97 limit, 813 Lindenbaum (–Tarski) algebra, 788 local gauge group, 424 local observables, 318 local part, 710 local realism, 244 local sequence, 303 locale, 685 point, 686 spatial, 687 Locality (assumption), 204, 212, 213, 215 localization, 486 locally closed, 262 locally uniformly closed, 740

logical theory, 784, 795 completeness, 786 consistency, 789 many-sorted, 793 signature, 784, 793 soundness, 786 long-range order, 381 Lorentz group, 272 proper orthochronous, 272 lower set (in lattice), 466 lower subset, 819 lower-level theory, 367 Ludwig's Theorem, 135 Lusin's Theorem, 639 Mackey–Glimm dichotomy, 262 manifold as Lie algebroid, 260 as Lie groupoid, 726 map proper, 665 marginal distributions, 25 mass, 272 matrix mechanics, 1 mean-field theory, 409 homogeneous, 409 measure, 523 atomic part, 604 barycenter, 557 complex, 530 continuous part, 604 countable additivity, 523 finitely additive, 533 finitely additive bounded, 533 finitely additive signed, 533 Hahn–Jordan decomposition, 530 inner, 526 invariant, 263 outer, 526 projection-valued, 724 quasi-invariant, 263 signed, 530 total variation, 530 measure space, 523 σ-finite, 523, 527

atom, 604 completeness, 526 finite, 523 inner regular, 526 outer regular, 526 regularity, 526 standard, 604 measure spaces equivalent, 602 isomorphic, 602 measurement context, 193 measurement problem, 435–457 big, 441 birth, 435 Decoherence, 442 Heisenberg, 436 insolubility, 445 London and Bauer, 438 new formulation, 453 non-existence, 444 Pauli, 438 Schrödinger, 439 small, 441 solubility, 455 Swiss approach, 440 von Neumann, 437 measurement scheme, 446 preserving probabilities, 447 sound, 448 measures mutually singular, 530 meet-irreducible, 687 method of the highest weight, 164 metric, 516 metric space complete, 516 Mexican hat potential, 416 Milnor's exercise, 770 Minkowski inequality, 520 Minkowski–Weyl Theorem, 245 Mitchell–Bénabou language, 828 mixing (of states), 28 model, 789, 791, 795 binary, 789 Kripke, 792, 796

sound, 791 standard (PA), 797 standard (ZF), 802 modular lattice, 78 modus ponens, 786, 795 moment (of a measure), 106 momentum map, 96 Hamiltonian, 96 infinitesimally equivariant, 99 monoidal category, 773 monomorphism (= mono), 814 monomorphisms equivalent, 814 monotone complete, 111 Monotone Convergence Theorem, 525 morphism, 806 between Banach spaces, 538 multiplication, 806 multiplication operator, 37 multiplier, 168 exact, 168 multiplier algebra, 679 natural transformation, 808 Nature (assumption), 201, 204, 212, 214, 215, 217 Nelson operator, 188 Nelson's criterion, 188 Nelson's Lemma, 635 Neumann, J. von, viii, 1–3, 10, 15, 17, 38, 42, 75, 80, 81, 152, 153, 187, 191, 192, 194, 231, 289, 312, 313, 330, 435, 437, 438, 445, 448, 450, 457, 459, 515, 566, 570, 590, 625, 638, 639, 643, 645, 646, 753, 768–769 Newton's equation, 89 Newton, I., ix no signaling property, 219 Noether's Theorem, 100 non-contextuality, 56 Non-contextuality (assumption), 201 non-logical symbols, 784, 793

norm, 27, 495 cross, 700 operator, 498, 569 supremum, 512 trace, 509, 618 norm topology, 574 normal (functional), 744 normal bundle, 728 normalization, 30, 43 norms equivalent, 517 nowhere dense, 640 object, 806 objectification, 7 of pointer observable, 447 observable on a set with a transition probability, 33 observables, 27 old-fashioned vector field, 85 one-point compactification, 531, 664 Open Mapping Theorem, 539 operator, 497 absolute value, 509, 618 affiliated to a von Neumann algebra, 637 anti-linear, 128 anti-unitary, 128 bounded, 538, 569 closable, 177, 572 closed, 177, 571 closure, 177, 572 compact, 608, 609 density, 622 diagonalizable, 611, 612 domain, 570 essentially self-adjoint, 177, 573 finite rank, 610 Hilbert–Schmidt, 623 maximal, 598 multiplication, 571 norm-positive, 583 normal, 583 numerical range, 587

partial isometry, 510 polar decomposition, 510 positive, 507 pure contraction, 625 self-adjoint, 177, 569, 570 square root, 617 symmetric, 573 trace-class, 618 unbounded, 569 unitary, 506, 510, 566 order interval, 777 order isomorphism, 127 order parameter strong, 380 orthoclosed subset of a set with a transition probability, 32 subspace of vector space, 780 orthocomplement of a subset of a set with a transition probability, 32 orthocomplementation, 779 orthodoxy, 435 orthogonal complement, 499, 562 double, 562 orthomodular lattice, 78 orthonormal basis, 497 orthonormal subset of a set with a transition probability, 32 oscillation, 65 Outcome Independence, 218 outcome spaces, 448 pair groupoid, 259 Paradox of Probability, 310 Parameter Independence, 218 (assumption), 222 paraparticle, 278 parastate, 278 parastatistics, 276, 278 Parseval's equality, 565

partial isometry, 510, 750

partial order, 777 in topos, 823

linear, 668 partition, 277 partition function, 356 partition of unity, 705 Pauli matrices, 130 Pauli, W., 1, 80, 108, 435, 438 paving conjecture, 118 Peano Arithmetic, 793 axioms, 797 perfect anti-correlation, 215 perfect correlation, 203, 215 Peter–Weyl Theorem, 153 phenomenological theory, 367 Plancherel's Theorem, 773 Poincaré group, 272 proper orthochronous, 272 point (of frame), 491 Poisson algebra, 88 Poisson bracket, 88 Poisson derivation, 96 Poisson geometry, 84 Poisson manifold, 88 Poisson tensor, 89 polar decomposition, 510 polar of subset, 244 polarization identity, 496 Pontryagin dual, 173 Pontryagin duality, 720 Pontryagin Duality Theorem, 173 poset, 777 directed, 777 positive definite inner product, 495 metric, 516 norm, 495 Positive Operator Valued Measure, 74 positivity in C\*-algebra, 668 of map on *B*(*H*), 43 of map on *B*(*H*)sa, 43 of map on *C*(*X*), 30, 526 of quantization, 295 potential, 348 short-range, 349

POVM, 74 pre-inner product, 495 predicate logic, 793 predicate symbols, 793 predual, 744 preorder, 777 in topos, 822 presheaf, 815 on *X*, 815 representable, 816 Principle of General Tovariance, 493 Principle of the Identity of Indiscernibles, 275 Principle of Uniform Boundedness, 576 probability distribution on E (*H*), 72 on P(*H*), 59, 119 probability measure *K*-exchangeable, 306 finitely additive, 533 completely additive on P(*H*), 119 exchangeable, 306 finitely additive on P(*H*), 119 on P(*H*), 59, 119 on locale, 489 permutation-invariant, 306 probability space, 104, 523 non-commutative, 696 problem of outcomes, 448 problem of statistics, 446 product, 811 binary, 811 Product Extension (assumption), 222 projection, 499, 502, 573 atomic, 601 finite, 750 minimal, 750 projections, 125, 333 proof by contradiction, 787 proposition, 784 propositional logic, 784 pullback of a map, 90

of arrows, 812 pure state space, 31, 765 normal, 125 pure thermodynamics phase, 363 purely logical symbols, 784, 793 push-forward of a diffeomorphism , 90 of a filter, 640 pushout, 813 Q (lost Gospel), 242 quadratic form, 496 Quantum Bayesianism (= QBism), 436 quantum De Finetti Theorem, 301 quantum event, 40, 103 quantum Ising chain, 348 quantum Ising model, 348 quantum logic, 459 Birkhoff–von Neumann, 75–79, 81, 459 intuitionistic, 471–475 quantum probability distribution, 40 quantum random variable, 103 quantum spin systems, 318 quantum toposophy, 459 quantum-mechanical law of strong numbers, 314 quasi-linear, 121 quasi-local observables, 318 quasi-local sequence, 303 quasi-state, 61, 490 strong, 120 weak, 120 quasi-symmetric sequence, 300 Radon–Nikodym Theorem, 549 rather below (in lattice), 466 reading scale, 447 real numbers Dedekind, 461, 489 lower, 489 upper, 489 real rank, 758 reductio ad absurdum, 787 regular Lie group action, 262

regular polyhedra, 561 regular space, 83 relation, 777 relatively open, 262 representation, 36 admissible, 174, 176 cyclic, 691 induced, 263 irreducible, 153, 693 left-regular, 254 nondegenerate, 695 of C\*-algebra, 691 parafermionic, 285 primary, 319 skew-adjoint, 155, 188 super-admissible, 174 weakly contained, 734 representations disjoint, 319 equivalent, 164, 319 equivalent admissible, 176 quasi-equivalent, 319 unitarily equivalent, 693 resolution of the identity in a set with a transition probability, 32 resolvent, 577, 581 Riesz Lemma, 563 Riesz Representation Theorem, 526 Riesz–Fréchet Theorem, 568 right translation, 153 right-annihilator, 760 root, 166 positive, 166

σ(*V*,*W*)-topology, 546 σ-algebra, 523 σ-weak convergence, 512 σ-weak topology, 111, 512, 743 Sakai's Theorem, 744 Schatten–von Neumann ideals, 643 Schmidt Extension (assumption), 222 Schrödinger equation, 15, 247, 445, 446, 449, 515

time-dependent, 186 Schrödinger's Cat, vii, 79, 439, 449, 452, 453, 457 Schrödinger, E., 1, 2, 248, 249, 252, 439, 441, 451, 452 Schur duality, 277 Schur's Lemma, 153, 693 Schwartz space, 178 Scott topology, 485 second cohomology group of *G*, 168 self-adjoint operator maximal, 506 self-adjoint operators, 125, 333 self-adjointness (of quantization), 295 self-consistency equation, 414 semantic entailment, 34 semantic equivalence relation, 76 semantics Kripke–Joyal, 831 propositional logic, 784 semi-direct product, 256 regular, 268 semi-direct product algebroid, 260 semi-direct product groupoid, 259 seminorm, 178 seminorm (internal), 463 semiring, 532 fundamental lemma, 533 sentence, 794 in topos, 829 separating duality, 546 separation theorem, 544 sequentially complete, 575 sesquilinear form, 495 bounded, 576 set with a transition probability, 31 set-theoretic universe, 802 setting of experiment, 199 sheaf, 818 of continuous functions, 818 Sheffer stroke, 787 shift operator, 392 sieve, 815 maximal, 815

pullback, 815 simplex, 28, 561 Choquet, 560 SNAG-Theorem, 724 Sobolev Embedding Theorem, 182 source map, 806 space σ-compact, 527 as a groupoid, 259 compact, 83 Hausdorff (= *T*2), 83 hyperstonean, 748, 761 locally compact, 83 Polish, 641 scattered, 761 sober, 687 Stone, 748, 761, 780 stonean, 748, 761 totally disconnected, 780 totally separated, 761 spectral mapping property, 580, 659 spectral order, 486 spectral presheaf, 494 spectral projection, 501, 588 spectral radius, 578 formula, 578 spectral resolution, 500, 611 in a set with a transition probability, 33 spectral subspace, 501 spectral theorem for self-adjoint operators approximation by projections, 592 bounded measurable functional calculus, 591 continuous functional calculus, 590 for compact operators, 612 for unbounded operators, 633 multiplication operator, 596, 598 on finite-dimensional Hilbert space, 500 spectral theory, 515 spectrum, 500, 577, 581 Arveson, 757

Connes, 757 continuous, 582, 641 discrete, 582 joint, 504 point, 582 residual, 641 Spehner–Haake model, 453 spin, 160, 175, 272 spontaneous symmetry breaking, 345, 367–433 double well, 371–378 mean-field theories, 409–415 quantum spin systems, 379–385 state, 30 *K*-exchangeable, 301 π*i*-normal, 319 clustering, 322 coherent, 252, 371 correlated, 243 equilibrium, 345 ergodic, 365 Gibbs, 384 ground, 345, 350, 353, 355 infinitely exchangeable, 301 KMS, 359 local equilibrium, 356 macroscopic, 324 mixed, 31 normal, 109 on *B*(*H*), 43 on *B*(*H*)sa, 43 on *C*(*X*), 527 on *C*0(*X*), 84, 529 on C\*-algebra, 646 permutation-invariant, 301, 326 primary, 319 probability measure, 28 product, 243 pure, 31 quasi-free, 403 singular, 112 trivial at infinity, 366 uncorrelated, 243 state space, 28, 30, 763 normal, 112, 125


normal pure, 333 normal total, 333 of *B*(*H*), 43 of *B*(*H*)sa, 43 of C\*-algebra, 334, 647 pure, 334, 647 states *b*-distinguishable, 446 Stone spectrum, 781 Stone's Representation Theorem, 780 Stone's Theorem, 184 Stone–Weierstrass Theorem, 555 strictly convex (normed space), 543 strong (operator) topology, 574, 742 strong continuity of group action, 344 structure constants, 97 subcategory, 807 full, 807 subfunctor, 817 subobject, 814 subobject classifier, 462, 814 in [Cop,Sets], 816 subrepresentation, 319 sup-norm (= supremum norm), 83, 522 support of function, 522 of measure, 557 supremum, 778 symmetric sequence, 299 symmetrization operator, 298 symmetry, 125 algebraic quantum theory, 333–366 Bohr, 127, 334 Jordan, 126, 334 Kadison, 126, 334 Ludwig, 127, 334 permutation, 275–288 property of metric, 516 quantum mechanics, 125–191 spatial translation, 346 spontaneously broken, 379

von Neumann, 127, 334 weak Jordan, 126, 334 weakly broken, 379 Wigner, 126, 334 symmetry group, 345 symplectic manifold, 89 system of imprimitivity, 258 tangent bundle (as Lie algebroid), 260 target map, 806 tautological functor, 464 tautology, 785 tempered distribution, 178 tensor category, 773 tensor product algebraic, 697 C\*-norm, 700 cross-norm, 700 injective, 700 maximal C\*-norm, 701 product state, 703 projective, 701, 772 spatial, 243 state, 702 term, 794 term formation, 794 terminal object, 461, 811 in [Cop,Sets], 815 terms in topos, 828 tertium non datur, 75 theorem, 786, 795 theorem of the highest weight, 166 theory fundamental, 367 higher-level, 367 reduced, 367 reducing, 367 time-evolution, 345 Tomita–Takesaki Theorem, 755 Tomita–Takesaki theory, 754 top element, 777, 820 in internal lattice, 824 topological vector space, 178, 541

locally convex, 544 topos definition, 815 elementary, 815 topos theory and quantum logic, 459–494 introduction to, 805–833 total set of states, 115 trace, 508, 751 finite, 751 infinite, 751 semifinite, 751 transition probability, 31 on *P*(*B*(*H*)), 47 on pure state space, 765 triangle inequality metric, 516 norm, 495 true, 76, 475, 802 truth at stages, 831 in topos, 831 partial, 831 truth function, 785 truth object, 461, 814 in [Cop,Sets], 815 truth table, 785 tubular neighbourhood theorem, 727 twist map, 823 two-point function, 217 ultrafilter, 548, 781 free, 548 principal, 548 ultraweak topology, 743 unbounded multiplier, 681 uncorrelated unit vector, 220 uniform space, 765 uniform structure, 765 unilateral shift, 309 unit, 463 unital commutative C\*-subalgebra, 125, 333 maximal, 506 unitary dual, 164

unitary gauge, 426 Unitary Invariance (assumption), 222 unitary operator, 125 unitary representation, 151 unitization, 660 universal generalization, 795 upper semicontinuous partition, 336 upper set (= up-set), 819 upward directed, 759 Urysohn's Lemma, 639 valuation, 491, 784 vanishing at infinity, 83 variable, 793 bound, 794 free, 794 in topos, 828 variance, 25 vector cyclic, 595, 691 separating, 595 vector bundle as Lie groupoid, 726 vertex, 813 von Neumann algebra, 2, 590, 742 abelian, 747 center, 318 definition, 742 factor, 747 hyperfinite, 754 injective, 754 maximal commutative, 2 standard form, 755 von Neumann chain, 437 wave mechanics, 1 weak (operator) topology, 574, 742 weak convergence in *B*(*H*), 574 weak measurability, 152 weak topology, 546 weak∗ topology (= *w*∗-topology), 546 weight, 164 dominant, 165 of a frame function, 65

regular, 165 well inside (in lattice), 466 well-formed formula, 784 Weyl chamber, 165 Weyl group, 164 Weyl operator, 154 Weyl quantization, 251 Weyl's Program, 259, 289 Weyl, H., 18, 68, 172, 188, 251, 289, 290, 515, 583 Whitehead's Lemma, 170 Wigner cocycle, 265 Wigner function, 251 Wigner's Theorem, 132, 147

Wigner, E., 19, 187, 289, 290, 440, 442, 450, 457 would-be Goldstone boson, 424

yes-no questions, 35 Yoneda embedding, 816 Yoneda Lemma, 816 Young diagram, 277 Young tableau, 277 standard, 277

Zariski topology, 690 Zermelo–Fraenkel set theory, 793 ZF-axioms, 798